From: ellman AT xs4all DOT nl ()
Newsgroups: comp.os.msdos.djgpp,comp.lang.asm.x86,alt.msdos.programmer
Subject: Re: Making a selector - dmpi stuff
Date: 26 May 1997 20:19:06 GMT
Organization: XS4ALL
Lines: 71
Message-ID: <5mcr7q$sk8$1@news0.xs4all.nl>
References: <3389c924 DOT 198828 AT nntp DOT netcomuk DOT co DOT uk>
NNTP-Posting-Host: xs1.xs4all.nl
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

In article <3389c924 DOT 198828 AT nntp DOT netcomuk DOT co DOT uk>,
William McGugan wrote:
>
>I want to be able to do this so I can refer to memory like so:
>
>mov al, [fs:ebx]
>
>rather than :
>
>mov al, [esi+ebx]
>
>So I can free esi (and edi) for other tasks.

I believe that on Pentiums, the extra segment override prefix makes it
impossible to pair the instruction with another one (although I think a
prefixed instruction can still pair on the MMX Pentiums if you arrange
your code so that it goes through the right pipe (U or V, I can't
remember which)).

A technique you can use in protected mode when CS and DS cover the same
memory is self-modifying code. It is useful when a value stays constant
inside a loop that is iterated many times. As "More Tricks of the Game
Programming Gurus" says: "It's like creating registers out of thin air"

For example, assuming that %esi remains constant in the inner loop, you
can do the following optimisation:

loop:
	[...]
	movb (%esi,%ebx), %al
	[...]

becomes:

	movl %esi, REWRITE_LOCATION-4	/* patch %esi into the displacement */
	/* save %esi in memory if necessary */
loop:
	[...]
	movb 0x12345678(%ebx), %al
REWRITE_LOCATION:
	/* The dummy constant must be large enough to force a 32-bit
	   displacement. A shorter dummy only works if %esi always fits
	   in a signed byte; 32-bit addressing has no 16-bit
	   displacement. */
	[...]
end_of_loop:
	/* restore %esi if necessary */

Both programs function the same, and %esi is now free inside the inner
loop.
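
To show why the store goes to REWRITE_LOCATION-4, here is a rough C
sketch of the byte layout only. It patches a copy of the instruction in
an ordinary data buffer and never executes it, so it says nothing about
whether a given OS lets you write to live code. The names
patch_disp32 and build_patched_insn are made up for this illustration;
the encoding shown is the standard 32-bit one for
movb 0x12345678(%ebx), %al.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Encoding of: movb 0x12345678(%ebx), %al   (6 bytes)
 *   0x8A         opcode: MOV r8, r/m8
 *   0x83         ModRM: mod=10 (disp32), reg=000 (%al), rm=011 (%ebx)
 *   78 56 34 12  32-bit displacement, little-endian
 * For comparison, movb (%esi,%ebx), %al is 8A 04 1E (3 bytes).
 */
enum { INSN_LEN = 6, DISP_LEN = 4 };

const uint8_t insn_template[INSN_LEN] = {
    0x8A, 0x83, 0x78, 0x56, 0x34, 0x12
};

/* Hypothetical helper: store `base` into the 4 bytes that end at
 * `insn_end`, mirroring `movl %esi, REWRITE_LOCATION-4`. */
void patch_disp32(uint8_t *insn_end, uint32_t base)
{
    assert(insn_end != NULL);
    const uint8_t le[DISP_LEN] = {
        (uint8_t)(base),
        (uint8_t)(base >> 8),
        (uint8_t)(base >> 16),
        (uint8_t)(base >> 24)
    };
    memcpy(insn_end - DISP_LEN, le, DISP_LEN);
}

/* Hypothetical helper: write a patched copy of the instruction to `out`.
 * `out + INSN_LEN` plays the role of the REWRITE_LOCATION label, which
 * sits immediately after the instruction. */
void build_patched_insn(uint8_t out[INSN_LEN], uint32_t base)
{
    memcpy(out, insn_template, INSN_LEN);
    patch_disp32(out + INSN_LEN, base);
}
```

Calling build_patched_insn(buf, 0x000A0000) leaves the opcode and ModRM
bytes alone and replaces only the trailing four bytes, because the
displacement is the last field of the instruction.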

If this enables you to keep every variable used inside the loop in a
register, you can manipulate large chunks of data without worrying about
evicting the cache lines that hold the non-register variables.

However, I only recommend this technique when there's a shortage of
registers, as instructions with a constant displacement are longer than
those that just use registers.

Does anyone know what effect this has on Pentiums, where there are
separate code and data caches? (It works fine on my Pentium.) Does the
entire code cache become invalid, or just those cache lines with bytes
that have changed?

For me, this seems to work fine in assembly object files assembled with
gas and linked to DJGPP C programs.

AE.

--
  Andrei Ellman -- URL: http://www.xs4all.nl/~ellman/ae-a -- ae1 AT york DOT ac DOT uk
"All I wanna do is have some fun :-)        || ae-a AT minster DOT york DOT ac DOT uk
 I've got the feeling I'm not the only one" || mailto:ellman AT xs4all DOT nl
 -- Sheryl Crow :-)                         || It's what you make of it.