Mail Archives: djgpp/1997/05/27/04:33:30

From: ellman AT xs4all DOT nl ()
Newsgroups: comp.os.msdos.djgpp,comp.lang.asm.x86,alt.msdos.programmer
Subject: Re: Making a selector - dmpi stuff
Date: 26 May 1997 20:19:06 GMT
Organization: XS4ALL
Lines: 71
Message-ID: <5mcr7q$sk8$1@news0.xs4all.nl>
References: <3389c924 DOT 198828 AT nntp DOT netcomuk DOT co DOT uk>
NNTP-Posting-Host: xs1.xs4all.nl
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

In article <3389c924 DOT 198828 AT nntp DOT netcomuk DOT co DOT uk>,
William McGugan <wmcgugan AT netcomuk DOT co DOT uk> wrote:
>
>I want to be able to do this so I can refer to memory like so:
>
>mov	al, [fs:ebx]
>
>rather than :
>
>mov	al, [esi+ebx]
>
>So I can free esi (and edi) for other tasks.

I believe that on the plain Pentium, a segment-override prefix costs an extra
clock to decode, and the prefixed instruction can only go through the U pipe,
which limits pairing. (On the MMX Pentiums some prefixed instructions can go
through either pipe, but as far as I know an instruction with a segment prefix
still needs the U pipe.)

A technique you can use in protected mode when CS and DS map the same memory
(as they do under DJGPP) is self-modifying code. It helps when a value stays
constant inside a loop that is iterated many times. As "More Tricks of the Game
Programming Gurus" says: "It's like creating registers out of thin air."

For example, assuming that %esi remains constant in the inner loop, you can
do the following optimisation:

loop:
[...]
movb (%esi,%ebx), %al 
[...]

becomes:

/* overwrite the 4-byte displacement of the movb below with %esi */
movl %esi, REWRITE_LOCATION-4
/* save %esi in memory if necessary */
loop:
[...]
movb 0x12345678(%ebx), %al
REWRITE_LOCATION:
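/* The dummy constant must be large enough to force a 32-bit displacement.
If you know %esi always fits in a signed byte, you can use an 8-bit
displacement instead and patch the single byte at REWRITE_LOCATION-1
(32-bit addressing has no 16-bit displacement form) */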
[...]
end_of_loop:
/* restore %esi if necessary */

Both programs function the same, and %esi is now free inside the inner loop.
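
To make the "REWRITE_LOCATION-4" arithmetic concrete, here is a minimal sketch in C (not from the original post) that models the patch on a byte buffer instead of on live code. It assumes the usual 32-bit encoding of the instruction (opcode 0x8A, ModRM 0x83, then a little-endian disp32); the names insn_template, patch_disp32 and read_disp32 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Machine code for `movb 0x12345678(%ebx), %al` in 32-bit mode:
   opcode 0x8A, ModRM 0x83 (mod=10: disp32 follows, reg=%al, rm=%ebx),
   then the displacement in little-endian byte order. */
enum { INSN_LEN = 6, DISP_LEN = 4 };

static const uint8_t insn_template[INSN_LEN] = {
    0x8A, 0x83, 0x78, 0x56, 0x34, 0x12
};

/* Model of `movl %esi, REWRITE_LOCATION-4`: rewrite_location is the offset
   just past the instruction, so the displacement occupies the 4 bytes
   immediately before it. */
static void patch_disp32(uint8_t *code, size_t rewrite_location, uint32_t esi)
{
    assert(rewrite_location >= DISP_LEN);
    uint8_t *p = code + rewrite_location - DISP_LEN;
    p[0] = (uint8_t)(esi);
    p[1] = (uint8_t)(esi >> 8);
    p[2] = (uint8_t)(esi >> 16);
    p[3] = (uint8_t)(esi >> 24);
}

/* Read back the displacement the patched instruction would use. */
static uint32_t read_disp32(const uint8_t *code, size_t rewrite_location)
{
    const uint8_t *p = code + rewrite_location - DISP_LEN;
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
```

Copying insn_template into a buffer and calling patch_disp32(buf, INSN_LEN, value) leaves the opcode and ModRM bytes untouched and replaces only the last four bytes, which is why the label has to sit directly after the instruction being patched.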

If this lets you keep every variable used inside the loop in a register, you
can work through large chunks of data without worrying about evicting the cache
lines that would otherwise hold the non-register variables.

However, I only recommend this technique when there's a shortage of registers,
as instructions with a constant displacement are longer than those that
just use registers.
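
As a rough illustration of the size cost, the table below (a sketch in C, not from the original post) lists what I believe are the 32-bit encodings of the three forms discussed in this thread. The struct and the helper extra_bytes are hypothetical names; check the byte values against your own assembler listing before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct encoding {
    const char *att_syntax;  /* instruction as written for gas */
    uint8_t bytes[8];        /* assumed machine code, 32-bit mode */
    size_t len;              /* number of bytes actually used */
};

static const struct encoding forms[] = {
    /* segment prefix 0x64, opcode 0x8A, ModRM 0x03 */
    { "movb %fs:(%ebx), %al",       { 0x64, 0x8A, 0x03 }, 3 },
    /* opcode 0x8A, ModRM 0x04 (SIB follows), SIB 0x1E (base %esi, index %ebx) */
    { "movb (%esi,%ebx), %al",      { 0x8A, 0x04, 0x1E }, 3 },
    /* opcode 0x8A, ModRM 0x83, 32-bit displacement */
    { "movb 0x12345678(%ebx), %al", { 0x8A, 0x83, 0x78, 0x56, 0x34, 0x12 }, 6 },
};

/* How many bytes longer form a is than form b. */
static int extra_bytes(const struct encoding *a, const struct encoding *b)
{
    assert(a != NULL && b != NULL);
    return (int)a->len - (int)b->len;
}
```

Under these assumptions the patched form is three bytes longer than the register-only form, so it costs code size inside the loop in exchange for the freed register.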

Does anyone know what effect this has on Pentiums, which have separate code
and data caches? (It works fine on my Pentium.) Does the entire code cache
become invalid, or just the cache lines containing the bytes that have changed?

For me, this seems to work fine in assembly object files assembled with gas,
and linked to DJGPP C programs.

AE.

--
Andrei Ellman -- URL: http://www.xs4all.nl/~ellman/ae-a -- ae1 AT york DOT ac DOT uk
"All I wanna do is have some fun     :-)      ||  ae-a AT minster DOT york DOT ac DOT uk
 I've got the feeling I'm not the only one"   ||  mailto:ellman AT xs4all DOT nl
     -- Sheryl Crow      :-)    ||       It's what you make of it. 
