Mail Archives: djgpp/1995/08/16/07:31:50

From: "A.Appleyard" <A DOT APPLEYARD AT fs2 DOT mt DOT umist DOT ac DOT uk>
To: mat AT ardi DOT com (Mat Hostetter), djgpp AT sun DOT soe DOT clarkson DOT edu
Date: Wed, 16 Aug 1995 09:16:22 BST
Subject: Re: inline asm?

  A.Appleyard wrote:-
> I use these two macros as a pair thus, e.g.:-
>   _ax=0x1234; etc; __SR(); asm("this and that"); __RR(); use_value_of(_ax);
> The original registers are saved by __SR() and restored by __RR(), and I do
> explicitly alter _ax in C text.

  mat AT ardi DOT com (Mat Hostetter) replied:-
> ... not a legitimate way to write it (although I believe you that it will
> work with the current gcc release). ... the problem is that the asm secretly
> alters _ax, and also requires its value be legitimate on input. ... If you
> can give me a small example of some __SR/__RR asm of this form that you're
> using I can show you how I'd rewrite it.

  I use it when I call interrupts. The method below works OK for me, seems to
take fewer instructions than the official version, and on at least one occasion
worked when the official version (int86() or whatever it is called) jammed up.

long _ax,_bx,_cx,_dx,_si,_di,_bp,_es; short _flags;
/* The asm text says __ax etc. because djgpp prefixes C names with '_'. */
#define _carry (_flags&1)        /* carry flag is bit 0 */
#define _zeroflag (_flags&0x40)  /* zero flag is bit 6 */
#define __SR() /* save the registers */           ({asm("xchgl %eax,__ax"); \
    asm("xchgl %ebx,__bx"); asm("xchgl %ecx,__cx"); asm("xchgl %edx,__dx"); \
    asm("xchgl %esi,__si"); asm("xchgl %edi,__di"); asm("xchgl %ebp,__bp"); })
#define __RR() /* restore the registers */      ({ \
    asm("pushf"); asm("popw __flags"); __SR();})
void int10() {__SR(); asm("int $0x10"); __RR();}
void int21() {__SR(); asm("int $0x21"); __RR();} /* etc for each interrupt */
typedef unsigned char byte;
/* DOS function 7 (AH=7): read one character without echo, returned in AL */
byte get__key(){_ax=0x700; int21(); return _ax&255;}
/* An extended key arrives as 0 then a second code; return that code negated */
int get_key(){return get__key()?:-get__key();}

> I'm confident xchgl is more than a factor of two slower than movl (on the
> Pentium!), and of course it's not pairable on the Pentium at all.
    (a) asm("xchgl %eax,__ax");
    (b) asm("xorl %eax,__ax"); asm("xorl __ax,%eax"); asm("xorl %eax,__ax");
    (c) asm("movl %eax,XXX"); asm("movl __ax,%eax"); asm("movl XXX,__ax");
        (where XXX is some spare 32-bit register)
  (a), (b) and (c) here all seem to have the same effect. At first I used
(b), but I changed to (a) because it has fewer instructions. Which of these
three versions is fastest to run, and by how much? I suppose that how much such
slowness matters depends on how often the code is called; with me the main use
is when I read a character from the keyboard.
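
  For comparison, here is a minimal sketch of the three variants as C
functions. The names swap_xchg, swap_xor and swap_mov are hypothetical and do
not come from the post; "reg" stands for %eax and "mem" for __ax. The xchgl
version uses gcc's extended asm and falls back to plain C on non-x86 targets.
On x86 an xchg with a memory operand is implicitly locked, which is the usual
explanation for it being slow. The sketch only checks that the three have the
same effect; it measures nothing.

```c
#include <stdint.h>

/* (b) three XORs, in the same order as the asm in the post.
   Note: this zeroes the value if reg and mem point at the same object. */
void swap_xor(uint32_t *reg, uint32_t *mem)
{
    *mem ^= *reg;   /* xorl %eax,__ax */
    *reg ^= *mem;   /* xorl __ax,%eax */
    *mem ^= *reg;   /* xorl %eax,__ax */
}

/* (c) three moves through a spare location (XXX in the post). */
void swap_mov(uint32_t *reg, uint32_t *mem)
{
    uint32_t spare = *reg;  /* movl %eax,XXX  */
    *reg = *mem;            /* movl __ax,%eax */
    *mem = spare;           /* movl XXX,__ax  */
}

/* (a) a single xchgl between a register and memory. */
void swap_xchg(uint32_t *reg, uint32_t *mem)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t r = *reg;
    /* "+r" and "+m" tell gcc that both operands are read and written. */
    __asm__ ("xchgl %0, %1" : "+r"(r), "+m"(*mem));
    *reg = r;
#else
    swap_mov(reg, mem);     /* assumption: portable fallback off x86 */
#endif
}
```

  Calling each function on the same pair of values should leave the pair
exchanged in all three cases.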

> but I was timing this stuff with the Pentium's rdtsc instruction so
  What does rdtsc do?
  Where can I get a full list of the assembly instructions available on all PC
processors, including the Pentium?
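
  For context: rdtsc is the Pentium instruction that reads the processor's
64-bit time-stamp counter into EDX:EAX. A minimal sketch of reading it from
gcc follows. The function name read_tsc is hypothetical, and the non-x86
branch that returns 0 is only an assumption so that the sketch compiles
everywhere.

```c
#include <stdint.h>

/* Returns the time-stamp counter, or 0 where rdtsc is not available. */
uint64_t read_tsc(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;
    /* rdtsc leaves the low half in EAX ("=a") and the high half in EDX ("=d").
       volatile keeps gcc from dropping or merging repeated reads. */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;   /* assumption: no counter on this target */
#endif
}
```

  Timing a piece of code then amounts to calling read_tsc() before and after
it and subtracting the two values.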

  What much of this may boil down to is this: how can I tell djgpp's optimizer
to keep its paws off a particular short part of my program, so that the
instructions in that part are compiled as I wrote them, without being shuffled,
without any of them being omitted, and without instructions from elsewhere
being interpolated?
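
  One way gcc supports this is a single asm statement marked volatile, with
its inputs, outputs and clobbers declared. The compiler may place other code
between separate asm() statements, but it does not reorder or drop
instructions inside one statement. The sketch below is a small stand-in and
not the interrupt code from this post; the function name add_with_carry_out is
hypothetical. It adds two numbers and captures the carry flag, which only
works if no other instruction comes between the addl and the setc.

```c
#include <stdint.h>

/* Adds b to *a and returns the carry flag (1 if the add overflowed 32 bits). */
int add_with_carry_out(uint32_t *a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint8_t carry;
    __asm__ __volatile__ (
        "addl %2, %0\n\t"   /* *a += b, sets the carry flag          */
        "setc %1"           /* copy the carry flag into a byte       */
        : "+r"(*a),         /* %0: read and written                  */
          "=q"(carry)       /* %1: written, needs a byte register    */
        : "r"(b)            /* %2: read only                         */
        : "cc", "memory");  /* flags change; do not cache memory     */
    return carry;
#else
    /* assumption: portable fallback with the same result off x86 */
    uint32_t sum = *a + b;
    int carry = sum < b;
    *a = sum;
    return carry;
#endif
}
```

  For example, with x holding 0xFFFFFFFF, add_with_carry_out(&x, 1) returns 1
and leaves x at 0.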
