From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: 15 Apr 2000 17:10:07 GMT
Lines: 49
Message-ID: <8daeu9.3vs7iub.0@buerssner-17104.user.cis.dfn.de>
References: <38F20E7A DOT 3330E9A4 AT mtu-net DOT ru> <38F23A21 DOT A59621A1 AT inti DOT gov DOT ar> <38F49A45 DOT 13F0AB1 AT mtu-net DOT ru>
NNTP-Posting-Host: pec-142-223.tnt9.s2.uunet.de (149.225.142.223)
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: fu-berlin.de 955818607 7786722 149.225.142.223 (16 [17104])
X-Posting-Agent: Hamster/1.3.13.0
User-Agent: Xnews/03.02.04
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Alexei A. Frounze wrote:

>Here goes a part of my project. I simply removed as much code as 
>needed to leave it along.

I already replied to the same post, but on rethinking it I realized
that I missed a subtle and perhaps important point.

I do not know of any way to tell gcc how many registers on the
floating point stack you need, and this seems to make reliable
inline assembly with floating point very difficult. The reason is
that when not everything is written inline (you have written almost
all floating point in assembly), gcc might want to leave some values
on the floating point stack when it thinks it can reuse them later.
I cannot think of a way to see this from the source in general.

{
  double x, y;
  /* calc x somehow (1) */
  /* assembler, uses all 8 FPU registers */
  /* calc y */
  x *= y; /* x is reused here, so gcc may keep it in st0 at (1) */
}

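So, you might get a floating point stack overflow: with x still in
st0, the assembler block has only seven free registers left, not
eight.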

It could also be that using __asm__ volatile, together with
"memory" in the clobber list, will prevent gcc from caching
those floating point values in the FPU, but I don't know, and
I can't remember seeing this mentioned anywhere. You might want
to investigate this. Hopefully some readers of this group know
more, or can suggest where the information can be found.
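
For anyone who wants to experiment with this, here is a minimal
sketch of the syntax only. It assumes an x86 target and gcc's AT&T
assembler syntax; the function name sqrt_via_memory is made up for
illustration. The value is passed in and out through memory operands
("m"), so the statement leaves the FPU stack depth unchanged, but it
still needs one free FPU register while it runs. Whether volatile
plus the "memory" clobber stops gcc from holding other values on the
FPU stack across the statement is exactly the open question; this
code does not answer it.

```c
/* Sketch only: x86 target and gcc extended asm assumed.
   sqrt_via_memory is a hypothetical name, not from the original code. */
double sqrt_via_memory(double x)
{
    double result;

    __asm__ volatile (
        "fldl  %1\n\t"   /* push x from memory onto the FPU stack */
        "fsqrt\n\t"      /* st(0) = sqrt(st(0)) */
        "fstpl %0"       /* store st(0) to result and pop it again */
        : "=m" (result)  /* output goes straight to memory */
        : "m" (x)        /* input comes straight from memory */
        : "memory");     /* the clobber under discussion */

    return result;
}
```

Comparing the output of gcc -S for a surrounding function with and
without the "memory" clobber would be one way to see what gcc
actually does with values it computed before the statement.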

There are "f" (floating point register), "t" (top of stack, st0) and
"u" (st1) constraints. But it seems difficult to work with those,
because the different registers "move" on the floating point
stack when you use the floating point opcodes. So this really is
much more complicated than working with "normal" registers.
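
As a small illustration of the "t" and "u" constraints, here is a
sketch along the lines of the examples in the GCC manual's section
on floating point stack operands. It assumes an x86 target; the
function names x87_sqrt and x87_sincos are made up for this post.
Both cases are easy ones, because each is a single instruction
whose effect on the stack is fully described by the operands.

```c
/* Sketch only: x86 target and gcc extended asm assumed.
   x87_sqrt and x87_sincos are hypothetical names. */

/* fsqrt replaces st(0) in place, so the input uses the same register
   as the output: "0" ties it to operand 0, which is "=t" (st0). */
double x87_sqrt(double x)
{
    double result;

    __asm__ ("fsqrt" : "=t" (result) : "0" (x));
    return result;
}

/* fsincos replaces st(0) with the sine and then pushes the cosine,
   so afterwards the cosine is in st(0) and the sine is in st(1).
   This is the "moving" of registers: the sine ends up one slot
   further down than the place where the angle was loaded. */
void x87_sincos(double angle, double *sin_out, double *cos_out)
{
    double s, c;

    __asm__ ("fsincos" : "=t" (c), "=u" (s) : "0" (angle));
    *sin_out = s;
    *cos_out = c;
}
```

With several instructions inside one asm statement you have to track
every push and pop by hand, which is where it gets complicated.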

Regards, Dieter

P.S.: Those semicolons after the if { }; blocks are not needed, and
may even become dangerous when you need to change the source. For
example, adding an else after such a block will no longer compile.

P.P.S.: I have compiled parts of your code with the commented-out C
code enabled instead of the inline assembly, and gcc -O seems to
produce quite efficient code.