To: djgpp AT delorie DOT com
Subject: Re: DJGPP Optimizing
Message-ID: <19970101.092917.7735.0.chambersb@juno.com>
References: <19961231 DOT 163855 DOT 4983 DOT 3 DOT chambersb AT juno DOT com>
From: chambersb AT juno DOT com (Benjamin D Chambers)
Date: Tue, 31 Dec 1996 12:24:21 EST


On 31 Dec 1996 10:03:11 GMT semler@@delorie.com writes:
>In <19961231 DOT 163855 DOT 4983 DOT 3 DOT chambersb AT juno DOT com>, chambersb AT juno DOT com 
>(Benjamin D Chambers) writes:
>
>Ok thanks for the hints but I'm still having troubles:
>
>1) The inner loop I talked about is
>
>unsigned long i,run,u0,v0,du,dv,mask;
>long *b1,b2;
>
>for(i=run;i--) {
Sorry, I don't fully understand how for behaves with only two
arguments (a drawback of learning as I go rather than from books I
don't(?) need).  The way I'm used to, I would write the above as:
for(i=run;i>0;i--)
          ^^^ (That's what you're going for, isn't it?)
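
A note on the loop header: C's for always takes three clauses separated
by two semicolons, though any of them may be empty, so the quoted line
presumably stands for for(i=run;i--;).  Assuming that reading, the
condition tests i and then decrements it, and the body runs exactly as
often as with the three-clause form above.  The decl/cmpl $-1/jne at the
bottom of the quoted listing fits this reading.  A minimal sketch to
check the iteration counts (count_postdec, count_classic and
check_equivalent are illustrative names, not from the original code):

```c
#include <assert.h>

/* Post-decrement form with the third clause left empty:
   test i, then decrement; the loop stops once i was 0. */
unsigned long count_postdec(unsigned long run)
{
    unsigned long i, n = 0;
    for (i = run; i--; )
        n++;
    return n;
}

/* The familiar three-clause form. */
unsigned long count_classic(unsigned long run)
{
    unsigned long i, n = 0;
    for (i = run; i > 0; i--)
        n++;
    return n;
}

/* Both forms should execute the body the same number of times. */
void check_equivalent(unsigned long run)
{
    assert(count_postdec(run) == count_classic(run));
}
```

The two forms differ only in what i holds: inside the body it counts
run-1 down to 0 in the first and run down to 1 in the second, and after
the first loop an unsigned i has wrapped around rather than ending at 0.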

>  u0 = (u0+du)&mask;
>  v0 = (v0+dv)&mask;
>  *b1++ = b2[((((unsigned long)v0)>>16)<<7)+((((unsigned long)u0)>>16))];
>}
>
>This compiles to:
>
>L1220:
>	fxch %st(2)
>	fxch %st(4)
>	fxch %st(5)
>	fxch %st(3)
>	fxch %st(6)
>	fxch %st(2)
>	fxch %st(2)
>	fxch %st(6)
>	fxch %st(3)
>	fxch %st(5)
>	fxch %st(4)
>	fxch %st(2)
>	movl -132(%ebp),%eax
>	addl -232(%ebp),%eax
>	movl -16(%ebp),%edx
>	andl %edx,%eax
>	movl %eax,-232(%ebp)
>	movl -136(%ebp),%eax
>	addl %ebx,%eax
>	movl %eax,%ebx
>	andl %edx,%ebx
>	movl %ebx,%eax
>	shrl $16,%eax
>	sall $7,%eax
>	movl -232(%ebp),%edx
>	shrl $16,%edx
>	addl %edx,%eax
>	movl -120(%ebp),%esi
>	movl (%esi,%eax,4),%eax
>	movl -116(%ebp),%edi
>	movl %eax,(%edi)
>	addl $4,%edi
>	movl %edi,-116(%ebp)
>L1209:
>	decl %ecx
>	cmpl $-1,%ecx
>	jne L1220
>
>And the first 12 instructions are here in question... What do they do?
>Why are they here?
I have no idea what they are - my reference doesn't list the FPU
instruction set :)
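
For what it's worth, on the x87 FPU fxch %st(i) exchanges the contents
of st(0) and st(i).  The twelve operands in the listing read the same
backwards as forwards, and each exchange undoes itself, so the run as a
whole should leave the register stack as it found it.  A minimal C
sketch to check that, using a plain int array as a stand-in for the
eight FPU registers (fxch and fxch_run_is_noop are illustrative names,
and the model ignores tags and status flags):

```c
#include <assert.h>

#define X87_REGS 8

/* Model of fxch %st(i): swap st(0) with st(i). */
void fxch(int st[X87_REGS], int i)
{
    int tmp;
    assert(i >= 1 && i < X87_REGS);
    tmp = st[0];
    st[0] = st[i];
    st[i] = tmp;
}

/* Apply the twelve exchanges from the listing, in order, and
   return 1 if every register ends up holding its starting value. */
int fxch_run_is_noop(void)
{
    static const int seq[12] = { 2, 4, 5, 3, 6, 2, 2, 6, 3, 5, 4, 2 };
    int st[X87_REGS], k;

    for (k = 0; k < X87_REGS; k++)
        st[k] = k;              /* label each register with its index */
    for (k = 0; k < 12; k++)
        fxch(st, seq[k]);
    for (k = 0; k < X87_REGS; k++)
        if (st[k] != k)
            return 0;
    return 1;
}
```

If the check holds, the exchanges do no useful work for this integer
loop.  Why GCC emits them is a separate question that the sketch does
not answer.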

>
>2) I know that the intel processors has troubles with domains, but is
>there no standard behaviour for the Gnu C about what to expect on this
>matter?
Usually, GCC is pretty good about producing reliable code.  I suppose
something elsewhere in your code could cause GCC to do this, but I can't
think of what.  Maybe you've stumbled on a compiler bug - check whether
GCC always does this (or at least does it under similar conditions).
Since you know a bit about ASM (else you wouldn't be looking at the asm
output), you might just want to code the inner loop in asm yourself,
rather than letting GCC do it.
Beyond that, I'm stumped.

...Chambers
>
>Kind regards and a Happy New Year!
>Henning Semler
>
Ditto.