Date: Wed, 17 Jun 1998 09:12:15 +0300 (IDT)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
To: "Salvador Eduardo Tropea (SET)" <salvador AT inti DOT gov DOT ar>
cc: djgpp AT delorie DOT com, lubaldo AT adinet DOT com DOT uy
Subject: Re: 64k demo
In-Reply-To: <m0ylxKP-000S41C@inti.gov.ar>
Message-ID: <Pine.SUN.3.91.980617090437.18792E-100000@is>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk


On Tue, 16 Jun 1998, Salvador Eduardo Tropea (SET) wrote:

> > At the very least, you should look at the 
> > code emitted by the compiler (gcc -S) and show that it indeed generates 
> > different code in these cases, and that the differences can indeed 
> > justify the speed variation you observed.
> 
> Yes, that's the best thing, but Ivan doesn't know assembler, so it isn't a
> big help for him.

Then somebody else should do that.  I have found through hard experience 
that runtime behavior of any non-trivial code is far from being obvious.  
You need to look at the machine instructions to understand why two 
different versions run at different speeds, and even then it is not 
always clear, with all the complexities of code/data cache, secondary 
cache, non-cacheable memory regions, etc.  Theory alone is *certainly* not 
enough.

> Intel processors have only a few registers; that's sad, but it is a fact.
> Small loops sometimes use ALL the registers, so one more register can make
> a BIG difference in performance. Using static arrays, you save registers.

This is theory.  How, if at all, it applies to the case in point, remains 
to be shown.  I have spent too much time trying to explain away 
performance differences between semantically identical programs, and lost 
all faith in such theoretical arguments in the process.

The least we should do is to look at the code produced by the compiler. 
Even if it does use an additional register, that doesn't necessarily 
explain the performance difference.