From: "Alexei A. Frounze"
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Mon, 10 Apr 2000 23:37:10 +0400
Organization: MTU-Intel ISP
Lines: 61
Message-ID: <38F22D66.42D78282@mtu-net.ru>
References: <38F20E7A DOT 3330E9A4 AT mtu-net DOT ru> <38F2250B DOT 1DC270D5 AT maths DOT unine DOT ch>
NNTP-Posting-Host: ppp102-126.dialup.mtu-net.ru
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
X-Trace: gavrilo.mtu.ru 955395489 73052 212.188.102.126 (10 Apr 2000 19:38:09 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 10 Apr 2000 19:38:09 GMT
X-Mailer: Mozilla 4.61 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Gautier wrote:
>
> "Alexei A. Frounze":
>
> > Why GCC output too much redundant code?
>
> With which optimisation level do you mean that ?

I thought it should be w/o specifying any optimization level... See below.

> > I mean, it always put values to the CPU registers, although it's possible to
> > make the same operation w/o taking registers?
>
> If you observed that with -O2, it is an optimisation strategy:
> when the most data are loaded in registers, the program has more chances to
> run at a speed that resembles the processor speed than at main board speed (roughly said...).
> It means to have less data to communicate with RAM, which takes big time
> resources.

Well, but it works with registers even when a value only needs to be zeroed,
shifted and so on. It loads registers in very simple situations, where it is
possible to adjust some bytes and be done.

> On Intel x86s there is not much to do - there are so few registers - but
> anyway GCC is very smart at register mapping !

Not so few. There are enough. Btw, the same 3D engine runs around 12% faster
when compiled with Watcom C than when compiled with GCC (-O2, no stack frames,
etc. etc.).
Btw, Watcom C can also produce (w/o special optimization switches) an inner
loop of the tmapper as fast as an assembly subroutine. I tested it: there is
no noticeable difference between pure C and ASM. :)
I don't think GCC is a very good optimizing compiler. Anyway, it has some
advantages over other compilers: it's free, it's portable and multiplatform.

> On processors with more recent design like Alpha or even the Motorola 68K, optimisers
> try to exploit at best the 36 (16) registers and the result is beautiful.
>
> > Also why GCC does type cast of byte/word <-> dword values so awful? It allocates
> > some extra bytes on the stack, put values there and get them back...
>
> On 32-bit x86 it is better to use only 8 of 32 bits than allocating and manipulating
> 1 byte... But maybe you just hit a weak typing problem (I mean GCC hesitates about
> the precise types, the sizes and so on) ?

It always works this way. So I forced my functions to have only 32-bit
parameters and return values. That solves this ugly problem.

> You should try an equivalent test on a very
> strong typed GCC front-end like GNAT...

What's GNAT?

bye.
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror: http://members.xoom.com/alexfru