From: "Alexei A. Frounze"
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Tue, 18 Apr 2000 04:47:58 +0400
Organization: MTU-Intel ISP
Lines: 38
Message-ID: <38FBB0BE.398F2E92@mtu-net.ru>
References: <38F9D717 DOT 9438A3F6 AT mtu-net DOT ru> <8df84a DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB4094 DOT DE7B5F4C AT mtu-net DOT ru> <8dfum2 DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB7858 DOT 41B090DB AT mtu-net DOT ru>
NNTP-Posting-Host: ppp97-207.dialup.mtu-net.ru
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
X-Trace: gavrilo.mtu.ru 956018945 9232 212.188.97.207 (18 Apr 2000 00:49:05 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 18 Apr 2000 00:49:05 GMT
Cc: buers AT gmx DOT de
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

"Alexei A. Frounze" wrote:

> Well, let me say a few words in conclusion. ;)
>
> 1. You simply proved that GCC has a sufficiently efficient optimizer. Okay, I
> agree. Your code, which runs 2 FPS faster for you, runs at the same speed as
> before for me. I don't think that makes it faster than mine (just 2.9%).
> So, we have a good optimizer and you proved this. Great. I'm glad. This means I
> can throw away a lot of inline ASM now.
>
> 2. If I had known that (int)(x) is slow, and if I had had a proper manual on
> inline ASM, I would have achieved the same with fewer problems.
>
> 3. Dieter, I hope you won't try to convert span() to plain C. :)
> This replacement isn't nearly as fast:
> --------------8<----------------
> while (n--) {
>   *scr++ = *(texture+((v1>>8)&0xFF00)+((u1>>16)&0xFF));
>   u1 += du;
>   v1 += dv;
> };
> --------------8<----------------
>
> Anyway, thank you. And btw, thanks to myself. If I hadn't written efficient C
> code between /* */ :), Dieter could never have proved that GCC has a good
> optimizer, because he doesn't know the tmapping algorithm (do you?).

And btw, the main trick:

4. The FIDIVRL instruction is executed in parallel with span(). If we remove
this inline code and put C in its place, we'll lose performance. It's also proof
that my inline asm is not redundant. :)

bye.
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror: http://members.xoom.com/alexfru
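
For anyone reading the thread later, here is a minimal sketch of what the C loop quoted in point 3 appears to compute. It rests on assumptions the post does not state: the texture is taken to be 256x256 with one byte per texel, and u1, v1, du, dv are taken to be 16.16 fixed-point values (unsigned here, so the shifts and wraparound are well defined). The names span_c and fill_xor_texture are hypothetical and only serve the illustration; this is not the span() from the original program.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 256x256 texels, one byte each, row-major. */
static uint8_t texture[256 * 256];

/* Hypothetical helper: fill the texture with a simple x XOR y pattern
   so that sampled values are easy to check by hand. */
static void fill_xor_texture(void)
{
    int x, y;
    for (y = 0; y < 256; y++)
        for (x = 0; x < 256; x++)
            texture[y * 256 + x] = (uint8_t)(x ^ y);
}

/* Hypothetical wrapper around the loop quoted in the post.
   u1, v1, du, dv are assumed to be 16.16 fixed-point values. */
static void span_c(uint8_t *scr, int n,
                   uint32_t u1, uint32_t v1, uint32_t du, uint32_t dv)
{
    assert(n >= 0); /* a negative count would make while (n--) run away */
    while (n--) {
        /* (v1 >> 8) & 0xFF00 is the integer part of v times 256 (row offset);
           (u1 >> 16) & 0xFF is the integer part of u (column). */
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; /* step the texture coordinates by one screen pixel */
        v1 += dv;
    }
}
```

Under these assumptions, calling fill_xor_texture() and then span_c(line, 4, 1u << 16, 2u << 16, 1u << 15, 1u << 14) starts at (u, v) = (1.0, 2.0), steps by (0.5, 0.25) per pixel, and samples texels (1,2), (1,2), (2,2), (2,2). The masks make both coordinates wrap at 256, so the loop needs no bounds checks.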