From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: 14 Apr 2000 14:47:50 GMT
Lines: 62
Message-ID: <8d7i78.3vvqipv.0@buerssner-17104.user.cis.dfn.de>
References: <38F20E7A DOT 3330E9A4 AT mtu-net DOT ru> <38F23A21 DOT A59621A1 AT inti DOT gov DOT ar> <38F49A45 DOT 13F0AB1 AT mtu-net DOT ru> <8d4ca1 DOT 3vvqqup DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38F60DB3 DOT E355975 AT mtu-net DOT ru> <8d5ljq DOT 3vvqipv DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38F6A29B DOT 3AAEC0E AT mtu-net DOT ru>
NNTP-Posting-Host: pec-107-11.tnt6.s2.uunet.de (149.225.107.11)
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: fu-berlin.de 955723670 7804338 149.225.107.11 (16 [17104])
X-Posting-Agent: Hamster/1.3.13.0
User-Agent: Xnews/03.02.04
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Alexei A. Frounze wrote:

>Dieter Buerssner wrote:
>>
>> Alexei A. Frounze wrote:
>>
>> I have not said, that the plain C code would be faster or slower.
>> I just asked a question, that may be not to difficult to answer
>> for you, because the C code is already there, in comments.

Note that I was referring to the T_Map function here, which does not
need to fiddle with the FPU control word.

>Not really. Actually, my ASM code improves the performance greatly. And

Can you quantify this?

>I can't copmare ASM vs optimized plain C because GCC/AS doesn't compile
>the source with -O2 switch.

Why not?

With all the uses of the address-of operator "&", gcc won't be able
to optimize your code very much. If you use &var, var usually cannot
be kept in a register. I have shown you one example where this can be
clearly seen. Also, the inner loops seem to be written almost entirely
in inline assembler, so you can't expect much optimization anyway.
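
To illustrate the point about "&", here is a minimal sketch in plain C.
It is not taken from the code under discussion; the names add_to,
sum_via_address and sum_by_value are made up for this example. Both
functions compute the same sum, but the first one hands the address of
its accumulator to a helper, so the accumulator may have to live in
memory, while the second one leaves the compiler free to keep it in a
register. What a given gcc version really does depends on the version
and the flags (it may, for instance, inline the helper), so compare the
output of gcc -O2 -S yourself.

```c
/* Hypothetical helper: it accumulates through a pointer, so every
   caller has to pass the address of its own variable. */
void add_to(long *acc, int value)
{
    *acc += value;
}

/* Variant 1: the address of sum is taken with "&". Unless the
   compiler inlines add_to(), sum must be stored in memory so that
   the helper can reach it through the pointer. */
long sum_via_address(const int *data, int n)
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        add_to(&sum, data[i]);
    return sum;
}

/* Variant 2: the address of sum is never taken, so the compiler is
   free to keep it in a register for the whole loop. */
long sum_by_value(const int *data, int n)
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += data[i];
    return sum;
}
```

Called on the same array, both variants return the same value; any
difference shows up only in the generated assembly.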

To "fix" your code temporarily, you could try to replace all "g"
constraints with "r", and add "memory" to the clobber list. But of
course, you shouldn't do this blindly. Then you may be able to compile
it with -O2. Post the results.

>> Not here. When compiling your code with gcc -O -S (gcc 2.95.2),
>> for the interesting lines
>> So, this happens to produce correct code, even if the inline
>> assembly is wrong, as I explained in another post.

After looking at my own post, I am not sure whether this is correct.
There are two occurrences of the code snippet in the source. I looked
at the assembler output of one, which seemed correct. I pasted the
other, and that one seems incorrect. Anyway, the C code produces better
assembly than your inline code with the references ever could.

>But -O is not enough for me. ;) I still need -O2.

Then try it yourself with -O2. It takes less than a minute, and you
will see no significant difference.

As you have noted, gcc may have optimization bugs. (It certainly has;
I have found at least one in the current version.) But throwing buggy
code at gcc and then complaining is hardly a way to prove this.

One final suggestion: Stay away from inline assembly (for now). I have
seen some snippets you have posted (and mailed), and almost all of
them were clearly wrong.

To cite Donald E. Knuth from memory: "Premature optimization is the
devil of all programming."

--
Regards, Dieter