Mail Archives: pgcc/1998/07/13/22:56:17

X-pop3-spooler: POP3MAIL 2.1.0 b 4 980420 -bs-
Message-ID: <19980714004206.16665@cerebro.laendle>
Date: Tue, 14 Jul 1998 00:42:06 +0200
From: Marc Lehmann <pcg AT goof DOT com>
To: Misha <vulcao AT netvision DOT net DOT il>
Cc: beastium <beastium-list AT Desk DOT nl>
Subject: Re: PGCC's lack of optimizations... (slightly lengthy)
Mail-Followup-To: Misha <vulcao AT netvision DOT net DOT il>,
beastium <beastium-list AT desk DOT nl>
References: <35A9E060 DOT 34A50938 AT netvision DOT net DOT il>
Mime-Version: 1.0
In-Reply-To: <35A9E060.34A50938@netvision.net.il>; from Misha on Mon, Jul 13, 1998 at 01:24:32PM +0300
X-Operating-System: Linux version 2.1.108 (root AT cerebro) (gcc version pgcc-2.91.43 19980628 (gcc2 ss-980502 experimental))
Status: RO
Lines: 77

On Mon, Jul 13, 1998 at 01:24:32PM +0300, Misha wrote:
> 
> I am trying to compile some  number-crunching stuff on my Linux
> (PentiumII). I have both gcc-2.7.2.1 and pgcc-1.0.3.
> The point is that pgcc produces consistently WORSE code than gcc-2.7.2.1
> on both floating point and integer issues.
> In all cases it produces code that is approx. 5% to 25% slower on the PentiumII.
> I have read the entire pgcc documentation, so I believe I use all the appropriate

I guess the number-crunching code is FPU-intensive? In that case, correct
alignment of doubles is essential; otherwise performance is unpredictable and
might well be much slower than with gcc.

Which libc are you using? If you use libc5 or a glibc older than 2.0.6,
consider upgrading.

Have you used the -malign-double flag (and maybe -mstack-align-double)?
Without these flags, FP performance is unpredictable as well. (Newer snapshots
align many static variables automatically, so it might be worth giving them
a try.)

You might also want to try -mpentiumpro and -march=pentiumpro to ensure
egcs/pgcc actually produces code for your CPU.

(A final tip, independent of this issue: it often helps to use
-funroll-all-loops and/or -fschedule-insns.)

The reason for all this is that the default x86 ABI specifies a highly
suboptimal alignment for doubles. Changing this alignment breaks the ABI and
_might_ require that you recompile all code, including all libraries you use,
so this can't be on by default.

> I can't send you the code, but I can tell you that it is some sort of a DSP-kind

If you are sure you have the correct alignment and the problem still persists,
I'd really like to get this problem fixed.

> It is a bit sad that the compiler that produces i486 code produces better code than
> the compiler that produces Pentium code. I still hope I might be doing something wrong...

It's an interesting question who actually made an error. At the time the x86 ABI was
created, double alignment was not a problem. With modern CPUs (Pentium and above)
it is.

> 1.  Is the problem known?

probably.

> 2.  Are there any tools like SGI's "perfex"  available for Linux?
>      The "perfex" tool executes the code and then reports the statistics from the
>      CPU internal event counters, so you have a picture of, say, how many L1 and L2
>      cache misses were, the FPU unit utilization, mispredicted branches, etc...

There are various patches floating around that make use of the performance
monitoring registers under Linux, but I do not have a pointer ;(

> maximum optimization option, it produces the assembly code, but it also
> places some statistics on the success of optimizations in the code! For
> instance, in tight loops it gives you software pipelining success,
> parallelization success and CPU unit utilization in %.

If gcc only had this information itself... doing this for x86 is much more
difficult than for a sane architecture. The performance of code on pentiums
depends on such things as stepping or # of bugs fixed :()

> such a tool be available whether as a part of pgcc or otherwise. I would

Hmm, that would be way cool, yet I do not know what this might be good for,
except for tracking down bugs or similar tasks ;)

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg AT goof DOT com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |
