X-POP3-Rcpt: mlehmann AT universe DOT sgh-net DOT de 24 Jan 1998 23:50:49 +0100 (CET)
From: Ronald Wahl
X-Sender: rwa AT goliath DOT csn DOT tu-chemnitz DOT de
To: Holger Burbach
cc: Jack Duan, Brian Makin, beastium-list AT Desk DOT nl
Subject: Re: PGCC optimizing AMD K6?
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: Marc Lehmann
Status: RO
X-Status: A
Content-Length: 1566
Lines: 40

On Wed, 21 Jan 1998, Holger Burbach wrote:

> On Wed, 21 Jan 1998, Ronald Wahl wrote:
>
> > No test w/o -funroll-loops? I've had discovered that it will result in
> > slower code. Maybe this has changed with pgcc/egcs-980115 but it would be
> > nice to see the results.
>
> Okay, here they are!
> [...]

Thanx...

Since pgcc-980122 is out, can you verify that -ffast-math (w/o
-funroll-loops) slows down some integer benches?

The neural net bench still doesn't return if -funroll-loops or
-funroll-all-loops is used. Has anybody checked if this is a problem of
egcs or only pgcc? Maybe we should report this problem on the egcs-list...

But anyway it's nice to see that -funroll-loops is a big win...

ron

PS: If you send benches it would be nice to see how -ffast-math influences
the results. At the moment it's not generally a win and seems to slow down
integer code (how so?).

PPS (for Marc): Since I've seen many fxch instructions in the assembly
output of nbench I have to note that these will not improve performance
like on a Pentium. If it's possible we should remove these. Minimizing the
number of fpu instructions should be one of the goals on a K6 since most
of these have a latency of 2 cycles and need two cycles to execute.

--
\ Ronald Wahl --- rwa AT informatik DOT tu-chemnitz DOT de \
 \ WWW: http://www.tu-chemnitz.de/~row \
  \ Talk: rwa AT goliath DOT csn DOT tu-chemnitz DOT de \
   \ PGP key available by finger to my email address \