Mail Archives: pgcc/1999/05/19/05:04:34

Message-ID: <19990519105631.40676@atrey.karlin.mff.cuni.cz>
Date: Wed, 19 May 1999 10:56:31 +0200
From: Jan Hubicka <hubicka AT atrey DOT karlin DOT mff DOT cuni DOT cz>
To: pgcc AT delorie DOT com
Subject: Re: Benchmark PGCC vs EGCS on a K6-2
References: <373F3AA2 DOT A446D611 AT informatik DOT hu-berlin DOT de> <Pine DOT LNX DOT 4 DOT 10 DOT 9905181826020 DOT 1284-100000 AT data DOT mandrakesoft DOT com>
Mime-Version: 1.0
X-Mailer: Mutt 0.84
In-Reply-To: <Pine.LNX.4.10.9905181826020.1284-100000@data.mandrakesoft.com>; from Bernhard Rosenkraenzer on Tue, May 18, 1999 at 06:26:47PM +0000
Reply-To: pgcc AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> On Sun, 16 May 1999, Jens-Uwe Rumstich wrote:
> 
> > - although EGCS does not have specific K6 options, the fastest
> > combination was found with EGCS, not with PGCC
> 
> Try pgcc with -O6 -mk6 -march=k6 -pipe -s -fno-exceptions -fno-rtti
> -fomit-frame-pointer

Hi
About a year ago I did some tuning of egcs for the K6-2. I removed some of the
K6-2-specific optimizations because they seemed to produce slower code. There
seems to be a significant problem in the K6 documentation: it recommends things that
often cause a performance loss. The author of the original K6 support for egcs just
blindly followed its recommendations, so many of his changes were performance losses
(especially changing xor reg,reg to mov reg,0).
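
For anyone who wants to check which zeroing form their compiler emits, here is a
minimal sketch in C. The function names are made up for illustration and are not
taken from egcs. Both functions force the compiler to put a zero into a register;
what it actually emits depends on the compiler version and the options used.

```c
/* Hypothetical test functions for inspecting how a compiler zeroes a register. */

/* Returns zero: the compiler has to materialize 0 in the return register,
   typically with either "xor reg,reg" or "mov reg,0". */
int return_zero(void)
{
    return 0;
}

/* Counts the set bits in x. The counter n starts at zero, so a register
   usually has to be cleared before the loop begins. */
int count_bits(unsigned x)
{
    int n = 0;
    while (x) {
        n += (int)(x & 1u);
        x >>= 1;
    }
    return n;
}
```

Compiling the file with `gcc -O2 -S` and reading the generated assembly shows which
of the two forms was chosen for each function.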

After some tuning and adding a few new optimizations, I avoided all slowdowns
in my benchmark suite compared to the -mpentium and -mpentiumpro optimizations.
(Surprisingly enough, -m386 works very well for the K6 too, and there were still a few
cases where it was a win.)
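
A comparison like this can be reproduced with a very small timing harness. The
sketch below is only an illustration: the workload in kernel() is a made-up
stand-in, not one of the benchmarks from my suite, and clock() measures CPU time
with limited resolution, so the iteration count has to be large enough to matter.

```c
#include <time.h>

/* Hypothetical workload: a dependent chain of integer operations.
   Replace it with the code you actually care about. */
static unsigned long kernel(unsigned long iterations)
{
    unsigned long acc = 0, i;
    for (i = 0; i < iterations; i++)
        acc = acc * 31u + (i ^ (acc >> 3));
    return acc;
}

/* Runs kernel() once and returns the CPU time it took, in seconds.
   The result is stored through sink so the call cannot be optimized away. */
double time_kernel(unsigned long iterations, unsigned long *sink)
{
    clock_t start = clock();
    *sink = kernel(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Build the same file once per option set mentioned above, call time_kernel() with the
same iteration count in each build, and compare the returned times.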

Many (not all) of these changes are in recent egcs snapshots (aka gcc 2.95.0). Because
I no longer have access to this CPU, I would love to hear about your results with
this version of gcc.

The K6 seems to have serious problems with decoding speed. I've made new haifa scheduler
hooks for decoding that worked quite well (I also have versions for the Pentium and PPro
available; the PPro version is untested).
On the K6 they brought quite large speedups (-10% to 500%, usually about 10%), but the
changes necessary to i386.md are quite large, so it would take a lot of time to add them
into gcc.

Honza
> 
> LLaP
> bero
> 
> 

-- 
                       OK. Lets make a signature file.
+-------------------------------------------------------------------------+
|        Jan Hubicka (Jan Hubi\v{c}ka in TeX) hubicka AT freesoft DOT cz         |
|         Czech free software foundation: http://www.freesoft.cz          |
|AA project - the new way for computer graphics - http://www.ta.jcu.cz/aa |
|  homepage: http://www.paru.cas.cz/~hubicka/, games koules, Xonix, fast  |
|  fractal zoomer XaoS, index of Czech GNU/Linux/UN*X documentation etc.  | 
+-------------------------------------------------------------------------+
