Mail Archives: pgcc/2000/06/13/22:26:36

Sender: nuetzel AT delorie DOT com
Message-ID: <3946ED8D.7EF4134E@myokay.net>
Date: Wed, 14 Jun 2000 04:27:25 +0200
From: Dieter Nützel <dieter DOT nuetzel AT myokay DOT net>
Organization: DN
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.4.0-test1-ac7-cl31 i686)
X-Accept-Language: de-DE, de, en-US, en-GB, en
MIME-Version: 1.0
To: pgcc AT delorie DOT com
CC: Marc Lehmann <pcg AT opengroup DOT org>
Subject: Athlon -- has someone the doku handy?
Reply-To: pgcc AT delorie DOT com

Hello,

one flag name (for Athlon model 1) and two (for Athlon model 2) are
missing in the current Linux kernel (2.4.0-test1-ac18). Alan Cox and I
would be very pleased if one of you has the AMD Athlon Programming
Reference handy.


Took some time to get the CD from AMD...

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 1
model name      : AMD-K7(tm) Processor
stepping        : 2
cpu MHz         : 548.952604
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov 16 mmxext mmx 3dnowext 3dnow
bogomips        : 1094.45

What is flag 16?


Now the Athlon 800 (nice thing :-)

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 2
model name      : AMD Athlon(tm) Processor
stepping        : 1
cpu MHz         : 798.470512
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov 16 pse36 mmxext mmx 24 3dnowext 3dnow
bogomips        : 1592.52

What are flags 16 and 24?
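
In both listings above, a flag the kernel has no name for shows up as its bare bit number. A minimal sketch of picking those entries out of a pasted flags line follows; the helper name unnamed_flag_bits is hypothetical, and the two strings are simply the flags lines quoted above, joined onto one line each.

```python
def unnamed_flag_bits(flags_line):
    """Return the bit numbers that appear as bare integers in a flags line."""
    # Drop the "flags :" label if it is present and keep only the tokens.
    _, _, tokens = flags_line.rpartition(":")
    # Named flags such as "pse36" or "cx8" are not purely numeric, so only
    # the unnamed entries pass the isdigit() check.
    return [int(tok) for tok in tokens.split() if tok.isdigit()]


k7_550 = ("flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca "
          "cmov 16 mmxext mmx 3dnowext 3dnow")
athlon_800 = ("flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca "
              "cmov 16 pse36 mmxext mmx 24 3dnowext 3dnow")

print(unnamed_flag_bits(k7_550))      # [16]
print(unnamed_flag_bits(athlon_800))  # [16, 24]
```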

Thanks,
        Dieter

BTW, I found the best combination of optimization flags for the Athlon.
GCC 2.96 CVS didn't come close, neither with the same flags nor with
-mcpu=athlon and/or -march=athlon!!! :-( Why?

In particular, plain -O (no level number), -mcpu=k6 and
-mpreferred-stack-boundary=2 (2 !!!) are needed.

-fomit-frame-pointer makes it worse!!! Avoid it if you can...

This combination gave the best result in an MFLOPS test (dgemm from
Quant-X, Alpha FPU test, source available):

-O -mcpu=k6 -mpreferred-stack-boundary=2 -malign-functions=4
-fschedule-insns2 -fexpensive-optimizations

K7-550

gcc -O -funroll-loops -DMAIN -o dgemm dgemm.c

SunWave1>./dgemm-O
m:1000 n:1000 k:1000
Ail_max 24, Blj_max 12, A_row_block 85
Shimizu's DGEMM : 147.493 MFLOPS(13.560 sec)
Shimizu's DGEMM : 147.493 MFLOPS(13.560 sec)
Shimizu's DGEMM : 147.601 MFLOPS(13.550 sec)

gcc -O -mcpu=k6 -mpreferred-stack-boundary=2 -malign-functions=4
-fschedule-insns2 -fexpensive-optimizations -DMAIN -o dgemm dgemm.c

SunWave1>./dgemm-k6
m:1000 n:1000 k:1000
Ail_max 24, Blj_max 12, A_row_block 85
Shimizu's DGEMM : 213.447 MFLOPS( 9.370 sec)
Shimizu's DGEMM : 213.220 MFLOPS( 9.380 sec)
Shimizu's DGEMM : 213.220 MFLOPS( 9.380 sec)

The K7-800 got ~222 and ~288 MFLOPS, respectively.
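
The K7-550 figures above are consistent with counting 2*m*n*k floating-point operations for the matrix product and dividing by the run time. That formula is inferred from the printed numbers, not taken from the dgemm source, and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
def dgemm_mflops(m, n, k, seconds):
    """MFLOPS for an m x k by k x n matrix product finished in `seconds`."""
    # Assumption: one multiply and one add per inner-loop step,
    # i.e. 2*m*n*k floating-point operations in total.
    return 2.0 * m * n * k / seconds / 1e6


baseline = dgemm_mflops(1000, 1000, 1000, 13.560)  # -O -funroll-loops build
tuned = dgemm_mflops(1000, 1000, 1000, 9.370)      # -O -mcpu=k6 ... build

print(f"{baseline:.3f} MFLOPS")             # 147.493 MFLOPS
print(f"{tuned:.3f} MFLOPS")                # 213.447 MFLOPS
print(f"speedup: {tuned / baseline:.2f}x")  # speedup: 1.45x
```

On these numbers the second flag set is roughly 45% faster on the K7-550.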

Any questions?

--
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
Cognitive Systems Group
Vogt-Kölln-Straße 30
D-22527 Hamburg, Germany

email: nuetzel AT kogs DOT informatik DOT uni-hamburg DOT de
@home: dieter DOT nuetzel AT myokay DOT net
