X-pop3-spooler: POP3MAIL 2.1.0 b 3 961213 -bs-
Delivered-To: pcg AT goof DOT com
Date: Tue, 14 Apr 1998 02:30:18 +0200 (CEST)
From: Ronald Wahl
X-Sender: rwa AT goliath DOT csn DOT tu-chemnitz DOT de
To: beastium-list AT Desk DOT nl
Subject: [performance] newer binutils / pgcc / K6
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: Marc Lehmann
Status: RO
Content-Length: 7916
Lines: 151

Hi,

I noticed a performance problem starting with release 2.8.1.0.26 of
binutils. If I run nbench on my K6 with binutils 2.8.1.0.26 (or higher),
some tests run slower. My first thought was that it had something to do
with the changes hjl made from 2.8.1.0.25 to 2.8.1.0.26, but after some
further testing I found out that it is a code alignment issue. If I use
-malign-loops=2, the tests run at nearly the same speed as with the older
versions of binutils (gas). Some tests are a bit slower, but not by much
(--> see my appended nbench results). Other alignments cause slowdowns.

Before changing any defaults for loop alignment on a K6 in pgcc - is
anyone willing to play a bit with old and new releases of binutils and
some other benchmarks or real-world applications?

ron

-- 
\ Ronald Wahl --- rwahl AT gmx DOT net                   \ Give Gates no chance! /
 \ WWW:  http://www.tu-chemnitz.de/~row/                  \                     /
  \ Talk: rwa AT goliath DOT csn DOT tu-chemnitz DOT de    \ Protect penguins. /
   \ PGP key available                                      \                 /

--------------------- snip ---------------------------------

*******************************************************************
version of binutils: 2.8.1.0.25
relevant gcc-options: -O6 -ffast-math -mamdk6
*******************************************************************

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        : 109.74           : 2.81        : 0.92
STRING SORT         : 12.935           : 5.78        : 0.89
BITFIELD            : 2.2187e+07       : 3.81        : 0.79
FP EMULATION        : 6.68             : 3.21        : 0.74
FOURIER             : 1234             : 1.40        : 0.79
ASSIGNMENT          : 0.95603          : 3.64        : 0.94
IDEA                : 213.04           : 3.26        : 0.97
HUFFMAN             : 95.822           : 2.66        : 0.85
NEURAL NET          : 1.0866           : 1.75        : 0.73
LU DECOMPOSITION    : 23.895           : 1.24        : 0.89
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 3.485
FLOATING-POINT INDEX: 1.447
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler          : gcc version pgcc-2.90.27 980315 (egcs-1.0.2 release)
libc                : libc.so.5.4.44
MEMORY INDEX        : 0.875
INTEGER INDEX       : 0.866
FLOATING-POINT INDEX: 0.803
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

*******************************************************************
version of binutils: 2.8.1.0.25
relevant gcc-options: -O6 -ffast-math -mamdk6 -malign-loops=2
*******************************************************************

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        : 106.67           : 2.74        : 0.90
STRING SORT         : 13.081           : 5.85        : 0.90
BITFIELD            : 1.9335e+07       : 3.32        : 0.69
FP EMULATION        : 6.5312           : 3.13        : 0.72
FOURIER             : 1228.8           : 1.40        : 0.78
ASSIGNMENT          : 1.0074           : 3.83        : 0.99
IDEA                : 206.76           : 3.16        : 0.94
HUFFMAN             : 91.71            : 2.54        : 0.81
NEURAL NET          : 1.1368           : 1.83        : 0.77
LU DECOMPOSITION    : 25.049           : 1.30        : 0.94
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 3.388
FLOATING-POINT INDEX: 1.491
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler          : gcc version pgcc-2.90.27 980315 (egcs-1.0.2 release)
libc                : libc.so.5.4.44
MEMORY INDEX        : 0.854
INTEGER INDEX       : 0.839
FLOATING-POINT INDEX: 0.827
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

*******************************************************************
version of binutils: 2.9.0.2
relevant gcc-options: -O6 -ffast-math -mamdk6
*******************************************************************

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        : 101.75           : 2.61        : 0.86
STRING SORT         : 13.226           : 5.91        : 0.91
BITFIELD            : 1.7581e+07       : 3.02        : 0.63
FP EMULATION        : 6.6823           : 3.21        : 0.74
FOURIER             : 1247.5           : 1.42        : 0.80
ASSIGNMENT          : 0.96749          : 3.68        : 0.95
IDEA                : 205.05           : 3.14        : 0.93
HUFFMAN             : 93.914           : 2.60        : 0.83
NEURAL NET          : 1.1407           : 1.83        : 0.77
LU DECOMPOSITION    : 25.862           : 1.34        : 0.97
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 3.324
FLOATING-POINT INDEX: 1.516
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler          : gcc version pgcc-2.90.27 980315 (egcs-1.0.2 release)
libc                : libc.so.5.4.44
MEMORY INDEX        : 0.819
INTEGER INDEX       : 0.837
FLOATING-POINT INDEX: 0.841
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

*******************************************************************
version of binutils: 2.9.0.2
relevant gcc-options: -O6 -ffast-math -mamdk6 -malign-loops=2
*******************************************************************

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        : 107.46           : 2.76        : 0.91
STRING SORT         : 13.338           : 5.96        : 0.92
BITFIELD            : 2.2182e+07       : 3.81        : 0.79
FP EMULATION        : 6.5312           : 3.13        : 0.72
FOURIER             : 1221.2           : 1.39        : 0.78
ASSIGNMENT          : 0.9992           : 3.80        : 0.99
IDEA                : 205.86           : 3.15        : 0.93
HUFFMAN             : 91.912           : 2.55        : 0.81
NEURAL NET          : 1.1368           : 1.83        : 0.77
LU DECOMPOSITION    : 25.825           : 1.34        : 0.97
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 3.463
FLOATING-POINT INDEX: 1.503
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler          : gcc version pgcc-2.90.27 980315 (egcs-1.0.2 release)
libc                : libc.so.5.4.44
MEMORY INDEX        : 0.898
INTEGER INDEX       : 0.840
FLOATING-POINT INDEX: 0.833
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
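
One way to read the four reports side by side is to express each run's
iterations/sec as a percentage change against the first report (binutils
2.8.1.0.25, default loop alignment). The following is a minimal Python
sketch and not part of the original message. The figures are copied from
the nbench reports above; the names RESULTS, CONFIGS, percent_change and
changes_vs_baseline are made up for illustration.

```python
# Iterations/sec copied from the four nbench reports in the message.
# Column order follows CONFIGS below; the first column is the baseline.
CONFIGS = (
    "2.8.1.0.25 default",
    "2.8.1.0.25 -malign-loops=2",
    "2.9.0.2 default",
    "2.9.0.2 -malign-loops=2",
)

RESULTS = {
    "NUMERIC SORT":     (109.74,     106.67,     101.75,     107.46),
    "STRING SORT":      (12.935,     13.081,     13.226,     13.338),
    "BITFIELD":         (2.2187e+07, 1.9335e+07, 1.7581e+07, 2.2182e+07),
    "FP EMULATION":     (6.68,       6.5312,     6.6823,     6.5312),
    "FOURIER":          (1234.0,     1228.8,     1247.5,     1221.2),
    "ASSIGNMENT":       (0.95603,    1.0074,     0.96749,    0.9992),
    "IDEA":             (213.04,     206.76,     205.05,     205.86),
    "HUFFMAN":          (95.822,     91.71,      93.914,     91.912),
    "NEURAL NET":       (1.0866,     1.1368,     1.1407,     1.1368),
    "LU DECOMPOSITION": (23.895,     25.049,     25.862,     25.825),
}


def percent_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def changes_vs_baseline(results):
    """Map each test to its percent change for the three non-baseline runs."""
    table = {}
    for test, rates in results.items():
        base = rates[0]
        table[test] = tuple(percent_change(base, r) for r in rates[1:])
    return table


if __name__ == "__main__":
    changes = changes_vs_baseline(RESULTS)
    print("percent change vs. " + CONFIGS[0])
    for test, deltas in changes.items():
        cells = "  ".join(f"{d:+7.2f}%" for d in deltas)
        print(f"{test:<18}{cells}")
```

Read this way, BITFIELD stands out: it drops by roughly a fifth with
binutils 2.9.0.2 at default alignment and returns to within a fraction of
a percent of the baseline with -malign-loops=2. Differences of a percent
or two may well be run-to-run noise; the message does not say how many
runs were averaged.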