From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: [long] gcc performance and possible bug
Date: 8 Mar 2000 18:25:35 GMT
Lines: 120
Message-ID: <8a65uu$39fkt$1@fu-berlin.de>
References:
NNTP-Posting-Host: pec-1-96.tnt1.s2.uunet.de (149.225.1.96)
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: fu-berlin.de 952539935 3456669 149.225.1.96 (16 [17104])
X-Posting-Agent: Hamster/1.3.13.0
User-Agent: Xnews/03.02.04
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:

>Did you look at the generated assembly? That could provide important
>clues.

I changed my source slightly to make the difference even more obvious.
This is the context diff of the gcc -O2 -S output of the two versions.

*** const.s     Wed Mar  8 19:03:10 2000
--- nonconst.s  Wed Mar  8 19:04:52 2000
***************
*** 99,108 ****
  _zseed:
        .long -2023406815
        .long 305419896
- .text
        .p2align 2
  _mul.12:
        .long 999996864
        .p2align 2
  .globl _mwc32
  _mwc32:
--- 99,108 ----
  _zseed:
        .long -2023406815
        .long 305419896
        .p2align 2
  _mul.12:
        .long 999996864
+ .text
        .p2align 2
  .globl _mwc32
  _mwc32:

The only difference in the assembly is that the variable mul (_mul.12)
is in the text segment for the const version and in the data segment
otherwise (as you would expect). Yet the const version is much slower
(a factor of ten!).

To rule out a (hardware) problem with my system: could anybody please
try to reproduce my results by compiling the following program with
gcc -O2 mwc32tst.c and running a.exe, then uncommenting the const close
to the end of the listing, recompiling, and rerunning? Please post or
mail your results, ideally including your processor and your versions
of gcc and binutils (I have an AMD K6-2 and tried various versions of
gcc and binutils, including gcc 2.95.2 and binutils 2.9.5).

Regards,
Dieter

/* mwc32tst.c */
#include <stdio.h>
#include <time.h>

#define CALLS (1UL << 27) /* Tune this as appropriate */

/* Call function pointed to by tr n times */
unsigned long speed_loop(unsigned long (*tr)(void), unsigned long n)
{
  unsigned long s;
  s = 0;
  do
    s += tr();
  while (--n != 0);
  return s;
}

/* avoid inlining of these functions */
unsigned long dum_rand(void);
unsigned long mwc32(void);

/* test the speed of function mwc32, take function call
   and loop overhead into account */
int main(void)
{
  clock_t anf, anfdum;
  unsigned long s, n = CALLS;

  anfdum = clock();
  speed_loop(dum_rand, n);
  anfdum = clock() - anfdum;
  anf = clock();
  s = speed_loop(mwc32, n);
  anf = clock() - anf;
  anf -= anfdum;
  printf("s=%lu, used %.5f usec/call (w.o call overhead)\n",
         s, 1e6/n*(double)anf/CLOCKS_PER_SEC);
  return 0;
}

unsigned long dum_rand(void)
{
  return 0UL;
}

typedef unsigned long long ul64;

static ul64 zseed = ((ul64)0x12345678UL<<32) | 0x87654321UL;

/* Multiply with carry RNG */
unsigned long mwc32(void)
{
  unsigned long l1, l2;
  ul64 res;
  /* Uncommenting the const can make this function much slower,
     depending on compiler switches and the phase of the moon :-) */
  static /* const */ unsigned long mul=999996864UL;

  l1 = (unsigned long)(zseed & 0xffffffffUL);
  l2 = zseed>>32;
  res = l2+l1*(ul64)mul;
  zseed = res;
  return (unsigned long)(res & 0xffffffffUL);
}
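
For anyone who wants to check the arithmetic of mwc32 separately from the timing question, here is a minimal sketch of a single multiply-with-carry step. It is not part of the test program above. The name mwc32_step, the macro MWC_MUL and the explicit state pointer are made up for this illustration, and the fixed-width types from <stdint.h> are an assumption on my part, used so that the step behaves the same where unsigned long is wider than 32 bits.

```c
#include <stdint.h>

/* Same multiplier as mul in the listing; the macro name is hypothetical. */
#define MWC_MUL UINT32_C(999996864)

/* One multiply-with-carry step (illustrative helper, not from the
   original program). The low 32 bits of *state hold the current value,
   the high 32 bits hold the carry. */
static uint32_t mwc32_step(uint64_t *state)
{
    uint32_t x     = (uint32_t)(*state & 0xffffffffu); /* l1 in the listing */
    uint32_t carry = (uint32_t)(*state >> 32);         /* l2 in the listing */
    uint64_t res   = (uint64_t)x * MWC_MUL + carry;    /* cannot overflow 64 bits */

    *state = res;          /* new value in the low half, new carry in the high half */
    return (uint32_t)res;  /* the low 32 bits are the output */
}
```

Starting from a state with value 1 and carry 2, one step gives 1*999996864 + 2 = 999996866, which fits in 32 bits, so the new carry is 0.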