Date: Mon, 6 Jul 1998 19:19:36 +0200 (CEST)
From: Andrea Arcangeli
To: Tuukka Toivonen
cc: Linux Programming, linuxprog AT geeky1 DOT ebtech DOT net, beastium-list
Subject: Re: passing args in regs speed (was:something else)

On Mon, 6 Jul 1998, Tuukka Toivonen wrote:

>Test program: bzip2 0.1pl2
>
>I added function prototypes for all functions in the program
>(and removed those already existing). I told the compiler
>to use different amount of register parameters and then
>compiled the program and measured how long it took to
>compress uncompressed LyX 0.12.0 source tar file (7997440 bytes)
>to /dev/null.

Nice!

>My test system: Pentium 120 MHz, 24 MB main memory, 32 MB
>swap, Linux 2.0.34, gcc version 2.7.2. There were no other
>active programs background eating CPU-time, but the
>hard disk rotated few times showing that not everything
>fit in the disk cache.

OK. Don't worry about the cache: in the real world not everything is
in the cache anyway, but in an ideal world the kernel would also be
compiled with -mregparm=3 ;-).

>The tests show no significant speedup until I use all
>3 registers, in which case it's about 6% faster.

Cool!

>Question: why gcc doesn't allow more than 3 registers
>to be used?? x86 would have 7 or at least 6 free registers.

I think that you can use only the registers that gcc doesn't save
across calls (eax/edx and ?!?)...

>Each case first shows the used compiler flags, and then
>the test run was made 4 times. The times are in real-time
>seconds (measured using my own program using RDTSC instruction)
>The last number is length of the stripped ELF executable
>(so case 4 gives smallest executables).
>
>Patch for bzip and some more information is in file
>http://www.ee.oulu.fi/~tuukkat/regpass-test.tar.gz

Good! Remember to put in an #ifdef __i386__ (I have not read the patch
though).

>Considerations:
>- All libc calls used conventional stack parameter passing
>  convention. This could be changed by breaking compatibility.
>- Why kernel doesn't use register parameters?? It would be
>  ideal since it wouldn't break compatibility!

It breaks the asm functions in the arch-specific code :-(. I just spent
some hours trying to compile with -mregparm=1. Also, the kernel doesn't
compile at all with -mregparm=3, since sometimes it needs registers...
I could spend some more time on this though (also given the great
improvement you got with bzip!).

>( I'm CCing this to pgcc list since I think those people
>could be interested; maybe they could implement automatic
>register passing for static functions?)

The egcs/gcc people could also be interested, but I think they are
already aware of this. Anyway, the problem is still backwards
compatibility. The only piece of code that could use -mregparm=3
without major problems is the kernel.

Andrea[s] Arcangeli
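
P.S. A minimal sketch of register passing for a single function, as opposed to the whole-program -mregparm=3 flag discussed above. GCC on 32-bit x86 also accepts a per-function regparm attribute, which its documentation describes as passing up to three integer arguments in EAX, EDX and ECX. The names REGPARM3 and add3 are hypothetical and only for illustration; they are not taken from the bzip2 patch. The guard follows the #ifdef __i386__ advice above.

```c
/* REGPARM3 is a hypothetical helper macro (not from the bzip2 patch). */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3 /* other targets: keep the default calling convention */
#endif

/* The attribute goes on the prototype so that every caller sees it
   and agrees with the callee on where the arguments live. */
REGPARM3 int add3(int a, int b, int c);

/* On i386 with GCC, a, b and c are expected to arrive in registers
   instead of on the stack; elsewhere this is an ordinary function. */
REGPARM3 int add3(int a, int b, int c)
{
    return a + b + c;
}
```

The result is the same under either convention; only the way the arguments travel changes. Compiling for i386 with gcc -O2 -S should show the call site loading registers instead of pushing the arguments, though I have not checked this against gcc 2.7.2.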