To: djgpp AT delorie DOT com
Subject: Re: Loop unrolling: Don't bother
Message-ID: <19970301.095133.4935.1.chambersb@juno.com>
References: <5f5knj$cho AT freenet-news DOT carleton DOT ca>
    <5f61hc$nkg AT flex DOT uunet DOT pipex DOT com>
    <5f8ii9$cia AT freenet-news DOT carleton DOT ca>
From: chambersb AT juno DOT com (Benjamin D Chambers)
Date: Sat, 01 Mar 1997 12:48:37 EST

On 1 Mar 1997 06:32:41 GMT ao950 AT FreeNet DOT Carleton DOT CA (Paul
Derbyshire) writes:
>I was thinking of the jump, plus the test for end of loop.
>As for caching, I'm looking to improve speed on a 486. Yes, one
>of those pre-Pentium dinosaurs that are dwindling in population and
>are on the endangered species list along with such uncommon
>specimens as Amigas and Macintoshes and the 8-bit Nintendo, but are
>not yet extinct. Some end users of my program (not to mention the
>developer ;)) might have such oddball museum-pieces laying around,
>with their lack of caches, and lack of pipelining, and lack of
>certain machine instructions...;) I figger on any reasonably recent
>Pentium, the speed of the program will be limited by the builtin
>"brake" limiting it to twenty main loops (different loop! the loops
>being unrolled are run in their entirety every main loop) more than
>by any caching, lack thereof, or by those loops.

You're not the only one with a 486 :)

Just a few points, though:

Both the 486 and the Pentium have on-chip cache: 8K unified on the
486, and separate 8K code and 8K data caches on the Pentium.

The 486 also implements pipelining, just not to the extent that the
Pentium does (i.e. the execution unit on the 486 only processes one
instruction at a time, while the Pentium, having two integer
pipelines, can do two).

The 486 can run any software (MMX included) that the Pentium can. In
fact, so can the 386. You just need to trap the invalid opcode
exception (interrupt 6) and emulate the instruction in software
(similar to an FPU emulator). Not very practical, but it CAN be done
(although the only MMX emulators are likely to be written for the
Pentium, rather than the 386/486).

...Chambers
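
P.S. To make the trap-and-emulate idea a bit more concrete, here is a rough sketch in C of what the software side might look like for a single MMX instruction. It is only an illustration, not a working emulator: the names mmx_state and emu_mmx_step are made up for this example, it handles only the register-to-register form of PADDB (opcode bytes 0F FC /r), and it assumes some invalid-opcode handler (not shown, and very much environment-dependent) has already handed it a pointer to the faulting instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulator state: eight 64-bit MMX registers (MM0..MM7),
   each stored as eight bytes. */
typedef struct {
    uint8_t mm[8][8];
} mmx_state;

/* Try to emulate the instruction starting at `code`.
   Only "PADDB mm, mm" is recognized: bytes 0F FC followed by a ModRM
   byte whose mod field is 3 (register operand).
   Returns the instruction length in bytes, so that a trap handler could
   advance the saved instruction pointer past it, or 0 if the bytes are
   not something this sketch knows how to emulate. */
size_t emu_mmx_step(mmx_state *st, const uint8_t *code)
{
    assert(st != NULL && code != NULL);

    if (code[0] != 0x0F || code[1] != 0xFC)
        return 0;                       /* not PADDB */

    uint8_t modrm = code[2];
    if ((modrm >> 6) != 3)
        return 0;                       /* memory operands not handled here */

    unsigned dst = (modrm >> 3) & 7;    /* ModRM.reg selects the destination */
    unsigned src = modrm & 7;           /* ModRM.rm selects the source */

    /* PADDB: add the eight byte lanes independently, wrapping modulo 256. */
    for (int i = 0; i < 8; i++)
        st->mm[dst][i] = (uint8_t)(st->mm[dst][i] + st->mm[src][i]);

    return 3;                           /* 0F FC + ModRM */
}
```

A real handler would call something like this from the invalid opcode exception, skip the emulated instruction on success, and pass the fault along otherwise. Every emulated instruction then costs a full exception round trip, which is why I called it not very practical.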