From: ao950 AT FreeNet DOT Carleton DOT CA (Paul Derbyshire)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Loop unrolling: Don't bother
Date: 1 Mar 1997 06:36:49 GMT
Organization: The National Capital FreeNet
Lines: 43
Message-ID: <5f8iq1$cor@freenet-news.carleton.ca>
References: <5f5knj$cho AT freenet-news DOT carleton DOT ca> <5f61hc$nkg AT flex DOT uunet DOT pipex DOT com> <331702EF DOT 8EA AT rpi DOT edu>
Reply-To: ao950 AT FreeNet DOT Carleton DOT CA (Paul Derbyshire)
NNTP-Posting-Host: freenet2.carleton.ca
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Brian Osman (osmanb AT rpi DOT edu) writes:
> nikki wrote:
>>
>> hardly a great surprise seeing as the loop above would quite probably fit in
>> the cache when well optimised, but unrolled would thrash it horribly.
>> unrolling loops won't save an enormous amount of time, after all a jump
>> instruction will only take you 3 or 4 cycles at most.
>>
>> nik
>>
>> --
>> Graham Tootell
>> nikki AT gameboutique DOT com
>
> Bear in mind that in many of the newer processes (ie PPro) which
> use predictive branching, branches are one of the single worst
> instructions. A mispredicted branch means that all of the pipeline,
> and the cache has to be invalidated and flushed. Not pretty.
> There are some cases where loop unrolling won't help much, but
> it's still a valid and useful optimization technique. I don't
> suppose -O3 is causing any unrolling? :)

No, -O3 does inlining but not unrolling. And I don't trust -funroll-loops
or -funroll-all-loops, because they might unroll loops with
run-time-determined numbers of executions that will die very horribly if
they run a few times more than they are supposed to, i.e. they will access
a mallocked array out of its bounds or something. So, I did it manually.

#define XLOOP stuff in the innermost loop
#define YLOOP outer loop stuff and about twenty of XLOOP;XLOOP;XLOOP etc.
#define ZLOOP about 14 of YLOOP; YLOOP etc.

(All of the macros are written to be "function-like": each leaves the
semicolon off its last statement, so that ZLOOP; etc. is correct syntax
once expanded.)

-- 
    .*.     Where feelings are concerned, answers are rarely simple [GeneDeWeese]
 -()  <  When I go to the theater, I always go straight to the "bag and mix"
    `*'  bulk candy section...because variety is the spice of life... [me]
Paul Derbyshire ao950 AT freenet DOT carleton DOT ca, http://chat.carleton.ca/~pderbysh