From: Hartmut Schirmer Newsgroups: comp.os.msdos.djgpp Subject: Re: Optimization Date: Thu, 28 Nov 1996 10:49:06 +0100 Organization: Technische Fakultaet, University of Kiel, Germany Lines: 61 Message-ID: <329D6012.D0E@techfak.uni-kiel.de> References: NNTP-Posting-Host: zora.techfak.uni-kiel.de Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit To: Glen Miner DJ-Gateway: from newsgroup comp.os.msdos.djgpp Glen Miner wrote: > > > : I am currently working on a program that grinds away in a recursive > > : algorithm tens of thousands of times over a small group of local data. It > > : doesn't do _any_ io, and only calls a select few functions. Due to the > > : immense number of calculations required, this is painfully slow. I've > > : optimized the algorithm a fair amount, and plan to improve it more, but > > : that is only one part of the optimization process. > > > > Removing the recursion may improve the performance. [...snip...] > > I don't think that removing the recursion is a viable solution, and I > doubt that it would improve the algorithm's speed any: It's actually doing > a search of a very wide "tree" of "data". (Sorta like a chess move search > engine). > > The code in question makes no standard library calls, and restricts itself > to a small block of global data. There are maybe half a dozen small > functions that are used while it recurses, but they are all "home made" > and fairly well optimized. No function used requires more then a couple > of bytes of information, so there isn't much in the way of large memory > movement to worry about. Keep all these functions in the same file as your main recursion loop, mark them as 'static inline ...' and put a declaration of all these functions before any real code. GCC will happily insert them into the calling function if possible. (You may need to check wich inline operations will speedup the code and there may be some that will slow down it due to cache misses etc.) Since you won't need to debug, -fomit-frame-pointer may speed up your code by giving GCC another register. > How well will djgpp deal with two dimentional arrays? Say I'm stepping > through char MyArray[9][9] like so: > > for (x = 0; x < 6; x++) > for (y = 0; y < 7; y++) > MyArray [x][y] = SomeValue; On a P133: MyArray GCC Options P-133 486-33 char -O3 1005 7600 char -O6 -fomit-frame-pointer -funroll-loops 975 3000 short -O6 -fomit-frame-pointer -funroll-loops 925 3450 int -O6 -fomit-frame-pointer -funroll-loops 905 2600 -> Use int and -funroll-loops in such a case ! Note: GCC optimizes for 486 There's a bug in GCC: -fstrength-reduce is turned of by default. You may enable it for this special source file and check the results ... Hope it helps, Hartmut -- Hartmut Schirmer | Phone: +49-431-77572-709 FAX:-703 Automatisierungs- & Regelungstech. | hsc AT techfak DOT uni-kiel DOT de Technische Fakult"at, | http://www.techfak.uni-kiel.de/~hsc Kaiserstr. 2, 24143 Kiel, Germany | PGP key via WWW, Key ID:6D84AEC1