From: Philip S Galbiati Newsgroups: comp.os.msdos.djgpp Subject: Re: Fixed Point (Optimization) Date: Mon, 6 Jan 1997 13:28:52 -0800 Organization: Tektronix, Inc, Beaverton, OR, USA Lines: 131 Message-ID: References: <32cd6b2c DOT 4726585 AT nntp DOT southeast DOT net> NNTP-Posting-Host: mdjunior.cse.tek.com Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII In-Reply-To: <32cd6b2c.4726585@nntp.southeast.net> To: djgpp AT delorie DOT com DJ-Gateway: from newsgroup comp.os.msdos.djgpp On Fri, 3 Jan 1997, Murray Stokely wrote: > > Is Fixed-Point math still a neccesity for today's computers with > all their built it FPU's, etc? How much faster can it be? Anyway, > this is the same code procedure from our long thread on optimization; > I've recently come back to it, because I still see room for > improvement. What can be done to it. Fixed point? Lookup tables? Same things as before -- reuse calculations, and tighten up your inner loops. The calculations you do for the upper RH quadrant (x>0, y>0) of your lens have an awful lot in common with the calculations for the other 3 quadrants, but you start over from scratch. If you were to take advantage of this commonality, you should be able to speed up the total calculation by almost a factor of 4. > void calculate_tfm(int diameter, char magnification) > { > int a,b,x,y,z,s,x2,y2; > int radius,r2; > z=0; > radius=diameter/2; These next two statements don't look terribly efficient... > s=round(sqrt(abs((radius-magnification) * > (radius+magnification)))); > s=s*s; ...but since they are outside BOTH loops, it probably doesn't cost much. > r2=radius*radius; > for(y=-radius;y { > y2=y*y; > for(x=-radius;x { > x2=x*x; > if ((x2 + y2) >= s) ^^^^^^^ You can reuse this value in your calculation of z -- see below. > { > a=x; > b=y; > } else { > z=round(sqrt(r2-x2-y2)); z = round (sqrt (r2 - (x2 + y2))); OR z = round (sqrt ((r2 - y2) - x2)); will allow CSE elimination to use the previously calculated value, thus saving a subtract inside the inner loop. Also, are you sure you need to round here? If not, that would save a function call inside the inner loop. > a=round(x*magnification/z); > b=round(y*magnification/z); ^^^^^^^^^^^^^^^^^^^^^^^^ This is a loop invariant for the inner (x) loop, but since there are function calls involved, it can't be optimized away (the compiler has no way of knowing that round() and sqrt() are side-effect free). Move it outside. > } > > tfm[(y+radius)*diameter+(x+radius)]=(b+radius)*diameter+(a+radius); ^^^^^^^^^^^^^^^^^^^ ^^^^^^ ^^^^^^^^^^^^^^^^^^^ ^^^^^^ Group the loop invariants for more CSE elimination: tfm[((y+radius)*diameter+ radius) + x]=((b+radius)*diameter+radius) + a); Here's where you can take advantage of the common calculations for the different quadrants. Adjust your loop bounds accordingly. tfm[((-y+radius)*diameter+radius)+x]=((-b+radius)*diameter+radius)+a); tfm[((y+radius)*diameter+radius)-x] =((b+radius)*diameter+radius)-a); tfm[((-y+radius)*diameter+radius)-x]=((-b+radius)*diameter+radius)-a); Bear in mind that these three calculations might be off a little, depending on how you are rounding, but you get the idea. > } // end of for x > } // end of for y > } Also, you might be able to speed things up by moving the tfm[] calculations INSIDE the if statement (although this would replicate some code): ... for (y=....) { for (x=....) { if ((x2 + y2) >= s) { tfm[...] = .... ... } else { tfm[...] = ... ... } } } or better still, calculate the value of x which causes the test to fail, and then have two loops in series (thereby eliminating the if statement altogether): ... for (y = ...) { xMid = sqrt (s - y2) or something; for (x=0; x < xMid; x++) { tfm[...] = ... ... } for (x = xMid; x < radius; x++) { tfm[...] = ... ... } } You will have to balance the cost of calculating xMid with the savings of doing so. Hope this helps. --Phil Galbiati ================================================= Any opinions expressed here reflect the ignorance of the author, NOT Tektronix. =================================================