www.delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1997/01/06/21:32:18

From: Philip S Galbiati <galbiati AT mdhost DOT cse DOT tek DOT com>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Fixed Point (Optimization)
Date: Mon, 6 Jan 1997 13:28:52 -0800
Organization: Tektronix, Inc, Beaverton, OR, USA
Lines: 131
Message-ID: <Pine.SUN.3.95.970106121541.15116A-100000@mdjunior>
References: <32cd6b2c DOT 4726585 AT nntp DOT southeast DOT net>
NNTP-Posting-Host: mdjunior.cse.tek.com
Mime-Version: 1.0
In-Reply-To: <32cd6b2c.4726585@nntp.southeast.net>
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

On Fri, 3 Jan 1997, Murray Stokely wrote:
>
>      Is Fixed-Point math still a neccesity for today's computers with
> all their built it FPU's, etc?  How much faster can it be?  Anyway,
> this is the same code procedure from our long thread on optimization;
> I've recently come back to it, because I still see room for
> improvement.  What can be done to it.  Fixed point?  Lookup tables?

Same things as before -- reuse calculations, and tighten up your inner
loops.

The calculations you do for the upper RH quadrant (x>0, y>0) of your lens
have an awful lot in common with the calculations for the other 3
quadrants, but you start over from scratch.  If you were to take advantage
of this commonality, you should be able to speed up the total calculation
by almost a factor of 4.

> void calculate_tfm(int diameter, char magnification)
> {
>         int a,b,x,y,z,s,x2,y2;
>         int radius,r2;
>         z=0;
>         radius=diameter/2;

These next two statements don't look terribly efficient...

>         s=round(sqrt(abs((radius-magnification) *
> (radius+magnification))));
>         s=s*s;

...but since they are outside BOTH loops, it probably doesn't cost much.

>         r2=radius*radius;
>         for(y=-radius;y<radius;y++)
>         {
>                 y2=y*y;
>                 for(x=-radius;x<radius;x++)
>                 {
>                         x2=x*x;
>                         if ((x2 + y2) >= s)
                               ^^^^^^^
You can reuse this value in your calculation of z -- see below.

>                         {
>                                 a=x;
>                                 b=y;
>                         } else {
>                                 z=round(sqrt(r2-x2-y2));
                                  z = round (sqrt (r2 - (x2 + y2)));
OR
                                  z = round (sqrt ((r2 - y2) - x2));
will allow CSE elimination to use the previously calculated value, thus
saving a subtract inside the inner loop.

Also, are you sure you need to round here?  If not, that would save a
function call inside the inner loop. 

>                                 a=round(x*magnification/z);
>                                 b=round(y*magnification/z);
                                    ^^^^^^^^^^^^^^^^^^^^^^^^
This is a loop invariant for the inner (x) loop, but since there are
function calls involved, it can't be optimized away (the compiler has no 
way of knowing that round() and sqrt() are side-effect free).  Move it
outside.

>                         }
> 
> tfm[(y+radius)*diameter+(x+radius)]=(b+radius)*diameter+(a+radius);
      ^^^^^^^^^^^^^^^^^^^    ^^^^^^   ^^^^^^^^^^^^^^^^^^^    ^^^^^^
Group the loop invariants for more CSE elimination:
  tfm[((y+radius)*diameter+ radius) + x]=((b+radius)*diameter+radius) + a);

Here's where you can take advantage of the common calculations for the
different quadrants.  Adjust your loop bounds accordingly.
  tfm[((-y+radius)*diameter+radius)+x]=((-b+radius)*diameter+radius)+a);
  tfm[((y+radius)*diameter+radius)-x] =((b+radius)*diameter+radius)-a);
  tfm[((-y+radius)*diameter+radius)-x]=((-b+radius)*diameter+radius)-a);

Bear in mind that these three calculations might be off a little,
depending on how you are rounding, but you get the idea.

>                 } // end of for x
>         } // end of for y
> }

Also, you might be able to speed things up by moving the tfm[]
calculations INSIDE the if statement (although this would replicate some
code):
   ...
   for (y=....) {
      for (x=....) {
         if ((x2 + y2) >= s) {
            tfm[...] = ....
            ...
         }
         else {
            tfm[...] = ...
            ...
         }
      }
   }

or better still, calculate the value of x which causes the test to fail,
and then have two loops in series (thereby eliminating the if statement
altogether):

   ...
   for (y = ...) {
      xMid = sqrt (s - y2) or something;
      for (x=0; x < xMid; x++) {
         tfm[...] = ...
         ...
      }
      for (x = xMid; x < radius; x++) {
         tfm[...] = ...
         ...
      }
   }

You will have to balance the cost of calculating xMid with the savings of
doing so.

Hope this helps.

--Phil Galbiati
=================================================
     Any opinions expressed here reflect the
     ignorance of the author, NOT Tektronix.
=================================================

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019