Message-ID: <32BDBC44.49C6@pobox.oleane.com>
Date: Sun, 22 Dec 1996 23:55:00 +0100
From: Francois Charton
Organization: CCMSA
MIME-Version: 1.0
To: djgpp AT delorie DOT com
CC: Eli Zaretskii
Subject: Re: Is DJGPP that efficient?
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Eli Zaretskii wrote:
>
> On Fri, 20 Dec 1996, Francois Charton wrote:
>
> > and by
> > x2=x*x;
> > co=1.0+x2*(-0.4999999963 + x2*(0.0416666418 + x2*(-0.0013888397 +
> > x2*(0.0000247609 - x2*0.0000002605))));
>
> Any serious general-purpose fp code cannot assume that the
> argument is between 0 and PI/2, and most of the time of the library
> functions is spent in the so-called argument reduction process (which
> brings the argument to a narrow region around 0 where a simple
> approximation can be used).

Yes. I suggested this function as a way to speed up an application which
*really* needs it, not as a general-purpose replacement for the cos()
function (which is good enough as it is). However, the [0,PI/2]
restriction (actually [-PI/2,PI/2], as the cosine is an even function)
is not as silly as, say, a square root which would stop at 5... In many
applications it is fairly easy to rewrite your code so that it
guarantees the argument stays within those bounds. This puts an extra
burden on the programmer, but the gain in speed can be worth the bother.

Finally, while my first example was just that, an example, the second
one is a serious formula, and it is very easy to extend it to a larger
domain: here is a "valid everywhere" cos() function which, on my 486DX4,
compiled with DJGPP at -O3, runs about 10-15% faster than the libc and
libm cos() functions.
#define MYIPI_S2 0.6366197724   /* 2/PI  */
#define MYPI_S2  1.57079632679  /* PI/2  */

double mycos(double f)
{
  double f2;
  double co;
  int i1;

  /* Reduce the argument to [-PI/2, PI/2]: find the nearest even
     multiple of PI/2 and subtract it, rounding i1 away from zero so
     that negative arguments are reduced correctly too. */
  i1 = (int)(f*MYIPI_S2);
  if (i1 & 1)
    i1 += (f < 0.0) ? -1 : 1;
  f -= i1*MYPI_S2;

  f2 = f*f;
  co = 1.0 + f2*(-0.4999999963 + f2*(0.0416666418 + f2*(-0.0013888397 +
       f2*(0.0000247609 - f2*0.0000002605))));

  /* Quadrants 2 and 3 flip the sign. */
  return (i1 & 2) ? -co : co;
}

BTW, I noticed one funny thing while working on this example: if I use
floats (32-bit, lower precision, better aligned...) instead of doubles,
mycos() runs *slower*. It seems to be due to the FPU, which loses time
converting floats to ints. Is this DJGPP-specific, or common to any
"Intel Inside" (you have been warned) machine?

Regards, and Joyeux Noel,
Francois