Message-Id: <200005141011.GAA13116@delorie.com>
From: "Dieter Buerssner" <buers AT gmx DOT de>
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
Date: Sun, 14 May 2000 13:19:50 +0200
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Math functions
CC: djgpp-workers AT delorie DOT com
In-reply-to: <200005132121.RAA16244@indy.delorie.com>
References: <200005121903 DOT PAA29374 AT delorie DOT com> (buers AT gmx DOT de)
X-mailer: Pegasus Mail for Win32 (v3.12b)
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

On 13 May 00, at 17:21, Eli Zaretskii wrote:

> > From: "Dieter Buerssner" <buers AT gmx DOT de>
> > From my testing, the floating point functions built into the FPU are 
> > normally better than 2^-63 (when called with an argument correctly 
> > reduced to the supported range). 
> 
> What about functions that aren't built into the FPU?  They are the
> majority, IIRC.

I think, the majority directly depends on the intrinsic functions. 
These are the trig functions, that depend on correct argument 
reduction. I am doing this currently with about 168 bits of pi. I 
haven't found any errors up to arguments of 2^64*pi/2. After that, I 
am giving up. The functions will have max errors of about 2^-63.
I also have checked the first few thousand critical values around 
integer multiplies of pi/2, which all got the correct results up to 
the last bit for sin and cos and an max error of 2^-63 for tan. (Note 
that tanl() must never return Infinity, because the floating point 
argument will never be that close to n*pi/2.)

The hyperbolic functions depends only on expl and expm1l. The inverse 
hyperbolic functions depend on logl, log1pl and sqrtl. Sqrtl seems to 
be perfectly implemented in my FPU (not surprisingly). So these 
depend also (more or less) only on the FPU intrinsic functions plus 
argument reduction code for the exp* functions. In expl, and exp10l I 
use 16/32 additional bits for argument reduction, which seems enough 
for the whole range.

Some low level functions like ldexpl, frexpl, rintl are direct FPU 
upcodes. There seems not much left then ...

Powl I find extremely difficult. Perhaps Eric has a better solutions.

I have only "collected" the error, gamma and Bessel functions, and 
slightly modified them. The Bessel functions may only have a good 
absolute precision for small results, the others do quite well.

Here are a few snippets, of my log (including the problematic powl). 
I can sent to anybody interested a log of about 50 kB.

Regards, Dieter Buerssner

Test of sin versus qsin with 500000 random arguments
arg1 min -7.85398163397448309616E-1 max  7.85398163397448309616E-1 
sin was bigger  than qsin    6981 times
sin was smaller than qsin    6948 times
sin was the same as  qsin  486071 times
rms relative error      1.3553E-20 = 2^-66.00
average relative error  2.2277E-21 = 2^-68.60
max relative error      1.0842E-19 = 2^-63.00
    for arg -5.23621044557897090616E-1 erg -5.00019285360749671308E-1 
erg2 -5.00019285360749671362E-1

Test of sin versus qsin with 500000 random arguments
arg1 min  1.0E0 max  3.162277660168379332E9 exponentially scaled
sin was bigger  than qsin   40449 times
sin was smaller than qsin   41067 times
sin was the same as  qsin  418484 times
rms relative error      3.0900E-20 = 2^-64.81
average relative error  1.2243E-20 = 2^-66.15
max relative error      1.0842E-19 = 2^-63.00
    for arg  1.38298144722152674184E4 erg  5.00010684463143306321E-1 
erg2  5.00010684463143306267E-1

Test of sinh versus qsinh with 500000 random arguments
arg1 min  1.0E-4000 max  1.0E0 exponentially scaled
sinh was bigger  than qsinh     610 times
sinh was smaller than qsinh     227 times
sinh was the same as  qsinh  499163 times
rms relative error      3.6133E-21 = 2^-67.91
average relative error  1.4113E-22 = 2^-72.59
max relative error      2.2584E-19 = 2^-61.94
    for arg  6.86746645952087874822E-7 erg  6.86746645952141855646E-7 
erg2  6.86746645952141855491E-7

Test of exp10 versus qexp10 with 500000 random arguments
arg1 min -4.0E3 max  4.0E3 
exp10 was bigger  than qexp10   33381 times
exp10 was smaller than qexp10   31284 times
exp10 was the same as  qexp10  435335 times
rms relative error      2.7361E-20 = 2^-64.99
average relative error  9.7426E-21 = 2^-66.48
max relative error      1.0815E-19 = 2^-63.00
    for arg -1.56294666647140285654E3 erg  1.13066390630906302134E-
1563 erg2  1.13066390630906302121E-1563

Test of pow versus qpow with 500000 random arguments
arg1 min  1.0E-2 max  1.0E2 exponentially scaled
arg2 min -1.5E2 max  1.5E2 
pow was bigger  than qpow  110186 times
pow was smaller than qpow  135721 times
pow was the same as  qpow  254093 times
rms relative error      8.3855E-20 = 2^-63.37
average relative error  5.1264E-20 = 2^-64.08
max relative error      7.1855E-19 = 2^-60.27
    for arg  1.17620625640315757018E-2 arg2  1.46538342215057502002E2 
erg  1.78586504349530754355E-283 erg2  1.78586504349530754483E-283

Test of pow versus qpow with 500000 random arguments
arg1 min  1.0E-3 max  1.0E3 exponentially scaled
arg2 min -1.5E3 max  1.5E3 
pow was bigger  than qpow  204107 times
pow was smaller than qpow  214094 times
pow was the same as  qpow   81799 times
rms relative error      7.4146E-19 = 2^-60.23
average relative error  4.4546E-19 = 2^-60.96
max relative error      7.1106E-18 = 2^-56.96
    for arg  3.12092399496075727540E1 arg2  1.28488698656671138765E3 
erg  9.66113111451765511525E1919 erg2  9.66113111451765504655E1919

Test of erf versus qerf with 500000 random arguments
arg1 min -1.0E1 max  1.0E1 
erf was bigger  than qerf   48623 times
erf was smaller than qerf   49169 times
erf was the same as  qerf  402208 times
rms relative error      2.8503E-20 = 2^-64.93
average relative error  1.2154E-20 = 2^-66.16
max relative error      2.1655E-19 = 2^-62.00
    for arg  1.11387508305152661973E-1 erg  1.25169464576545015264E-1 
erg2  1.25169464576545015237E-1

Test of jn versus qjn with 50000 random arguments
arg1 min -2.0E1 max  2.0E1 
arg2 min -8.0E0 max  8.0E0 
rel max  5.2485E-13 = 2^-40.79
    for arg -2.00000000000000000000E0 arg2 -6.66897084846697656752E-4 erg  5.55939631616394688840E-8 erg2  5.55939631616686475606E-8
jn was bigger  than qjn   25131 times
jn was smaller than qjn   17374 times
jn was the same as  qjn    7495 times
rms relative error      2.3653E-15 = 2^-48.59
average relative error  1.2789E-17 = 2^-56.12
max relative error      5.2485E-13 = 2^-40.79
    for arg -2.00000000000000000000E0 arg2 -6.66897084846697656752E-4 
erg  5.55939631616394688840E-8 erg2  5.55939631616686475606E-8
rms absolute error      7.9256E-20 = 2^-63.45
average absolute error  2.9786E-20 = 2^-64.86
max absolute error      7.0473E-19 = 2^-60.30
    for arg -3.00000000000000000000E0 arg2  4.04837377974039736711E0 
erg -4.31945023521206303751E-1 erg2 -4.31945023521206303047E-1