Mail Archives: djgpp/1993/04/26/23:26:09

Date: 27 Apr 1993 13:13:39 +1100

From: Bill Metzenthen <APM233M AT vaxc DOT cc DOT monash DOT edu DOT au>

Subject: Re: Bug in float operation

To: djgpp AT sun DOT soe DOT clarkson DOT edu

Dr. Valery Fine  (IN%"FINE AT main2 DOT jinr DOT dubna DOT su") writes:

>  Hi Netters,
>
>  I have tried the short fortran code pointed below:
>
>
>        program ttt
>        XMIP= .5895782470703125E+03
>
>        TXMIP =XMIP
>        TXMIP2=XMIP*XMIP
>
>        TXM=TXMIP*TXMIP
>
>        TXMERR=TXMIP2 - TXMIP*TXMIP
>        TXMER2=txmip2 - txm

[stuff deleted]

>   It is clear that multiply and subtract operations have been done by
>co-processor with a higher accuracy that one needs, but I cannot figure out
>how to work around this situation along big code I am trying to port
>(I mean CERN GEANT simulation code - about 200 thousand fortran statements).

Your code requires two floating point numbers to be equal. This is
unwise in any programming language, and especially so in the 'C'
language.

There are two problems which affect your code:
(1) 'C' floating point expressions are normally evaluated at double
    precision. To quote from Kernighan and Ritchie (1978 edition):
    "All floating arithmetic in C is carried out in double-precision:
    whenever a float appears in an expression it is lengthened to double
    by zero-padding its fraction. When a double must be converted to float,
    for example by an assignment, the double is rounded before truncation
    to float length".
(2) I believe that go32 runs the coprocessor at full precision (a 64-bit
    significand rather than the 53-bit significand of a double). Therefore
    many sub-expressions may be evaluated at even higher precision than
    double. (Note also that the current 387 emulators for djgpp are only
    capable of running at the full 64-bit precision.)
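
As a rough illustration of the two points above, the sketch below compares a
product that has been stored in a float with the same product kept at higher
precision. It is only a sketch: the function names are made up for this
example, and `volatile` is used here as one assumed way of forcing the store
(and therefore the rounding) to happen.

```c
#include <stdio.h>

/* Hypothetical helper: (x*x rounded to float) minus (x*x kept in double). */
double stored_minus_wide(float x)
{
    /* volatile forces a real store, so the product is rounded to float. */
    volatile float stored = x * x;
    /* The same product held at double precision, like a sub-expression
       that never leaves the coprocessor. */
    double wide = (double)x * (double)x;
    return (double)stored - wide;
}

/* Prints the difference for the value used in the quoted program. */
void show_difference(void)
{
    printf("stored - wide = %g\n", stored_minus_wide(589.5782470703125f));
}
```

For the value in the quoted program the difference is not zero, because the
exact square needs more significand bits than a float can hold. That is the
same kind of residue which shows up in TXMERR.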

If you are not using one of the 80387 emulators then you should be
able to work around these particular problems by setting the
coprocessor precision control (PC) bits (but this will be tedious
unless all of your variables have the same precision). For example, if
you declare all of the floating point numbers in the program to be
doubles then you would just need to add simple code at the start of
your program to set the FPU PC bits to double precision.
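
A minimal sketch of what such start-up code might look like follows. It
assumes gcc-style inline assembly on an x86 target, and the function and
macro names are invented for this example. The PC field occupies bits 8 and 9
of the coprocessor control word, and the value 10 (binary) selects a 53-bit
significand.

```c
/* Precision control (PC) field of the x87 control word: bits 8 and 9. */
#define X87_PC_MASK   0x0300
#define X87_PC_DOUBLE 0x0200   /* binary 10 in the PC field: 53-bit significand */

/* Hypothetical helper: returns cw with PC set to double precision and all
   other bits left unchanged. */
unsigned short with_double_precision(unsigned short cw)
{
    return (unsigned short)((cw & ~X87_PC_MASK) | X87_PC_DOUBLE);
}

/* Hypothetical helper: call once at program start. Does nothing unless
   compiled with a gcc-compatible compiler for x86. */
void set_fpu_double_precision(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short cw;
    __asm__ volatile ("fnstcw %0" : "=m" (cw));  /* read the control word */
    cw = with_double_precision(cw);
    __asm__ volatile ("fldcw %0" : : "m" (cw));  /* write it back */
#endif
}
```

With the usual power-up control word of 0x037F this yields 0x027F. Note that
this changes only the significand width used by the coprocessor; it does not
help if the program mixes float and double variables.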

I should mention that there are other problems which can arise if
tests are made for floating point equality. You may also need to
compile the program without optimisations to avoid these other
problems.

The best solution would be to fix all instances where the code relies
explicitly or implicitly upon the equality of floating point numbers ;-)
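
One common way of making such a fix is to replace each equality test with a
comparison against a tolerance. The sketch below is only one possibility: the
name `nearly_equal` and the choice of a relative tolerance are assumptions
made for this example, and a suitable tolerance depends on the calculation.

```c
/* Hypothetical helper: true when a and b agree to within rel_tol, measured
   relative to the larger of the two magnitudes. */
int nearly_equal(double a, double b, double rel_tol)
{
    double diff  = a > b ? a - b : b - a;
    double mag_a = a < 0 ? -a : a;
    double mag_b = b < 0 ? -b : b;
    double scale = mag_a > mag_b ? mag_a : mag_b;
    return diff <= rel_tol * scale;
}
```

A test such as `if (a == b)` would then become something like
`if (nearly_equal(a, b, 1e-6))`, which accepts values that differ only by
float rounding.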

--Bill
