Date: Tue, 22 Jun 1999 11:15:10 -0500 From: Eric Rudd Subject: Re: libm sources from cyberoptics To: Eli Zaretskii Cc: djgpp-workers AT delorie DOT com Message-id: <376FB68E.671D8763@cyberoptics.com> Organization: CyberOptics MIME-version: 1.0 X-Mailer: Mozilla 4.05 [en] (Win95; U) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7bit References: Reply-To: djgpp-workers AT delorie DOT com Eli Zaretskii wrote: > It's the contradiction between these comments and the documentation of the FP > instructions that got me bogged. Me, too. I should say a little more about this pesky problem. I have been using as v2.8.1, but I obtained a copy of as v2.7, and dredged up my old copy of as v2.5.2, and made a complete examination of this problem. There is indeed a problem, but I was not complete in my description of it a week ago. The following instructions had the same anomalous behaviors in all versions I tested: fsub %st, %st(1) /* st(1) <= st - st(1) */ fsubr %st, %st(1) /* st(1) <= st(1) - st */ fdiv %st, %st(1) /* st(1) <= st / st(1) */ fdivr %st, %st(1) /* st(1) <= st(1) / st */ Similar reversals exist in the corresponding instructions that pop the stack. Confronted with this mess, and (at the time) not understanding exactly how the mnenomics were bungled, I tried reversing the args, which worked on as v2.8.1. Unfortunately, the mnenomics with reversed args are themselves incorrect, and work differently on each of the three different versions of as I tried. (Note that I am making a distinction between mnemonics, such as "fdivr", which one submits to the assembler, and opcodes, which the assembler emits. There's not a one-to-one correspondence here, because of assembler and Intel doc bugs!) The mnemonics listed above work differently from the general rules; I believe that this is so misleading that I have put in code to emit the opcodes directly in every place they occur. > Wouldn't using FSUB and FXCHG be a better way (albeit a slower one)? If, in the future, my routines are assembled with some assembler that doesn't have these bugs, it will relatively easy to fix up the routines, since one can change the mnemonics and check to see that the routines assemble to the same binaries. However, if I work around the problem by changing the desired opcodes, someone will have to understand the operation of the code in order to put it right. > But if you think that using opcodes is better, then go for it. I suggest > leaving a comment there explaining why this was done. I have done that. The comments indicate the mnemonic that should have been there, if gas had been written consistently. > > I need to put in proper stubs for these. > > No, you don't. The library build procedure does that automatically, > by reading and generating an assembly module for each > one of the #define's. OK, I didn't know about that mechanism. I started to edit stubs.h to include the non-ANSI routines, but stopped when I noticed that other non-ANSI functions, such as hypot, are not even in stubs.h. I'm not sure how we should clean up this mess. It seems to me that there are several classes of non-ANSI routines. There are those such as hypot and asinh, which are currently non-ANSI, but have been sanctioned by widespread usage, and, indeed, are specified in C9x. Any program that uses those names for incompatible functions will be in trouble when C9x issues. There are others, such as pow10, which are specified neither in C90 nor C9x, but are in widespread use (I'm tempted to include the exp10 alias in this category as well, by way of analogy to the exp2 function specified in C9x). And finally, there are some unique functions I cooked up, such as sincos and powi, which are not in widespread use, and will probably never make it into ANSI, but nonetheless satisfy a legitimate need. What should we do about the stubbing and prototyping of these various functions? > The docs I sent to you explains that people should not mess with the > rounding mode, but if they do and cannot restore it, they should link > with -lm (which doesn't depend on FPU's rounding mode) and pay the > price in slow-down and perhaps some inaccuracy. I looked into this, and even on the 387, the range of approximation is large enough that the rounding mode should not matter. Thus, you may remove cautions in the docs about the rounding mode. There were various issues raised with the operation of the math functions, so let me summarize what I've done about them: 1. Re-entrancy problems with a few routines. All routines are re-entrant now, to the best of my knowledge. 2. fdiv and fsub anomalies between different versions of gas. Fixed (see above). 3. When should ERANGE be set? I had the philosophy that ERANGE should signal a loss of precision due to over- or underflow. Thus, exp(1.E6) raises ERANGE, but exp(+inf) does not (since the result is exactly infinity, and can be represented without loss of precision). A few of the functions (such as hypot) did not raise ERANGE consistently, but they've been fixed. The hypot function was unexpectedly complicated (there was a 4x4 matrix of special cases), but it's finished now. The complications arose because I wanted to bias the tests in favor of normal arguments. Thus, the normal arguments pass through the function with a few very simple tests that weed out everything that *might* give trouble; the code to deal with seldom-occurring extreme arguments therefore doesn't have to be so efficient. The function therefore behaves correctly for all arguments, but with speed optimized for the common cases. 4. sqrt(-0.) now returns +0. without setting errno. The consensus was that this was best, since -0. compares equal to +0. I also made log(-0.) return -inf with ERANGE, rather than NaN with EDOM, for similar reasons, though I was less enthusiastic about making that change. I made corresponding changes to log2 and log10. 5. The question arose whether to silently pass NaN arguments, or to raise EDOM. It would be weeks of work to change this, since this question arises in every routine. I haven't seen any compelling reason not to raise EDOM, since NaN is certainly outside the domain of definition of all the functions. There are also good reasons to raise EDOM and return NaN for modf(inf) and fmod(inf, x), since infinity has an undefined fractional part (in spite of the C9x committee's claim that infinity is an even integer (!)). 6. There was some question about the behavior of ldexp, but I can't find anything wrong with it. 7. I don't recall any complaints, but I discovered that ceil(inf) and floor(inf) raised EDOM. I fixed that. If there are any other issues I have overlooked in this list, please let me know. If not, I'll upload the new routines. -Eric Rudd