From: leathm AT solwarra DOT gbrmpa DOT gov DOT au (Leath Muller)
Message-Id: <199703030209.MAA08035@solwarra.gbrmpa.gov.au>
Subject: Re: Netlib code [was Re: flops...]
To: jbennett AT ti DOT com
Date: Mon, 3 Mar 1997 12:09:05 +1000 (EST)
Cc: djgpp AT delorie DOT com
In-Reply-To: <5fd1a8$ag6$2@superb.csc.ti.com> from "Jesse Bennett" at Mar 2, 97 11:08:56 pm
Content-Type: text

> This is a very simple function (but also very important in numerical
> applications). Understanding how to coerce GCC into producing near
> optimal code (without obfuscating the source) for the matrix
> multiplication problem would be very beneficial to my work, since the
> required "tricks" should be widely applicable to my code. I would
> like to hear any further thoughts or ideas on this subject.

The FPU code generated by gcc is nothing short of weird... :)

I am coding for the Pentium, and as a result have started converting my
entire texture mapping routines to asm and doing the optimizations myself,
because DJGPP does a lot of repetitive things. Look at the code generated
every time you want to store an integer.

Question: Is there any way to let DJGPP know I am running in single
precision mode, and stop it doing all the crap it does every time I want
to store an integer? It would save a lot of hassle.

I don't have the code on me, so this is from memory, but to do a fistp
myself running in single precision, I normally have something like:

    flds   _a
    fmuls  _b
    fadds  _c
    fistpl _d

Whereas DJGPP goes:

    flds   _a
    fmuls  _b
    fadds  _c
    fstcw  -4(%ebp)
    fldcw  -8(%ebp)
    fistpl _d
    fldcw  -4(%ebp)

etc etc, only with a _lot_ more crap in the middle.

Basically, I want to tell DJGPP _not_ to save and reload the FPU control
word, as there is no need! Especially when I am doing a lot of fistp's,
as this just kills performance...

Leathal.
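
For context on the sequence above: C requires a float-to-int conversion to truncate toward zero, while fistp rounds according to the current FPU rounding mode (round-to-nearest by default), so the compiler switches the control word around each conversion. A minimal sketch of one way to sidestep that from C, assuming round-to-nearest results are acceptable for the texture mapper, is a small wrapper around fistpl using GCC extended inline asm. The name fast_ftoi is hypothetical, and whether a given DJGPP/GCC release accepts these x87 constraints is an assumption to check; on non-x86 targets the sketch falls back to C99 lrintf, which also honours the current rounding mode.

```c
#include <math.h>

/* Hypothetical helper (not from the original post): convert a float to an
   int using the FPU's *current* rounding mode instead of C's truncation,
   so no control-word save/modify/restore is needed around the store. */
static inline int fast_ftoi(float x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result;
    /* "t" places x on top of the x87 stack; fistpl stores it as a 32-bit
       integer and pops the stack, which is why "st" is listed as clobbered. */
    __asm__ __volatile__ ("fistpl %0" : "=m" (result) : "t" (x) : "st");
    return result;
#else
    /* Portable fallback: lrintf rounds using the current rounding mode. */
    return (int)lrintf(x);
#endif
}
```

With the default rounding mode, fast_ftoi(a * b + c) rounds to the nearest integer (ties to even), whereas (int)(a * b + c) truncates, so the two can differ by one. The input must also fit in an int; out-of-range values are not handled here.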