Mail Archives: djgpp/1997/04/30/20:03:30

From: leathm AT solwarra DOT gbrmpa DOT gov DOT au (Leath Muller)
Message-Id: <199704302348.JAA11429@solwarra.gbrmpa.gov.au>
Subject: Re: Alignment
To: wapex AT silesia DOT top DOT pl (Michal)
Date: Thu, 1 May 1997 09:48:35 +3400 (EST)
Cc: djgpp AT delorie DOT com
In-Reply-To: <3367B958.88@silesia.top.pl> from "Michal" at Apr 30, 97 11:27:52 pm

> As far as I know, double and single operations are the same speed on
> the Pentium. The only instruction that is faster in single precision is
> fdiv, but it takes more time to put 8 pixels (I interpolate u & v in the
> texture and the light value linearly every 8 pixels) than to execute a
> double-precision fdiv. With double I can use some tricks and have better
> precision.
 
No - you're wrong... :)  fdiv, sqrt, fmul, fadd and fsub are all affected
by moving the FPU into single precision mode...
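
For reference, here is a minimal sketch of how the FPU can be moved into
single precision mode. The post does not say how the switch was done, so
this is only one possible approach: it assumes GCC-style inline assembly on
an x86 target, and the names x87_with_precision and x87_set_precision are
made up for illustration. The precision-control field is bits 8-9 of the
x87 control word.

```c
#include <stdint.h>

/* x87 control word: bits 8-9 hold the precision-control (PC) field. */
#define X87_PC_MASK     0x0300u
#define X87_PC_SINGLE   0x0000u  /* 24-bit significand */
#define X87_PC_DOUBLE   0x0200u  /* 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

/* Hypothetical helper: return cw with its PC field replaced by pc.
 * Pure integer arithmetic, so it can be checked on any machine. */
static uint16_t x87_with_precision(uint16_t cw, uint16_t pc)
{
    return (uint16_t)((cw & ~X87_PC_MASK) | (pc & X87_PC_MASK));
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Hypothetical helper: load a new precision setting into the FPU and
 * return the previous control word so the caller can restore it later. */
static uint16_t x87_set_precision(uint16_t pc)
{
    uint16_t old_cw, new_cw;

    __asm__ volatile ("fnstcw %0" : "=m" (old_cw));  /* read control word  */
    new_cw = x87_with_precision(old_cw, pc);
    __asm__ volatile ("fldcw %0" : : "m" (new_cw));  /* write control word */
    return old_cw;
}
#endif
```

Calling x87_set_precision(X87_PC_SINGLE) before the inner loop would select
24-bit precision. The returned value is the old control word, which should
be reloaded with fldcw afterwards, since code that expects double results
will otherwise silently lose precision.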

I also get the impression that you're texturing 8 pixels, lighting 8
pixels, texturing 8 pixels, lighting... etc. Basically, this is _really_
bad for cache coherency - you're better off texturing the complete scanline
and then lighting the complete scanline. I switched to this approach, using a
temporary offscreen memory buffer of 2560 bytes (I do stuff in true colour).
Write the textured pixels to the offscreen buffer (which in my inner loop
never left the 8k cache area per line), and then do your lighting from there...
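
The two-pass idea above can be sketched roughly as follows. This is an
illustration, not the engine's actual code: it assumes a 640-pixel scanline
at 4 bytes per pixel (which would account for the 2560 bytes), a 256x256
texture, 16.16 fixed-point stepping, and a light intensity running from 0
to 256. Perspective correction is left out, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SPAN 640  /* assumed: 640 pixels * 4 bytes = 2560-byte buffer */

/* Small offscreen buffer, reused for every scanline. */
static uint32_t span_buf[MAX_SPAN];

/* Pass 1: sample the texture into the offscreen buffer.
 * u, v, du, dv are 16.16 fixed point; tex is assumed to be 256x256. */
static void texture_span(const uint32_t *tex, size_t n,
                         uint32_t u, uint32_t v, int32_t du, int32_t dv)
{
    if (n > MAX_SPAN)
        n = MAX_SPAN;
    for (size_t i = 0; i < n; i++) {
        span_buf[i] = tex[(((v >> 16) & 255u) << 8) | ((u >> 16) & 255u)];
        u += (uint32_t)du;
        v += (uint32_t)dv;
    }
}

/* Pass 2: light the buffered pixels and write them to the destination.
 * light is 16.16 fixed point; its integer part runs from 0 (black)
 * to 256 (texel unchanged). The caller keeps it inside that range. */
static void light_span(uint32_t *dst, size_t n, uint32_t light, int32_t dlight)
{
    if (n > MAX_SPAN)
        n = MAX_SPAN;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = span_buf[i];
        uint32_t l = light >> 16;
        uint32_t r = (((c >> 16) & 255u) * l) >> 8;
        uint32_t g = (((c >> 8) & 255u) * l) >> 8;
        uint32_t b = ((c & 255u) * l) >> 8;

        dst[i] = (r << 16) | (g << 8) | b;
        light += (uint32_t)dlight;
    }
}
```

Each scanline calls texture_span once and then light_span once, so the
lighting loop only ever reads from the small buffer rather than from the
texture.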

If you're wondering, I had my perspective-correct, sub-pixel accurate, true
colour, light-sourced, Gouraud-shaded engine running at 16 cycles per pixel.
With MMX registers, I could get it running in 9 cycles per pixel... which
is faster than Quake and looks a whole lot better...

Leathal.
