Date: Fri, 3 Feb 95 14:35 MST
From: mat AT ardi DOT com (Mat Hostetter)
To: THE MASKED PROGRAMMER
Cc: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: Re: 256x256 look up optimisation
References: <0098B6F8 DOT 8C10FFC5 DOT 12 AT bsa DOT bristol DOT ac DOT uk>

>>>>> "badcoe" == THE MASKED PROGRAMMER writes:

badcoe> Sorry about the delay to this but I've been away having a
badcoe> baby (or rather my wife has):

Congratulations!  I hope things went well.

>> This is a much different statement than your original question,
>> which unnecessarily brought up two dimensional arrays.

badcoe> Surely a 256x256 array and a 64K array are the same thing
badcoe> (unless you think of 256x256 as table[256][256] but I
badcoe> thought I said that in the first posting (which may not be
badcoe> the one you were reading)).

The difference is that in one case your program may have two separate
indices (one for x and one for y), which may be computed in any number
of different ways; in the other case you have an array of 16 bit
indices lying around.  You can of course transform one to the other,
but at a performance cost you can't afford.

As a reductio ad absurdum, describing the problem as "I have a sixteen
dimensional array with sixteen indices that are either 0 or 1" is
likely to elicit suggestions which are not helpful, even though this
also describes a "64K array".  If you're mapping an array of 16 bit
values to an array of 8 bit values through a lookup table, just say
so.

>> I have a question for you: does %es refer to conventional
>> memory (so you can access the VGA frame buffer), or is all of
>> this taking place in PM memory?

badcoe> Do you mean %esi ? It's in PM memory, I think (in
badcoe> retrospect) that all my references to RM and PM were
badcoe> spurious.

Actually I meant %es, as in the selector.  By changing %es to refer to
conventional memory you could write directly to the VGA board and
maybe avoid an extra copy.
I was wondering if you were using stosb to get a %es override "for
free", or just to get a postincrement of %edi.

>> I noticed a few efficiency problems with your code,

badcoe> What were they ?

Pentium and i486 things, like AGI stalls and use of non-pairable
instructions.  I wrote up a specific list of inefficiencies, but I
don't remember if I posted it or not.  I don't know how to optimize
your code for the i386 in particular.

-Mat
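
For concreteness, the distinction drawn above between "two separate
indices" and "an array of 16 bit indices" can be sketched in C.  This
is only an illustration, not code from the thread: the names table,
lookup_xy and map_indices are hypothetical, and the choice of y as the
high byte of the combined index is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64K-entry table: 16-bit index in, 8-bit value out. */
static uint8_t table[65536];

/* Formulation 1: two separate 8-bit indices, which the caller may
   compute however it likes; they are combined at lookup time.
   Assumption: y is the high byte, x the low byte. */
static uint8_t lookup_xy(uint8_t x, uint8_t y)
{
    return table[((size_t)y << 8) | x];
}

/* Formulation 2: a ready-made array of 16-bit indices is mapped to
   an array of 8-bit values through the same table. */
static void map_indices(const uint16_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = table[src[i]];
}
```

Both forms read the same 64K bytes, but the second has no per-element
index arithmetic left to do, which is why the two descriptions invite
different optimisation advice.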