From: Kovacs Viktor Peter
Newsgroups: comp.os.msdos.djgpp
Subject: Re: MISSION: Making world's fastest pixel drawing possible
Date: Thu, 18 Jan 2001 12:34:35 +0000
Organization: Budapest University of Technology and Economics

> Contents:
> - Fastest possible way to draw pixels (well, maybe the near pointer hack
> mentioned in DJGPP FAQ is faster :)). Drawing developed as a macro (because
> jumping to function and back again takes more time than drawing the damn
> pixel !). Screen coordinates pre-calculated in the global table.
>
> Feedback needed about the following subjects:
> 1) Is it necessary to save ES segment register (the macro currently does not
> do it, but it seems to work fine, no crashes, etc...) ?
> 2) Is the assembly command "les %0,%%edi" faster than the currently used
> "movw %0,%%es;movl %1,%%edi" pair ?

Have you heard about the far pointer hack in DJGPP? It even lets you load
the segment selector once, outside a tight loop... (It was designed to work
on systems that don't allow segment limits to be set to 4GB.)

About your questions:
-yes, you should save ES, because other code (memcpy, for example) might use it...
-les: on the P4 it is slower; on other systems it is about the same
-precalculated tables: when you use integer math only, it is faster to
 calculate the offset on the fly, because that saves a few TLB and cache
 entries (so the overall speed is better)
-tables are meant to replace float math, not integer math...
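
To make the two suggestions above concrete, here is a minimal sketch, not the original poster's macro. It assumes VGA mode 13h (320x200, 8-bit color, video memory at linear address 0xA0000); the names pixel_offset, hline and framebuffer are hypothetical. Under DJGPP it uses _farsetsel and _farnspokeb from <sys/farptr.h> so the selector is loaded once before the loop; anywhere else it falls back to a plain array so the offset arithmetic can still be checked.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __DJGPP__
#include <go32.h>        /* _dos_ds */
#include <sys/farptr.h>  /* _farsetsel, _farnspokeb */
#endif

enum { SCREEN_W = 320, SCREEN_H = 200 };   /* assumption: VGA mode 13h */

/* Integer-only offset computed on the fly: y * 320 == (y << 8) + (y << 6).
   No lookup table, so no extra cache or TLB entries are touched. */
static inline uint32_t pixel_offset(uint32_t x, uint32_t y)
{
    return (y << 8) + (y << 6) + x;
}

#ifndef __DJGPP__
/* Stand-in framebuffer so the sketch also builds outside DJGPP. */
static uint8_t framebuffer[SCREEN_W * SCREEN_H];
#endif

/* Hypothetical helper: draw a horizontal run of pixels from x0 to x1. */
static void hline(uint32_t x0, uint32_t x1, uint32_t y, uint8_t color)
{
    uint32_t off, end;

    assert(x0 <= x1 && x1 < SCREEN_W && y < SCREEN_H);
    off = pixel_offset(x0, y);
    end = pixel_offset(x1, y);
#ifdef __DJGPP__
    _farsetsel(_dos_ds);                   /* selector loaded once, before the loop */
    for (; off <= end; ++off)
        _farnspokeb(0xA0000 + off, color); /* per-pixel write, no selector reload */
#else
    for (; off <= end; ++off)
        framebuffer[off] = color;          /* fallback: ordinary memory write */
#endif
}
```

Whether the shift-and-add form actually beats a table on a given CPU is something to measure rather than assume; the sketch only shows the shape of the two ideas.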