Xref: news2.mv.net comp.os.msdos.djgpp:1717
From: korpela AT islay DOT ssl DOT berkeley DOT edu (Eric J. Korpela)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: ASM code & Random
Date: 7 Mar 1996 01:47:22 GMT
Organization: Cal Berkeley-- Space Sciences Lab
Lines: 57
Message-ID: <4hlf7a$ofg@agate.berkeley.edu>
References:
NNTP-Posting-Host: islay.ssl.berkeley.edu
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

In article , wrote:
>A9>This is the code:
>
>asm("
>   pusha
>   movl $0xa0000,%edi
>   movl _virt,%esi     # virt declared somewhere
>   movl $32000,%ecx
>   movw _dos_seg,%es
>   cld
>   rep movsw
>   popa
>")
>
>   How's this for more speed?  The REP command will repeat the
>string command immediately following it (MOVSW).  I also shortened all
>those push/pop operations into one pusha/popa command.. like 6 less
>operations plus the loop should be a bit faster..

Some points to consider.....

You shouldn't be using "rep movsw"; you should be using "rep movsl".
Moving 32-bit chunks will probably be faster than 16-bit chunks, even
over an ISA channel.

On a 486, "rep movsl" should be the fastest method, but not so on a
Pentium.  On a Pentium, use a load/store loop with two 32-bit loads and
two 32-bit stores, like the one below.  (Don't quote me on syntax here.
I'm no genius when I don't have a reference handy.)  It's a bit harder
with far pointers, but it can be done: put the segment override byte in
front of the last two movl's (the stores).  I think segment overrides
cost cycles, though.

        movl    _virt,%esi              # source: the off-screen buffer
        movl    _vid_nearptr,%edi       # destination: near pointer to video memory
        movl    $7999,%ecx              # 64000 bytes / 8 per pass, minus 1
loop1:
        movl    (%esi,%ecx,8),%edx      # two 32-bit loads from the buffer...
        movl    4(%esi,%ecx,8),%eax
        movl    %edx,(%edi,%ecx,8)      # ...then two 32-bit stores to video
        movl    %eax,4(%edi,%ecx,8)
        decl    %ecx
        jge     loop1                   # falls through once %ecx reaches -1

On a final note, the pusha and popa don't save any time.  It still takes
a cycle per longword pushed, and the pushes can't be paired.  Instead,
let gcc know which registers are used, and let it save them if
necessary.  There's not much point in pushing %esi if it only held
garbage that gcc was going to discard anyway.
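As a rough sketch of what "letting gcc know" can look like, here is a
copy routine written with gcc's extended asm.  The function name
copy_longs is made up for illustration, and the sketch assumes that both
pointers live in the normal data segment (so no %es reload, unlike the
quoted code that targets 0xa0000 through _dos_seg).  The "+D", "+S" and
"+c" operands tell gcc that %edi, %esi and %ecx are read and modified,
so no pusha/popa is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n_longs 32-bit words from src to dst.
 * Hypothetical helper: the name and signature are assumptions made for
 * this sketch, not something taken from the original thread. */
static inline void copy_longs(void *dst, const void *src, size_t n_longs)
{
    assert(dst != NULL && src != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__(
        "cld\n\t"       /* make the string instruction count upward */
        "rep movsl"     /* copy the counted longwords, source to destination */
        : "+D"(dst), "+S"(src), "+c"(n_longs) /* registers read AND modified */
        :                                     /* no input-only operands */
        : "memory", "cc");  /* memory is written; be conservative about flags */
#else
    /* Fallback so the sketch still builds on non-x86 targets. */
    memcpy(dst, src, n_longs * 4);
#endif
}
```

Calling copy_longs(screen, virt, 16000) would move 64000 bytes; gcc
decides for itself whether anything in those three registers is worth
saving around the asm.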
Look up the syntax of asm to see how it's done.

Eric

-- 
Eric Korpela                        |  An object at rest can never be
korpela AT ssl DOT berkeley DOT edu |  stopped.