Date: Tue, 21 Mar 95 15:09 MST
From: mat AT ardi DOT com (Mat Hostetter)
To: DJGPP AT sun DOT soe DOT clarkson DOT edu
Subject: Re: A quick way to copy n bytes
References: <5562970E5 AT fs2 DOT mt DOT umist DOT ac DOT uk>

>>>>> "A" == A Appleyard <A DOT APPLEYARD AT fs2 DOT mt DOT umist DOT ac DOT uk> writes:

A> void str_cpy(void*s,void*t,int n){
A> asm("pushl %esi"); asm("pushl %edi"); asm("cld");
A> asm("movl 8(%ebp),%edi"); asm("movl 12(%ebp),%esi");
A> asm("movl 16(%ebp),%ecx"); asm("rep"); asm("movsb"); asm("popl %edi");
A> asm("popl %esi");}

First of all, this is a glaringly incorrect use of gcc inline asm and
you're lucky it works at all.  You never tell gcc what registers you
are clobbering, and you split it up into separate asm statements so
gcc is free to clobber any and all registers between asm statements if
it wants to do so.

Second, the V1 memcpy does this anyway (uses rep ; movsb).  memcpy is
written in assembly.

Third, the memcpy in the djgpp V2 alpha does aligned long-at-a-time
transfers, and will be faster than your routine for large copies.
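
To give a feel for what "aligned long-at-a-time" means, here is a rough
sketch in plain C.  This is not the V2 source; the name copy_by_longs
and the word_t typedef are made up for illustration, and it assumes gcc
(for the may_alias attribute) and a 32-bit long, as on djgpp.  It copies
single bytes until the destination is aligned, then whole 32-bit words,
then the leftover bytes:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type.  may_alias is a gcc extension that lets us
   access arbitrary memory through this type without breaking the
   aliasing rules.  */
typedef uint32_t __attribute__ ((__may_alias__)) word_t;

/* Illustrative sketch only, not the djgpp V2 memcpy.  */
void *
copy_by_longs (void *dst, const void *src, size_t count)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Head: single bytes until the destination is 4-byte aligned.  */
  while (count > 0 && ((uintptr_t) d & 3) != 0)
    {
      *d++ = *s++;
      count--;
    }

  /* Body: 32 bits at a time, but only if the source ended up
     aligned as well.  */
  if (((uintptr_t) s & 3) == 0)
    {
      while (count >= 4)
        {
          *(word_t *) d = *(const word_t *) s;
          d += 4;
          s += 4;
          count -= 4;
        }
    }

  /* Tail: whatever is left (everything, if the source was misaligned).  */
  while (count > 0)
    {
      *d++ = *s++;
      count--;
    }

  return dst;
}
```

For a large copy the middle loop moves four bytes per iteration instead
of one, which is where the gain over a byte loop would come from.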

If you really wanted to write this function, you would write ONE asm
statement like:

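void
my_memory_copy (void *dst, void *src, int count)
{
  /* "+S", "+D" and "+c" bind src, dst and count to %esi, %edi and %ecx
     and mark them as modified; listing those registers as clobbers as
     well would conflict with the operands.  */
  asm volatile ("cld\n\t"
		"rep\n\t"
		"movsb"
		: "+S" (src), "+D" (dst), "+c" (count)
		:
		: "memory", "cc");
}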

Of course, if you don't inline this function it ends up being
essentially the V1 memcpy.  So why not just use the libc function?
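
If you did want the inlined version, a minimal sketch might look like
the following.  The name inline_byte_copy is made up, and the #if guard
with the memcpy fallback is my own addition so that the file still
compiles when the target is not an x86 or the compiler is not gcc:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical inlinable wrapper around rep movsb.  */
static inline void
inline_byte_copy (void *dst, const void *src, size_t count)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  /* One asm statement; the "+" operands tell gcc that the source,
     destination and count registers are all read and modified.  */
  __asm__ __volatile__ ("cld\n\t"
                        "rep\n\t"
                        "movsb"
                        : "+S" (src), "+D" (dst), "+c" (count)
                        :
                        : "memory", "cc");
#else
  /* Fallback for other targets: just use libc.  */
  memcpy (dst, src, count);
#endif
}
```

Because the function is static inline, gcc can expand it at each call
site and pick up whatever values already sit in the right registers.
Whether that beats a call to memcpy is something to measure, not assume.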

-Mat