Mail Archives: djgpp/1998/10/19/04:28:26

Date: Mon, 19 Oct 1998 10:26:29 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: Ludvig Larsson <ludvig AT club-internet DOT fr>
cc: djgpp AT delorie DOT com
Subject: Re: superslow simpel rep stosl, why?
In-Reply-To: <362A6AFE.599F@club-internet.fr>
Message-ID: <Pine.SUN.3.91.981019102612.7874P-100000@is>
MIME-Version: 1.0
Reply-To: djgpp AT delorie DOT com

On Mon, 19 Oct 1998, Ludvig Larsson wrote:

> On my AMD K6-2 300 MHz it takes 0.006 sec, which gives about
> 100 million bytes/sec. Quite a bit, right?
> But should it take 3 clock cycles to clear each byte?
> I'm clearing quadwords...
> 
> I'm using asm(rep stosl).
> 
> Is this normal?

Why not?  On a 486, STOSD is documented to require 5 clocks per store,
so it doesn't strike me as terribly wrong to get 3 clocks on a K6.  Keep
in mind that it doesn't just store the dword, it also increments a
pointer and decrements a count as it goes.
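
For reference, the arithmetic behind the figures in this thread can be
sketched in a few lines of C.  This is only an illustration: the
function names are made up, and it assumes the buffer is 600 KB
(600 * 1024 bytes, the size mentioned later in this message), the fill
takes 0.006 sec, and the CPU runs at 300 MHz.

```c
#include <stdio.h>

/* Clocks spent per byte, given buffer size, elapsed time and CPU clock.
   (Hypothetical helper, not part of any library.) */
double clocks_per_byte(double bytes, double seconds, double cpu_hz)
{
    return (seconds * cpu_hz) / bytes;
}

/* Print throughput plus the per-byte and per-dword cost.
   A dword is 4 bytes, so the per-dword cost is 4 times the per-byte cost. */
void report(double bytes, double seconds, double cpu_hz)
{
    double cpb = clocks_per_byte(bytes, seconds, cpu_hz);

    printf("%.1f MB/s, %.2f clocks/byte, %.2f clocks/dword\n",
           bytes / seconds / 1e6, cpb, 4.0 * cpb);
}
```

Calling report(600.0 * 1024, 0.006, 300e6) prints
"102.4 MB/s, 2.93 clocks/byte, 11.72 clocks/dword".  So under these
assumptions, 3 clocks per byte works out to roughly 12 clocks for each
dword stored.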

> Is it some sort of cache thing? (My AMD has 64 KB on-chip,
> 512 KB burst, and the rest is 100 MHz memory.)

Writing to memory isn't usually affected by the cache, especially if
you write more than the cache size (600k as opposed to 512k).  Try
halving the buffer size and see if the timing per dword changes.
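
A minimal sketch of that experiment in C follows.  It is not the
original poster's code: the function names are hypothetical, a plain
store loop stands in for the inline `rep stosl`, and timing uses the
standard clock() function, whose resolution may be coarse, hence the
repeat count.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Store `value` into n consecutive dwords.  Portable stand-in for
   rep stosl; `volatile` keeps the compiler from dropping the stores. */
void fill_dwords(volatile uint32_t *dst, size_t n, uint32_t value)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = value;
}

/* Average nanoseconds per dword over `reps` fills of a buffer of
   `bytes` bytes.  Returns -1.0 on bad arguments or failed allocation. */
double ns_per_dword(size_t bytes, int reps)
{
    size_t n = bytes / sizeof(uint32_t);
    uint32_t *buf;
    clock_t start, end;
    int r;

    if (n == 0 || reps <= 0)
        return -1.0;

    buf = malloc(n * sizeof(uint32_t));
    if (buf == NULL)
        return -1.0;

    start = clock();
    for (r = 0; r < reps; r++)
        fill_dwords(buf, n, 0);
    end = clock();

    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9
           / ((double)n * reps);
}
```

Comparing ns_per_dword(600 * 1024, 100) with ns_per_dword(300 * 1024, 100)
shows whether the per-dword cost changes once the buffer fits in the
512k cache.  Multiplying the result by the clock rate in GHz (0.3 for a
300 MHz CPU) converts it to clocks per dword.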
