Date: Wed, 24 Sep 1997 11:33:16 +0300 (IDT)
From: Eli Zaretskii
To: Jan Hubicka
cc: djgpp AT delorie DOT com
Subject: Re: Strange benchmark results
In-Reply-To: <19970922141413.23845@horac.ta.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk

On Mon, 22 Sep 1997, Jan Hubicka wrote:

> BTW what hardware/cache do you use?

The cache used was that of Windows 95.  Its size changes dynamically, as
Windows sees fit.

> I think that number of
> syscalls is significant. In EMX it is three syscalls - open, buffered write,
> close. In DJGPP it is 102 syscalls :) Surely write is not buffered. But
> in dos, where file becomes visible once it is closed, it doesn't
> bother whether write call is buffered or not.

My testing indicates that the problem is the number of calls to
`__dpmi_int'.  I have compared your original program, which does 100,000
writes of 1 byte, with a slightly modified one, that does 10,000 writes
of 700 bytes.  The following timings were taken on a P166 with plain DOS
5.0 and a large SmartDrv cache:

                   1-byte writes    700-byte writes

    Turbo C 2.0       2.1 sec           2.4 sec
    DJGPP 2.01        6.8 sec           3.1 sec

Observe how, with DJGPP, the second program wrote 70 times more bytes to
the disk than the first one (7MB as opposed to 100KB), but actually took
only half the time to do that!  With TC, the times are roughly equal.

The reason is IMHO obvious: the first program did 100,000 calls to
`__dpmi_int' and mode switches, while the second only did it 10,000
times.  This 10-fold difference in the mode switches is the main reason
for the slowness in the first case.

So my conclusion is that, at least in the case of real-mode compilers
such as TC and BC, `write' is NOT buffered; the reason for the DJGPP
performance hit is the mode switch that eats up a lot of CPU cycles.

I don't know what happens in EMX/RSX, but it is possible that `write'
isn't buffered there either.
It is possible that the extender (not the application) buffers the writes
and only delivers them to DOS in large chunks, whereas DJGPP is an
extenderless environment.

In any case, I'm not sure whether optimizing 1-byte unbuffered writes is
of any practical value.  If you think it is, feel free to submit changes
to libc functions to DJ Delorie.

If you want to compare the speed of real-mode interrupts (as your
original message indicates), I suggest using a service where buffering
is not an issue at all, such as some simple BIOS service.
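
For anyone who wants to repeat the comparison, here is a minimal sketch of
the kind of test described above.  It is not the original benchmark
program (which was not posted in this message); the helper name
`time_writes', the file name, and the use of `clock' for timing are
assumptions made for illustration only.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): issue `count'
   unbuffered write() calls of `size' bytes each to `path'.
   Returns the time reported by clock(), in seconds, or -1.0 on error.  */
double time_writes(const char *path, long count, size_t size)
{
    char *buf;
    clock_t start, end;
    long i;
    int fd;

    assert(count > 0 && size > 0);

    buf = malloc(size);
    if (buf == NULL) {
        perror("malloc");
        return -1.0;
    }
    memset(buf, 'x', size);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        free(buf);
        return -1.0;
    }

    start = clock();
    for (i = 0; i < count; i++) {
        /* Each iteration is one call into the OS; under DJGPP that
           means one `__dpmi_int' call and its mode switches.  */
        if ((long)write(fd, buf, size) != (long)size) {
            perror("write");
            close(fd);
            free(buf);
            return -1.0;
        }
    }
    close(fd);      /* the close is timed together with the writes */
    end = clock();

    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_writes("test.dat", 100000L, 1) and then
time_writes("test.dat", 10000L, 700) from `main' corresponds to the two
cases in the table.  Note that what `clock' measures differs between
systems, so the absolute numbers are only comparable on the same machine
and with the same compiler.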