Mail Archives: djgpp/1994/11/07/08:09:43

Date: Mon, 7 Nov 1994 10:18:42 +0100
From: Kim Jersin <u940422 AT daimi DOT aau DOT dk>
To: djgpp AT sun DOT soe DOT clarkson DOT edu
Cc: u940422 AT daimi DOT aau DOT dk
Subject: Fast disk I/O in DJGPP programs.

To all in need of reading large amounts of data... fast!

Last week there was some mail about poor disk I/O performance under the 
go32 DOS extender. The conclusion was that go32 has to switch to real 
mode whenever it needs data from a DOS file or device, and that for 
large transfers it has to do so many times because its transfer buffer 
is small. That is a problem not only for go32 but for any DOS extender. 
The mode switch takes some time, because all registers have to be 
saved and reloaded with real mode values, a new stack frame has to be 
set up, etc., and all of this has to be reversed on return.

I have come up with a solution that minimizes the problem (please take 
a look at the two benchmark tests below), at least on configurations 
with plenty of RAM at hand (8Mb or more). I hope that some of you will 
take the time to do some testing, especially the one who made the 
benchmark tests under different DOS extenders (sorry, I can't remember 
your name; my received mailbox with all my old mail has vanished - 
I am a novice on the Unix platform).

The solution that I have sketched out, in the source code following the 
benchmark tests, uses a small TSR (gpphlp.asm) which must be installed 
before running go32, and also before starting Windows if you prefer 
to run your programs in a Windows DOS box (which actually gives better 
performance to the go32 extended program, when used with 32-bit disk and 
file access turned on, than when run from DOS using the QEMM DPMI 
manager).

The TSR does as much continuous reading as the available DOS memory 
allows (up to approx. 448Kb under Windows) in 32Kb blocks, and the 
processor remains in real mode for the whole duration (all 448Kb). 
Control then goes back to the extended program, which copies the DOS 
memory contents into whatever buffer was requested by the calling 
function. The TSR is then called again for the next chunk of data, 
etc., until the number of requested bytes has been read.
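
To make the two-level scheme above concrete, here is a minimal portable 
C++ sketch. It is not the DJGPP code from this post: the file is 
simulated by an in-memory array, tsr_fill() is a hypothetical stand-in 
for one call into the TSR, and all struct and function names are 
invented for illustration. It only shows how one switch per DOS buffer 
can cover many 32Kb block reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Counters so the number of simulated switches can be inspected.
struct TransferStats {
    std::size_t mode_switches = 0;  // simulated round trips into real mode
    std::size_t dos_reads = 0;      // simulated INT 21h AH=3Fh block reads
};

// Hypothetical stand-in for one TSR call: fills the "DOS buffer" from src
// in blocks of at most block_bytes without returning in between.
std::size_t tsr_fill(const char* src, std::size_t want, char* dos_buf,
                     std::size_t block_bytes, TransferStats& stats) {
    assert(block_bytes > 0);
    ++stats.mode_switches;  // one switch covers the whole DOS buffer
    std::size_t done = 0;
    while (done < want) {
        std::size_t n = std::min(block_bytes, want - done);
        std::memcpy(dos_buf + done, src + done, n);
        ++stats.dos_reads;
        done += n;
    }
    return done;
}

// Outer loop, in the role of HugeRead(): one TSR call per DOS buffer,
// then one copy into the caller's destination buffer.
TransferStats chunked_copy(const std::vector<char>& src, std::vector<char>& dst,
                           std::size_t dos_buf_bytes, std::size_t block_bytes) {
    assert(dos_buf_bytes > 0);
    TransferStats stats;
    std::vector<char> dos_buf(dos_buf_bytes);
    dst.resize(src.size());
    std::size_t pos = 0;
    while (pos < src.size()) {
        std::size_t want = std::min(dos_buf_bytes, src.size() - pos);
        std::size_t got = tsr_fill(src.data() + pos, want, dos_buf.data(),
                                   block_bytes, stats);
        std::memcpy(dst.data() + pos, dos_buf.data(), got);
        pos += got;
    }
    return stats;
}
```

With a 2097152-byte request, a 458752-byte DOS buffer and 32768-byte 
blocks (the figures from the WFW run below), chunked_copy() counts 5 
simulated switches for 64 block reads.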

The entire control lies within the high level function HugeRead(), 
since error checking etc. is a troublesome thing to do in assembler. 
The goal is also to reduce the memory required by the TSR to a minimum. 
At the moment this is 448 bytes, which will increase by a few hundred 
bytes when fully equipped with both read and write. On the other hand, 
the TSR is currently held in an .EXE file. If coded into a .BIN or .SYS 
file and loaded in the config.sys file, the size can be reduced by some 
200 bytes.

There is a small catch:
It only runs on platforms where DPMI services are available 
(fortunately, most memory management utilities provide them these 
days). That's because DOS memory is allocated and freed only when 
needed, and from the extended program.

I use the term sketch when talking about the two programs. This means 
that what is done is done and nothing more for the moment (e.g. you can 
read data but not write). It should work as expected, but of course 
there is no warranty and you use it at your own risk. I also hope that 
it could lead to a discussion of perhaps incorporating it into future 
releases of the DJGPP package, or more generally of having a small real 
mode based helper TSR to help minimize mode switching when using real 
mode facilities.

It would be fun to finish it if I had the time. If you have the time to 
wait till the end of the year (I have some really serious study coming 
up at the University) and if there is an interest in some of the ideas 
mentioned here, then I would be happy to round off the corners, e.g. the 
method of using a fixed interrupt is very fast to code but fails if 
someone else also uses it. Perhaps writing a specialized streambuf for 
use in the iostream library etc., you name it.

Comments please..

---
Kim Jersin.

Benchmarks:
-----------
The machine used is an i386/387 33MHz clone with 8Mb RAM and a 405Mb 
Conner AT-bus hard disk. The reading was done on a single contiguous 
file.

Norton sysinfo states approx. 1150 Kbytes/sec in continuous read. So the 
roughly 90Kb (Windows) and 200Kb (QEMM DPMI) per second less that the 
two examples read is probably due to the time it takes to copy from DOS 
(real mode) memory to go32 application memory.
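
The arithmetic behind those two figures can be checked with a small 
sketch. The function names are made up for illustration; the formula 
mirrors the "KBytes pr. second" line of the test program (whole 
kilobytes divided by elapsed seconds), and the inputs are the byte 
counts and timings from the two runs listed further down.

```cpp
#include <cassert>

// KBytes per second, computed the way the test program prints it:
// whole kilobytes read divided by elapsed seconds.
double kbytes_per_second(long bytes_read, double seconds) {
    assert(seconds > 0.0);
    return (bytes_read / 1024) / seconds;
}

// Shortfall against a reference rate, e.g. the Norton sysinfo figure.
double shortfall(double reference_kbps, double measured_kbps) {
    return reference_kbps - measured_kbps;
}
```

For the WFW run, kbytes_per_second(2097152, 1.92258) gives about 1065, 
i.e. roughly 85 below the 1150 reference; for the QEMM run, 
kbytes_per_second(2146304, 2.19724) gives about 954, roughly 196 below.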

Both benchmark tests were done on a freshly booted machine to prevent 
any cached data (especially the Windows 32-bit disk and file access 
caching) from influencing the results. Both tests were done on the same 
configuration (same autoexec.bat and config.sys files), i.e. the QEMM 
DPMI manager was also present when Windows was started (but probably 
taken over by Windows).

The two tables are redirected output, as it would appear on screen.

In WFW 3.11 dos box:
--------------------
Trying to read 2097152 bytes
Allocate 0xFFFF:
        return: 8
        Paragraphs available: 30611

Paragraphs to allocate: 28672
Reading 458752 bytes, copying to memory above 1Mb.
Reading 458752 bytes, copying to memory above 1Mb.
Reading 458752 bytes, copying to memory above 1Mb.
Reading 458752 bytes, copying to memory above 1Mb.
Reading 262144 bytes, copying to memory above 1Mb.
Free dos mem return: 0
Bytes read: 2097152
Time elapsed: 1.92258
KBytes pr. second: 1065.23

QEMM DPMI manager:
------------------
Trying to read 2146304 bytes
Allocate 0xFFFF:
        return: 8
        Paragraphs available: 24786

Paragraphs to allocate: 24576
Reading 393216 bytes, copying to memory above 1Mb. 
Reading 393216 bytes, copying to memory above 1Mb. 
Reading 393216 bytes, copying to memory above 1Mb. 
Reading 393216 bytes, copying to memory above 1Mb. 
Reading 393216 bytes, copying to memory above 1Mb. 
Reading 180224 bytes, copying to memory above 1Mb. 
Free dos mem return: 0
Bytes read: 2146304
Time elapsed: 2.19724
KBytes pr. second: 953.924
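
The buffer sizes and the sequence of "Reading N bytes" lines in both 
listings follow from rounding the available paragraphs down to a 
multiple of 0x800 (32Kb). A minimal sketch of that arithmetic, with 
hypothetical helper names that are not part of the program below:

```cpp
#include <cassert>
#include <vector>

const unsigned PARAGRAPH_BYTES = 16;      // one DOS paragraph
const unsigned BLOCK_PARAGRAPHS = 0x800;  // 0x800 * 16 = 32Kb, the TSR block size

// Round the available paragraph count down to whole 32Kb blocks.
unsigned usable_paragraphs(unsigned available) {
    return (available / BLOCK_PARAGRAPHS) * BLOCK_PARAGRAPHS;
}

// Sizes of the successive read steps for a request of request_bytes.
// Returns an empty plan if less than one 32Kb block is available.
std::vector<unsigned long> read_plan(unsigned long request_bytes,
                                     unsigned available_paragraphs) {
    unsigned long buf =
        static_cast<unsigned long>(usable_paragraphs(available_paragraphs)) *
        PARAGRAPH_BYTES;
    std::vector<unsigned long> plan;
    if (buf == 0) return plan;
    while (request_bytes > 0) {
        unsigned long n = request_bytes < buf ? request_bytes : buf;
        assert(n > 0);  // every step makes progress
        plan.push_back(n);
        request_bytes -= n;
    }
    return plan;
}
```

For example, 30611 available paragraphs round down to 28672, i.e. a 
458752-byte buffer, so a 2097152-byte request takes four full steps 
plus one of 262144 bytes, as in the WFW listing.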


The C++ high level routines:
----------------------------
When compiling please be sure to include the iostream library 
(libiostr.a).

This code is where the most improvement of usability is to be done. I 
hope that it is written in such a way that it is possible to pick out 
the essence of it if you need to use (rewrite) it.

#include <sys/types.h>
#include <djgppstd.h>
#include <stdio.h>
#include <iostream.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <dos.h>
#include <go32.h>
#include <dpmi.h>
#include <gppconio.h>
#include <time.h>

int HugeRead( int fHandle, void *Buf, int Count )
{
    // Allocate dos memory to be used as transfer buffer.
    // The size is the largest contiguous buffer divisible by 32kb.
    // -- 32kb is the block size GppHlp uses for reading.
    _go32_dpmi_seginfo info;
    memset( &info, 0, sizeof(info) );
    info.size= 0xFFFF;  // Try to allocate more than is available
    cout << "Allocate 0xFFFF:\n";
    cout << "   return: " << _go32_dpmi_allocate_dos_memory(&info) << "\n";
    cout << "   Paragraphs available: " << info.size << "\n";
    
    info.size= (info.size/0x800)*0x800; // Paragraphs to allocate
    int BufSize= info.size<<4;          // Size in bytes is more convenient
    cout << "\nParagraphs to allocate: " << info.size << "\n";

    int SizeRead= 0;
    if( info.size!=0 && _go32_dpmi_allocate_dos_memory(&info)==0 )
    {
        _go32_dpmi_registers r;
        memset(&r, 0, sizeof(r));
        r.x.bx= (u_short) fHandle;
        r.x.dx= 0;
        r.x.ds= info.rm_segment;
        
        // Read as much from the file as DOS memory allows, and copy it to
        // the destination buffer, in each loop.
        r.x.ax= 1;
        for(; ((int)r.d.eax)>0 && Count>0; Count-=r.d.eax )
        {
            if( BufSize>Count )
                r.d.ecx= (u_long) Count;
            else
                r.d.ecx= (u_long) BufSize;
            cout << "Reading " << r.d.ecx << " bytes";
            
            r.x.ax= 0x213F;
            _go32_dpmi_simulate_int(0x65, &r);
            cout << ", copying to memory above 1Mb. \n";
            
            if( ((int)r.d.eax)> 0 )     // Negative values are error codes
            {
                // Copy the read data from DOS mem to the buffer
                dosmemget(((u_long)info.rm_segment)<<4, r.d.eax, Buf);
                Buf= (char*)Buf + r.d.eax;  // No arithmetic on void*
                SizeRead+= r.d.eax;
            }
        }
        
        // Cleanup
        cout << "Free dos mem return: " << _go32_dpmi_free_dos_memory(&info)
             << "\n";
    }
    return SizeRead;
}   
    
int TestHugeRead()
{
    int BytesRead= 0;
    int data;
    data= open( "data", O_RDONLY );
    if( data>=0 )       // open() returns -1 on failure; 0 is a valid handle
    {
        int BufSize;
        cout << "Number of Kbytes to read: ";
        cin >> BufSize;
        BufSize= BufSize<<10;
        cout << "\nTrying to read " << BufSize << " bytes\n";
        char *Buf= (char*) malloc(BufSize);
        if( Buf )
        {
            clock_t EndClock, StartClock= clock();
            if( (BytesRead= HugeRead(data, Buf, BufSize))>=0 )
            {
                // Some statistics
                EndClock= clock();
                double TimeElapsed= ((double)EndClock- 
                                    (double)StartClock)/CLOCKS_PER_SEC;
                cout << "Bytes read: " << BytesRead << "\n";
                cout << "Time elapsed: " << TimeElapsed << "\n";
                cout << "KBytes pr. second: " << BytesRead/1024/TimeElapsed << "\n";
            }
            else
                cerr << "Error reading <data>\n";
            free(Buf);
        }
        else
            cerr << "Error allocation buffer\n";
        close(data);
    }
    else
        cerr << "Error opening <data>\n";
        
    return BytesRead;
}

const char GppHlpStr[]= "GPPHLP";
int main()
{
    int Res= 0;
    // Make sure the GppHlp int vector has been set
    _go32_dpmi_seginfo iv;
    _go32_dpmi_get_real_mode_interrupt_vector(0x65, &iv);
    if( iv.rm_segment!= 0 || iv.rm_offset!= 0)
    {
        // Check for the existence of the GppHlp driver
        char CheckStr[7];
        _go32_dpmi_registers r;
        memset(&r, 0, sizeof(r));
        r.x.ax= 0x6500;
        _go32_dpmi_simulate_int(0x65, &r);
        if( r.x.ax== 0x0065 )
        {
            dosmemget( r.d.ebx, sizeof(CheckStr), CheckStr );
            if( strcmp(GppHlpStr, CheckStr)== 0 )
                TestHugeRead();
            else
                Res= 3;
        }
        else
            Res= 2;
        
        // Something answered on INT 0x65, but it was not GppHlp
        if( Res!= 0 )
        {
            cerr << "Error: GppHlp was not installed on INT 0x65.\n";
            cerr << "       But something else was.....\n";
        }
    }
    else {
        cerr << "Error: The GppHlp interrupt vector was not set (0000:0000)\n";
        Res= 1;
    }
    
    return Res;
}
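
The TSR listed in the next section documents its AX=213Fh result as a 
byte count in EAX, or a negated DOS error code if the value is 
negative. A caller-side helper along the following lines could 
separate the two cases. This is only a sketch: HugeReadResult and 
decode_eax() are hypothetical names and not part of the program above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type, for illustration only.
struct HugeReadResult {
    bool ok;              // true if EAX held a byte count
    std::uint32_t bytes;  // bytes read when ok
    int dos_error;        // DOS error code (as from INT 21h AH=3Fh) when !ok
};

// Interpret the raw 32-bit EAX value returned by the AX=213Fh service.
HugeReadResult decode_eax(std::uint32_t eax) {
    HugeReadResult r = {true, 0, 0};
    if (eax & 0x80000000u) {
        // Sign bit set: the value is -(error code) in two's complement.
        r.ok = false;
        r.dos_error = static_cast<int>(0u - eax);
    } else {
        r.bytes = eax;
    }
    return r;
}
```

For instance, decode_eax(0xFFFFFFFA) reports DOS error 6 (invalid 
handle), while decode_eax(458752) reports a plain byte count.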

The assembler TSR:
------------------
You need Borland's TASM assembler to assemble this. Please allow the 
assembler to do multipass assembling (the /M# switch) so that it can 
resolve the conditional jumps going further than 127 bytes.

Remember to install (execute) the TSR before running the C++ program or 
before running Windows.

        IDEAL
        P386N           ; Allow the use of 386 instructions
        JUMPS           ; Resolve conditional jumps going further than 
                        ; 127 bytes
        
; Defines
INTHANDLE = 65h         ; The interrupt used for HLP<=>GPP communication
WRONGINT = 1            ; Error code returned on error when int is used

SEGMENT DSEG    WORD 'DATA'

; Important values
MinMem  DW      ?               ; Resident size, in paragraphs
PSP     DW      ?               ; Segment of the PSP (stored at startup)

ENDS    DSEG

SEGMENT INTSEG PARA 'CODE'
        ASSUME  cs:INTSEG
        ASSUME  ds:NOTHING,es:NOTHING

GppHlpStr DB    "GPPHLP",0

;
; Entry point of the interrupt service
; ------------------------------------
; Available services:
;       AX= 6500h       - Existence check
;           Call this function to check if the interrupt is installed by
;           this program. First check the contents of AX and if ok then
;           compare the string pointed to by EBX.
;           Return: AL= 65h
;                   AH= 0       
;                   EBX= Linear address of "GPPHLP" string.
;       AX= 213Fh       - Huge read from file or device
;           Like dos function 3Fh, except that it is able to read into a
;           huge buffer (larger than 64Kb-1).
;           BX= File handle
;           ECX= Number of bytes to read
;           DS:DX= Real mode (segment:offset) address of buffer
;           Return: EAX= Number of bytes read or error code if the value
;                        is negative (take the ABS() and you have the 
;                        dos error code as returned by int 21h AH=3Fh).
;           Destroyed: EDI ESI, the rest is preserved.
;
PROC    RealModeHlp FAR
        ; Make primary function selection
        cmp     ah,21h
        je      @@Dos
        cmp     ah,65h
        je      @@GppHlp
        jmp     @@Chain

@@GppHlp:
        ;-- GppHlp specific functions --
        cmp     al,0
        jne     @@Chain
        
        ; Return information that says "alive and well".
        xchg    al,ah
        xor     ebx,ebx         ; Clear
        mov     bx,cs           
        shl     ebx,4                   ; Calculate the 
        add     ebx,OFFSET GppHlpStr    ; linear address
        jmp     @@End
        
@@Dos:
        ;-- Dos functions extended by this TSR --
        push    OFFSET @@End    ; Enforce a near return frame to the 
                                ; "one point out" exit.
        cmp     al,3Fh
        je      Dos3F
        ;-- Add additional dos extensions here --
        
        ; Not a valid dos extension -
        ; remove the not needed return address and exit this TSR
        add     sp,2
        jmp     @@Chain

@@Chain:
        ; If coded correctly this would include a call (or jump) to the next
        ; int handler in the chain (the one installed prior to this TSR).
        iret
        
@@End:
        ; We enforce all returns from valid functions to go through this label,
        ; so that any general cleanup can be done at this point.
        iret
ENDP    RealModeHlp

;
; Huge read from file or device, using a handle:
; -----------------------------------------------
PROC    Dos3F   NEAR
        ASSUME  ds:NOTHING,es:NOTHING
        push    DS ES
        
        xor     eax,eax         ; We havent read any bytes yet
        or      ecx,ecx
        jz      @@End   
        
@@Read:
        push    eax ecx
        push    bx dx ds        ; We don't rely on DOS preserving these
        cmp     ecx,8000h
        jb      @@Less32Kb
        mov     cx,8000h
@@Less32Kb:
        mov     ah,3Fh
        int     21h
        movzx   edi,ax          ; Store the read result
        pop     ds dx bx
        pop     ecx eax
        jc      @@Error

        or      di,di
        jz      @@End           ; If nothing read then return
        add     eax,edi         ; The new read total
        sub     ecx,edi         ; decrement the counter
        jz      @@End           ; ..and return if it reaches zero
        mov     di,ds
        add     di,800h
        mov     ds,di           ; Move the buffer pointer (DS:DX) 32Kb
        jmp     @@Read

@@Error:
        xor     eax,eax
        sub     eax,edi         ; The error return code
@@End:
        pop     es ds
        ret
        
@@ToEOF:
        add     ecx,8000h
        jmp     @@Read
ENDP    Dos3F

ENDS    INTSEG

; **************************************************************************
; The segments past this point won't be valid after the program goes resident.
; I.e. all the installation code and the stack is thrown away and only the parts
; needed to execute the GPP requests will remain in memory.
;

SEGMENT CSEG PARA 'CODE'
        ASSUME  cs:CSEG
        ASSUME  ds:DSEG,es:NOTHING

start:
        mov     ax,DSEG
        mov     ds,ax           ; Initialize the data segment
        mov     [PSP],es        ; The PSP address

        ; Calculate how much memory is used by this program
        mov     ax,es           ; Start of the program
        mov     bx,CSEG         ; End of last segment used
        sub     bx,ax
        inc     bx
        mov     [MinMem],bx     ; Memory needed, in paragraphs
        
        ; Release memory not used by this program
        mov     es,[es:2Ch]      ; Segment of environment string
        mov     ah,49h
        int     21h
        
        ; Make sure the interrupt vector we want to use isn't taken already
        mov     ah,35h
        mov     al,INTHANDLE
        int     21h
        mov     ax,es
        or      ax,bx
        jnz     @@IntError
        
        ; Install our interrupt handler
        ; -- Please notice that no chaining is done, this should be improved
        ; -- if you plan to use this facility in the future.
        ; -- This is NOT the right way. The services should be installed
        ; -- using the Multiplex interrupt (INT 2Fh).
        push    ds
        mov     dx,INTSEG
        mov     ds,dx
        mov     dx,OFFSET RealModeHlp
        mov     ah,25h
        mov     al,INTHANDLE
        int     21h
        pop     ds

        ; Terminate but stay resident
        mov     ax,3100h        ; TSR command and 0 as return code
        mov     dx,[MinMem]
        int     21h

@@IntErrorMsg:
        DB      "Error installing GPPHLP.EXE.",13,10
        DB      "The needed interrupt is used by someone else...",13,10,'$'

@@IntError:
        ; Display error message and return to dos
        push    ds
        mov     ax,cs
        mov     ds,ax
        mov     dx,OFFSET @@IntErrorMsg
        mov     ah,9
        int     21h
        pop     ds
        
@@Exit:
        mov     ah,4Ch
        mov     al,WRONGINT
        int     21h

ENDS    CSEG

SEGMENT SSEG PARA STACK 'STACK'
        ; 160 bytes (50h words) of stack should be enough.
        ; Notice that the stack is only used during initialization. The stack
        ; that is used when called from a GPP program is supplied by the
        ; dos extender so this stack gets obsolete as soon as the program
        ; goes resident.
        DW      50h DUP (?)             
ENDS    SSEG

        END     start
