Date: Thu, 25 May 1995 12:27:27 +1000 (EST)
From: "Junaid A. Walker"
Subject: Re: Compiling gdb, dpmi DS:VRAM hack.
To: Charles Sandmann
Cc: mat AT ardi DOT com, DJGPP Mailing List

On Wed, 24 May 1995, Charles Sandmann wrote:

> Forwarded message:
> To: junaid AT barney DOT eng DOT monash DOT edu DOT AU (Junaid A. Walker)
> Date: Tue, 23 May 1995 19:49:16 -0500 (CDT)
> Cc: djgpp AT sun DOT soe DOT clarkson DOT edu (DJGPP Mailing List)
>
> > Finally here's a hack to make conventional memory accessible
> > to DS under dpmi - Sure hope that the V2.0 cswdpmi emulator supports
> > this hack.... use 'gcc f.c -lpc' to compile.
>
> Yes, this should work just fine under cwsdpmi or V2 as it works under
> Win 3.1 *BUT*:
>
> 1) This code will work in certain circumstances but not others. The
> algorithm is bugged in that it assumes the base of DS will not change.
> This may not be true if you call malloc() or any libc routine which
> calls malloc(). You must recompute the address if sbrk() gets called,
> and reset the limit.

Quite wrong; why don't you try it? sbrk() and brk() modify only the size
of DS. So the remapped VRAM address *doesn't* need to be recomputed, but
the fat-DS must be reloaded.

As to any system call modifying the physical base address of DS, how can
this be so? All global variable accesses would fail to work, unless the
memory image were 'magically' patched.

Along the same lines, dj mentioned that he cannot guarantee that the
selector in the go32 conventional memory selector field will be constant
for the life of the process. This surely must be a dubious statement. If
we assume that the OS magically modifies (ie remaps) the location of
conventional memory in the process's virtual address space, how can the
process ever modify conv mem? Surely if the process can have no coherent
understanding of where conv mem is, then only an OS call can accomplish
access. There is no such DPMI call (even in v1.0), hence this behaviour
is impossible.

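To make the point about the base concrete: the remapped VRAM pointer
depends only on the linear base of DS, never on its limit. The sketch
below is plain C and is not part of the posted program; vram_offset(),
linear_address() and within_limit() are hypothetical helper names, and
the base value used in the note after it is an arbitrary example.

```c
#include <stdint.h>

/* Hypothetical helper: the offset to use through a DS whose linear base
   is ds_base so that the access lands on linear address 0xa0000.
   Unsigned arithmetic wraps modulo 2^32, which is the whole trick. */
uint32_t vram_offset(uint32_t ds_base)
{
    return (uint32_t)(0xa0000u - ds_base);
}

/* Linear address generated for DS:offset, with 32-bit wrap-around. */
uint32_t linear_address(uint32_t ds_base, uint32_t offset)
{
    return (uint32_t)(ds_base + offset);
}

/* Hypothetical model of the limit check for a one-byte access:
   the offset must not exceed the segment limit. */
int within_limit(uint32_t offset, uint32_t limit)
{
    return offset <= limit;
}
```

With an assumed base of 0x10000000, vram_offset() gives 0xF00A0000, and
adding that offset to the base wraps around to 0xa0000. The offset only
passes the limit check once the limit has been raised to 0xFFFFFFFF;
changing the limit alone never changes the offset.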
Hypothetically, one would need a callback structure to communicate with
the OS about such 'asynchronous' remapping events, and to disable
remapping for the duration of conv mem access. Bizarre...and not part of
DPMI (or any sensible OS including Linux, Win, OS/2 etc). And why bother?
If a process has privileges to access conv mem, then those rights should
be maintained for the life of the process, or until the process
relinquishes them. Neither is there any reason to 'recycle' virtual
address space...4Gbyte is plenty. I don't think any OSes use the 4Tbyte
model, which might mean that unmapped memory is free to float around.
Maybe in 50 years when we all have 4GigaBytes of RAM and 4TeraBytes of
disk. But then you and I will be no more, and DOS.....well, who will have
heard of DOS or the 80x86?

BTW the hack seems to work with OS/2, qdpmi101 (a wonder anything works
under qdpmi), and most probably everything else. Would people please test
the program under different DPMI providers? Eg 386Max, Win NT, OS/2 2.x,
OS/2 Warp, Linux DOS emulator, etc.

Presumably all dpmi providers add 1 to the segment limit to get the
segment size, and with 32-bit ints this invariably wraps around to 0,
which is permissible (=>disable segment access?). So the hack is going to
work for a long time to come, provided dpmi servers don't get smart and
test for a segment limit of -1.

> 2) Not all DPMI providers allow you to set limits which span memory you
> do not own. In this case this algorithm does not work at all.

Correct, a segment limit of -2 doesn't work, see above. Even so, there is
still probably an OS dependent way to get these privileges.

> 3) As noted, this bypassses all memory protection under Windows, so a
> bad pointer will crash Win instead of giving a GPF traceback.
>
>

Depends; you really need to try accessing a NULL ptr etc. If these
regions aren't mapped to any virtual memory, then a GPF will still occur.
I'm not really sure on this point (since I haven't programmed Windows and
friends!).

If we want to hack around this, we could probably fiddle with the debug
registers to protect the restricted regions:

0x00000000=======VRAM=====CODE====DATA=======0xffffffff
|_______________| |___________|       |_______________|
    region 1        region 2              region 3

Ie three regions (two if the debug limits wrap around!). So we still have
at least one debug break point to do hardware debugging with. There are
also dpmi calls to roll your own segments...but I don't think these would
be of help here. It would probably still be simpler and more 'portable'
(hmm? DOS, ...nahh!!) to use the normal DS, and load the fat-DS just for
the duration of conv mem access.

Thanks to everyone that has replied to me, hope this hack is useful to
you. Just for those of you who might have missed out....

Junaid Walker.

CUT HERE -----------------------------------------------
/*
   Well here it is; how to access video ram at 0xa0000 without using a
   segment override (ie using the current process's DS).

   Works well under MS Windows 3.1 in a full screen DOS box. Try running
   this app in the background (DOS settings 'Runs in Background' crossed).
   Then switch away from this app (CTRL-ESC) and run any other program.
   Windows will remap the video ram at 0xa0000 to a temporary system ram
   buffer while this app runs minimized; hence this app will run faster
   when minimized. The screen will be correctly restored when we switch
   back to this app.

   This program works by removing the segment limit on DS and using 32-bit
   wrap around in the linear address space to access memory at linear
   address 0xa0000. In fact we now have access to all memory including DOS
   conventional memory. This circumvents all memory protection (except
   unmapped memory), so best to save the original DS for normal use, and
   use the new unlimited DS for the duration that video memory access is
   needed.

   Compile with 'gcc prog.c -lpc' and then 'go32 prog'.
   You should see some blue lines move down the screen for a while; after
   that it will display the 16-bit speed of your video card.

   Al-Junaid Walker 22/5/95.
*/

/* NOTE: the header names were stripped from the archived copy of this
   message; the nine below are inferred from the functions used. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <conio.h>
#include <dos.h>
#include <pc.h>
#include <go32.h>
#include <dpmi.h>

#define LINEAR          /*use conventional mem descriptor mapped into linear memory*/
                        /*otherwise do it the recommended go32 way*/
#define BLIT_SIZE 32000U /*# bytes to blit to screen each iteration*/

/* pal.bin binary file converted by BIN2C */
unsigned char palette[] = {
    0,0,0,0,0,0,0,0,1,0,0,3,0,0,4,0,
    0,6,0,0,8,0,0,9,0,0,11,0,0,12,0,0,
    14,0,0,16,0,0,17,0,0,19,0,0,20,0,0,22,
    0,0,24,0,0,25,0,0,27,0,0,29,0,0,30,0,
    0,32,0,0,33,0,0,35,0,0,37,0,0,38,0,0,
    40,0,0,41,0,0,43,0,0,45,0,0,46,0,0,48,
    0,0,50,0,1,50,0,3,50,0,4,51,0,6,51,0,
    7,52,0,9,52,0,10,52,0,12,53,0,14,53,0,15,
    54,0,17,54,0,18,54,0,20,55,0,21,55,0,23,56,
    0,25,56,0,26,56,0,28,57,0,29,57,0,31,58,0,
    32,58,0,34,58,0,35,59,0,37,59,0,39,60,0,40,
    60,0,42,60,0,43,61,0,45,61,0,46,62,0,48,62,
    0,50,63,1,50,63,3,50,63,5,51,63,7,51,63,9,
    52,63,11,52,63,13,52,63,15,53,63,17,53,63,19,54,
    63,21,54,63,23,54,63,25,55,63,27,55,63,29,56,63,
    31,56,63,33,56,63,35,57,63,37,57,63,39,58,63,41,
    58,63,43,58,63,45,59,63,47,59,63,49,60,63,51,60,
    63,53,60,63,55,61,63,57,61,63,59,62,63,61,62,63,
    63,63,63,62,63,63,61,63,63,60,63,63,58,63,63,57,
    63,63,56,63,63,54,63,63,53,63,63,52,63,63,50,63,
    63,49,63,63,48,63,63,46,63,63,45,63,63,44,63,63,
    42,63,63,41,63,63,40,63,63,39,63,63,37,63,63,36,
    63,63,35,63,63,33,63,63,32,63,63,31,63,63,29,63,
    63,28,63,63,27,63,63,25,63,63,24,63,63,23,63,63,
    21,63,63,20,63,63,19,63,63,18,63,63,16,63,63,15,
    63,63,14,63,63,12,63,63,11,63,63,10,63,63,8,63,
    63,7,63,63,6,63,63,4,63,63,3,63,63,2,63,63,
    0,63,63,0,62,63,0,61,63,0,60,63,0,58,63,0,
    57,63,0,56,63,0,54,63,0,53,63,0,52,63,0,50,
    63,0,49,63,0,48,63,0,46,63,0,45,63,0,44,63,
    0,42,63,0,41,63,0,40,63,0,39,63,0,37,63,0,
    36,63,0,35,63,0,33,63,0,32,63,0,31,63,0,29,
    63,0,28,63,0,27,63,0,25,63,0,24,63,0,23,63,
    0,21,63,0,20,63,0,19,63,0,18,63,0,16,63,0,
    15,63,0,14,63,0,12,63,0,11,63,0,10,63,0,8,
    63,0,7,63,0,6,63,0,4,63,0,3,63,0,2,63,
    0,0,63,0,0,62,0,0,61,0,0,60,0,0,59,0,
    0,58,0,0,57,0,0,56,0,0,55,0,0,54,0,0,
    53,0,0,52,0,0,51,0,0,50,0,0,49,0,0,48,
    0,0,47,0,0,46,0,0,45,0,0,44,0,0,43,0,
    0,42,0,0,41,0,0,40,0,0,39,0,0,38,0,0,
    37,0,0,36,0,0,35,0,0,34,0,0,33,0,0,32,
    0,0,31,0,0,30,0,0,29,0,0,28,0,0,27,0,
    0,26,0,0,25,0,0,24,0,0,23,0,0,22,0,0,
    21,0,0,20,0,0,19,0,0,18,0,0,17,0,0,16,
    0,0,15,0,0,14,0,0,13,0,0,12,0,0,11,0,
    0,10,0,0,9,0,0,8,0,0,7,0,0,6,0,0,
    5,0,0,4,0,0,3,0,0,2,0,0,0,0,0,0,
};

void Sync(void)         /*wait for the start of the next vertical retrace*/
{
    while (inportb(0x3da) & 8);     /*wait for any current retrace to end*/
    while (!(inportb(0x3da) & 8));  /*wait for the next retrace to begin*/
}

void SetCols(void)      /*load all 256 palette entries into the VGA DAC*/
{
    int c;

    outportb(0x3c8,0);
    for (c = 0; c < 256*3; c++)
        outportb(0x3c9,palette[c]);
}

static union REGS regs;
static unsigned char *VRAM;
static unsigned short ds_desc;
static unsigned char frame_buf[64000];

void text_mode(void)    /*return to a cleared text mode screen*/
{
    regs.x.ax = 0x3;    /*back to text mode*/
    int86(0x10,&regs,&regs);
}

void barf(char *msg)    /*ERROR MESSAGE WHILE IN GRAPHICS MODE*/
{
    text_mode();
    cprintf("\r\n%s\r\n", msg);
    exit(1);
}

void winge(char *msg)   /*ERROR MESSAGE WHILE IN TEXT MODE*/
{
    cprintf("\r\n%s\r\n", msg);
    exit(1);
}

static unsigned get_limit(void)     /*get segment limit of %ds*/
{
    unsigned l;

    /*volatile: the limit changes between calls, so never reuse a result*/
    __asm__ __volatile__(
        "movl $0,%%eax\n\t"         /*zero out high word*/
        "movw %%ds,%%ax\n\t"        /*get %ds into low word of %eax*/
        "lsll %%eax,%%ebx\n\t"      /*get segment limit of %ds into %ebx*/
        "jz 1f\n\t"                 /*lsl successful*/
        "movl $0,%%ebx\n\t"         /*lsl unsuccessful => return bad limit of 0*/
        "1:\n\t"
        : "=b" (l)
        : /*no input*/
        : "ax", "cc"                /*%ebx is the output, so it is not listed as a clobber*/
    );
    return l;
}

void RAM_TO_VRAM(unsigned char *dest, const unsigned char *src, int n)
/*Write _n bytes (word aligned) to VRAM address _dest, from system RAM _src.*/
{
    /*The three registers are modified by the string instructions, so they
      are read-write operands rather than inputs plus clobbers.*/
    __asm__ __volatile__(
        "cld\n\t"
        "jecxz 2f\n\t"              /*nothing to copy*/
        "testw $1,%%di\n\t"         /*_dest word unaligned?*/
        "je 1f\n\t"                 /*even so word aligned*/
        "movsb\n\t"                 /*odd, so move a byte to make _dest even and word aligned*/
        "decl %%ecx\n\t"            /*update count*/
        "je 2f\n\t"                 /*no more to do*/
        "1:\n\t"                    /*_dest is now aligned*/
        "shrl $1,%%ecx\n\t"         /*get a word count, carry=1 if a byte left over*/
        "rep ; movsw\n\t"           /*blit as many words as possible*/
        "adcl %%ecx,%%ecx\n\t"      /*get the left over byte*/
        "rep ; movsb\n\t"           /*move the left over byte (if any)*/
        "2:\n\t"
        : "+c" (n), "+D" (dest), "+S" (src)
        : /*no input*/
        : "memory", "cc");
}

int main(void)
{
    unsigned duration, ms, blits, vram_off;

    /* HERE'S A LITTLE CODE FRAGMENT OFF USENET LONG AGO...

       mov cx,1                ; allocate one descriptor
       mov ax,0                ; function to allocate descriptor
       int 31h                 ; request dpmi services
       jc  DPMI_Error          ; function failed?
       mov bx,ax               ; move descriptor to bx
       xor cx,cx               ;
       xor dx,dx               ; set descriptor base to 0x00000000
       mov ax,7                ; DPMI set descriptor base function
       int 31h                 ; call dpmi
       jc  DPMI_Error
       mov cx,-1               ;
       mov dx,-1               ; set descriptor range to 0xffffffff
       mov ax,8                ; set descriptor range function
       int 31h                 ; call dpmi
       jc  DPMI_Error
       xor ch,ch               ;
       mov cl,146              ; access = Physical|Privilege|Data|Write
       mov ax,9                ; set descriptor access rights function
       int 31h                 ; call dpmi
       mov linearDiscriptor,bx ; save for later use
    */

    cprintf("\r\ninitial ds segment limit=%08X\r\n", get_limit());

#ifdef LINEAR
    if(_go32_info_block.run_mode == _GO32_RUN_MODE_DPMI)
    {
        cprintf("\r\nUSING DPMI. DPMI VERSION=%X\r\n",
                (unsigned)(_go32_info_block.run_mode_info));
        ds_desc=_go32_my_ds();

        /*figure out where DS is in linear memory*/
        regs.x.ax = 0x6;        /*get descriptor linear base address*/
        regs.x.bx = ds_desc;    /*move descriptor to bx*/
        int86(0x31, &regs, &regs);
        if(regs.x.flags & 1)
            winge("Couldn't get linear base address of ds_desc");
        cprintf("\r\nCurrent DS linear base address=%04X%04X\r\n",
                (unsigned)(regs.x.cx), (unsigned)(regs.x.dx));

        /*offset that wraps around to linear address 0xa0000*/
        VRAM=(unsigned char *)( 0xa0000 -
              ( ((unsigned)(regs.x.cx) <<16) + (unsigned)(regs.x.dx) ) );
        cprintf("\r\nRemapped Video RAM process virtual address=%08X\r\n",
                (unsigned)(VRAM));

        regs.x.ax = 0x8;        /*set descriptor range function*/
        regs.x.bx = ds_desc;    /*move descriptor to bx*/
        regs.x.cx = -1;         /*MSB range*/
        regs.x.dx = -1;         /*LSB range*/
        /*Windows seems to allow the huge segment limit of -1(=0xFFFFFFFF).
          This would obviously lead to memory protection violations
          with other processes. Windows probably thinks that
          0xFFFFFFFF +1 = 0 and silently allows this huge limit.
          Thank god for bugs!
        */
        int86(0x31, &regs, &regs);
        if(regs.x.flags & 1)
            winge("Couldn't set descriptor range of ds_desc");
        cprintf("\r\nNew DS linear address limit=%08X\r\n", get_limit());
    }
    else
    {
        /*use non-dpmi go32 remapping of 0xa0000 */
        VRAM=(unsigned char *)0xd0000000;
    }
#endif

    cprintf("\r\nPRESS ANY KEY TO CONTINUE\r\n");
    getkey();               /*wait for keypress*/

    regs.x.ax = 0x13;       /*mode 13 graphics*/
    int86(0x10,&regs,&regs);
    SetCols();              /*set the palette*/

    for(blits=0; blits<64000; blits++)
        frame_buf[blits]=blits;

    blits=vram_off=0;
    duration=clock();       /*start the timer*/
    while(/*!kbhit()*/ blits<1000)
    {
#ifdef LINEAR
        RAM_TO_VRAM(VRAM+vram_off, frame_buf, BLIT_SIZE);
#else
        movedata(_go32_my_ds(), (unsigned)frame_buf,
                 _go32_conventional_mem_selector(), 0xa0000+vram_off,
                 BLIT_SIZE);    /*go32 way of moving memory to VRAM in dpmi*/
#endif
        vram_off=(vram_off+65U) & 16383U;
        blits++;
    }
    duration=clock()-duration;  /*how long did we spend blitting to screen?*/

    /*Convert ticks to ms by multiplying first; dividing by
      (CLOCKS_PER_SEC/1000U) would divide by zero whenever
      CLOCKS_PER_SEC is below 1000.*/
    ms=(unsigned)((unsigned long long)duration * 1000U / CLOCKS_PER_SEC);
    if(ms == 0)
        ms=1;               /*too fast to measure; avoid dividing by zero*/

    text_mode();
    cprintf("\r\n%u blits took %ums. Blit rate=%uKBytes/s\r\n",
            blits, ms, blits*BLIT_SIZE / ms);
    return 0;
}
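
The copying strategy in RAM_TO_VRAM() above (one byte to reach an even
destination address, then whole words, then a possible trailing byte) can
be sketched in portable C. This is an illustration only:
copy_word_aligned() is a hypothetical name that does not appear in the
posted program, and the sketch makes no claim to match the speed of the
rep movsw version.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable counterpart of RAM_TO_VRAM(): copy n bytes so
   that every 16-bit store goes to an even destination address. */
void copy_word_aligned(unsigned char *dest, const unsigned char *src, size_t n)
{
    size_t words;

    if (n == 0)
        return;                     /* nothing to copy */

    if ((uintptr_t)dest & 1u) {     /* odd destination: move one byte first */
        *dest++ = *src++;
        n--;
    }

    words = n >> 1;                 /* number of whole 16-bit words */
    while (words--) {
        memcpy(dest, src, 2);       /* 2-byte copy, safe for an unaligned src */
        dest += 2;
        src += 2;
    }

    if (n & 1u)                     /* odd count: one byte is left over */
        *dest = *src;
}
```

The destination, not the source, is the pointer that gets aligned,
because the writes to video memory are the slow part.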