From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
Message-Id: <10110150338.AA16562@clio.rice.edu>
Subject: Re: W2K/XP fncase
To: eliz AT is DOT elta DOT co DOT il
Date: Sun, 14 Oct 2001 22:38:21 -0500 (CDT)
Cc: djgpp-workers AT delorie DOT com
In-Reply-To: <7263-Sun14Oct2001200248+0200-eliz@is.elta.co.il> from "Eli Zaretskii" at Oct 14, 2001 08:02:49 PM
X-Mailer: ELM [version 2.5 PL2]
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

To further clarify what I was thinking about, attached is a prototype
which shows an example. I used the test:

  id83 /.../*

which globs all files on the disk and then tests the names. I'd be
interested if anyone who runs it on a non-W2K/XP system finds any
differences. It takes a long time on big disks just to complete the glob.

The test program compares the behavior to the strcmp(_lfn_gen_...) combo.
On my Win95 system it found one difference, a DOS game copy protection
file which contains a space. On my old DOS system (from Compaq DOS 3.31...
now under W95) it found one file (which also came with a game) which
included a repeated graphic (8-bit) character (but no alphabetic
characters, so it wouldn't have been different if lowercased anyway).

On the development system running Win2K, it found something like 30,000
differences, since _lfn_gen_short_fname is broken there. This means that
around 40% of the files on my system are really short names and would not
be properly downcased unless we fixed this (this is a high value because
the system evolved from a DOS development system).
----------------------------------prototype---id83.c-------------------------
#include <string.h>   /* strlen, strcmp */
#include <fcntl.h>    /* _lfn_gen_short_fname */

/* Return 1 if fname is a plain upper-case DOS 8.3 name, 0 otherwise. */
char _is_DOS83(const char *fname)
{
  const char *s = fname;
  const char *e;
  char c, period_seen;

  if(*s == '.')                 /* starting period invalid */
    return 0;
  period_seen = 0;
  e = s + 8;                    /* end */
  while ((c = *s++))
    if (c == '.') {
      if(period_seen)
        return 0;               /* multiple periods invalid */
      period_seen = 1;
      e = s + 3;                /* already one past period */
    } else if (s > e)
      return 0;                 /* name component too long */
    else if (c >= 'a' && c <= 'z')
      return 0;                 /* lower case character */
    else if (c == '+' || c == ',' || c == ';' ||
             c == '=' || c == '[' || c == ']')
      return 0;                 /* special non-DOS characters */
  return 1;                     /* all chars OK */
}

#ifdef TEST
#include <stdio.h>    /* printf */
#include <crt0.h>     /* _CRT0_FLAG_PRESERVE_FILENAME_CASE */

int _crt0_startup_flags = _CRT0_FLAG_PRESERVE_FILENAME_CASE; /* for glob */

#define MAXDISPLAY 10

/* Example test usage: id83 * (or a file name, or ... can test whole disk) */

int main (int argc, char *argv[])
{
  char old, new, dif;
  char sh[14];
  char *f;
  int i,j,nd;

  nd = 0;
  for(i=1;i<argc;i++) {
    f = argv[i];
    for(j=strlen(f)-1; j >= 0; j--)     /* Trim path */
      if(*(f+j) == '/') {
        f = f + j + 1;
        break;
      }
    old = !strcmp(_lfn_gen_short_fname(f, sh), f);
    new = _is_DOS83(f);
    dif = (old != new);
    if(dif)
      nd++;
    if(i == MAXDISPLAY)
      printf("Remaining test results suppressed unless different\n");
    if(dif || i < MAXDISPLAY)
      printf ("Orig: %s isDOS: %d old: %d\n", f, new, old);
  }
  if(i >= MAXDISPLAY)           /* i is argc here, so i-1 names were tested */
    printf("%d names processed, %d differences\n",i-1,nd);
  return 0;
}
#endif
-----------------------------------------------------------------------

Further testing on DH=1 for 71A8 on 2K/XP found some potential
workarounds, then several more new problems. In a nutshell, if the first
part of the file name has 1 or 2 characters, you get bogus results. It
will usually give correct values for 3, 4, 5 or 6 characters. It will
always truncate characters 7 and 8, so LICENSE.TXT becomes LICENS.TXT.
Yes, it's hopeless. Yes, it truncates to 6.3 for DH=0 also.
I think the NT code must have been written by an old Digital RT-11
programmer since their file system was 6.3...

I don't see any way to fix this other than:

1) Document it's trash and advise no one to use it, or
2) Insert emulation code if os=2K/XP inside it.
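
To make option 2 a bit more concrete, here is a minimal, portable sketch
of the kind of dispatch it might involve. Everything in it is an
assumption for illustration, not libc code: name_is_83, the on_w2k_xp
flag, the gen_short_fn callback type and the truncating_gen stub are all
hypothetical names. is_dos83 applies the same rules as the prototype's
_is_DOS83 but counts characters instead of comparing pointers, and
truncating_gen models only the 6.3 truncation symptom described above,
not the bogus results for 1 or 2 character names.

```c
#include <string.h>

/* Same acceptance rules as the prototype's _is_DOS83. */
static int is_dos83(const char *fname)
{
    int len = 0;            /* characters seen in the current component */
    int limit = 8;          /* 8 for the base name, 3 for the extension */
    int period_seen = 0;
    const char *s;

    if (*fname == '.')
        return 0;                       /* starting period invalid */
    for (s = fname; *s; s++) {
        char c = *s;
        if (c == '.') {
            if (period_seen)
                return 0;               /* multiple periods invalid */
            period_seen = 1;
            len = 0;
            limit = 3;
        } else if (++len > limit)
            return 0;                   /* name component too long */
        else if (c >= 'a' && c <= 'z')
            return 0;                   /* lower case character */
        else if (strchr("+,;=[]", c))
            return 0;                   /* special non-DOS characters */
    }
    return 1;
}

/* Hypothetical callback type with the shape of _lfn_gen_short_fname. */
typedef char *(*gen_short_fn)(const char *long_name, char *short_name);

/* Hypothetical stub: models only the reported 6.3 truncation. */
char *truncating_gen(const char *long_name, char *short_name)
{
    const char *dot = strchr(long_name, '.');
    size_t base = dot ? (size_t)(dot - long_name) : strlen(long_name);
    size_t ext = dot ? strlen(dot) : 0;     /* length including the '.' */

    if (base > 6) base = 6;
    if (ext > 4) ext = 4;
    memcpy(short_name, long_name, base);
    if (dot)
        memcpy(short_name + base, dot, ext);
    short_name[base + ext] = '\0';
    return short_name;
}

/* Hypothetical dispatch: skip the DOS call when it cannot be trusted. */
int name_is_83(const char *fname, int on_w2k_xp, gen_short_fn gen)
{
    char sh[14];

    if (on_w2k_xp || gen == NULL)
        return is_dos83(fname);             /* emulated check */
    return strcmp(gen(fname, sh), fname) == 0;
}
```

With the stub standing in for the broken call, name_is_83("LICENSE.TXT",
0, truncating_gen) reports the name as not 8.3, while the emulated path
name_is_83("LICENSE.TXT", 1, NULL) accepts it.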