Mail Archives: djgpp-workers/2001/10/14/23:43:14

From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
Message-Id: <10110150338.AA16562@clio.rice.edu>
Subject: Re: W2K/XP fncase
To: eliz AT is DOT elta DOT co DOT il
Date: Sun, 14 Oct 2001 22:38:21 -0500 (CDT)
Cc: djgpp-workers AT delorie DOT com
In-Reply-To: <7263-Sun14Oct2001200248+0200-eliz@is.elta.co.il> from "Eli Zaretskii" at Oct 14, 2001 08:02:49 PM
X-Mailer: ELM [version 2.5 PL2]
Mime-Version: 1.0
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

To further clarify what I was thinking about, attached is a prototype
which shows an example.  I used the test:
id83 /.../*
which globs all files on the disk and then tests the names.  I'd be
interested if anyone who runs it on a non-W2K/XP system finds any
differences.  It takes a long time on big disks just to complete the
glob.

The test program compares the behavior to the strcmp(_lfn_gen_...) combo.
On my Win95 system it found one difference, a DOS game copy protection
file which contains a space.

On my old DOS system (from Compaq DOS 3.31... now under W95) it found
one file (which also came with a game) whose name included a repeated
graphic (8-bit) character (but no letters, so the name wouldn't have
been different if lowercased anyway).

On the development system running Win2K, it found something like 30,000
differences, since _lfn_gen_short_fname is broken there.  This means that
around 40% of the files on my system really have short names and would
not be properly downcased unless we fixed this (this is a high value
because the system evolved from a DOS development system).

----------------------------------prototype---id83.c-------------------------
#include <libc/stubs.h>
#include <fcntl.h>

char _is_DOS83(const char *fname)
{
  const char *s = fname;
  const char *e;
  char c, period_seen;

  if(*s == '.')			/* starting period invalid */
    return 0;

  period_seen = 0;
  e = s + 8;			/* end */

  while ((c = *s++))
    if (c == '.') {
      if(period_seen)
        return 0;		/* multiple periods invalid */
      period_seen = 1;
      e = s + 3;		/* already one past period */
    } else if (s > e)
      return 0;			/* name component too long */
    else if (c >= 'a' && c <= 'z')
      return 0;			/* lower case character */
    else if (c == '+' || c == ',' || c == ';' || 
             c == '=' || c == '[' || c == ']')
      return 0;			/* special non-DOS characters */

  return 1;			/* all chars OK */
}

#ifdef TEST
#include <stdio.h>
#include <string.h>		/* strlen, strcmp */
#include <crt0.h>

int _crt0_startup_flags = _CRT0_FLAG_PRESERVE_FILENAME_CASE; /* for glob */

#define MAXDISPLAY 10

/* Example test usage: id83 * (or a file name, or ... can test whole disk) */

int main (int argc, char *argv[])
{
  char old, new, dif;
  char sh[14];
  char *f;
  int i,j,nd;
  nd = 0;
  for(i=1;i<argc;i++) {
    f = argv[i];
    for(j=strlen(f); j >= 0; j--)	/* Trim path */
      if(*(f+j) == '/') {
        f = f + j + 1;
        break;
      }
    old = !strcmp(_lfn_gen_short_fname(f, sh), f);
    new = _is_DOS83(f);
    dif = (old != new);
    if(dif)
      nd++;
    if(i == MAXDISPLAY)
      printf("Remaining test results suppressed unless different\n");
    if(dif || i < MAXDISPLAY)
      printf ("Orig:  %s isDOS: %d old: %d\n", f, new, old);
  }
  if(i >= MAXDISPLAY)		/* i == argc here, so i-1 names were tested */
    printf("%d names processed, %d differences\n",i-1,nd);
  return 0;
}

#endif

-----------------------------------------------------------------------
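To show how a predicate like this might feed the downcasing decision
discussed above, here is a minimal portable sketch.  It is not libc code:
is_dos83_sketch re-expresses the prototype's checks with a counter instead
of pointer arithmetic, and downcase_if_83 is a hypothetical helper name
chosen only for illustration.

```c
#include <ctype.h>

/* Same rules as the prototype: no leading period, at most one period,
   at most 8 + 3 characters, no lower case, no + , ; = [ ] characters. */
int is_dos83_sketch(const char *fname)
{
  int len = 0;                  /* chars seen in the current component */
  int max = 8;                  /* limit for the current component */
  int period_seen = 0;
  const char *s;

  if (fname[0] == '.')
    return 0;                   /* starting period invalid */

  for (s = fname; *s; s++) {
    char c = *s;
    if (c == '.') {
      if (period_seen)
        return 0;               /* multiple periods invalid */
      period_seen = 1;
      len = 0;
      max = 3;                  /* now counting the extension */
    } else if (++len > max)
      return 0;                 /* name component too long */
    else if (c >= 'a' && c <= 'z')
      return 0;                 /* lower case character */
    else if (c == '+' || c == ',' || c == ';' ||
             c == '=' || c == '[' || c == ']')
      return 0;                 /* special non-DOS characters */
  }
  return 1;
}

/* Hypothetical helper: lowercase name in place only if it passes the
   8.3 test.  Returns 1 if the name was treated as 8.3, 0 if left alone. */
int downcase_if_83(char *name)
{
  char *p;

  if (!is_dos83_sketch(name))
    return 0;
  for (p = name; *p; p++)
    *p = (char)tolower((unsigned char)*p);
  return 1;
}
```

With this sketch, "README.TXT" would be downcased to "readme.txt", while
"ReadMe.txt" and "LONGFILENAME.TXT" would be left untouched because they
are not pure 8.3 names.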

Further testing of DH=1 for 71A8 on 2K/XP found some potential workarounds,
then several more new problems.  In a nutshell, if the first part of the file
name has 1 or 2 characters, you get bogus results.  It will usually
give correct values for 3, 4, 5, or 6 characters.  It will always truncate
characters 7 and 8, so LICENSE.TXT becomes LICENS.TXT.  Yes, it's hopeless.
Yes, it truncates to 6.3 for DH=0 also.  I think the NT code must have
been written by an old Digital RT-11 programmer since their file system
was 6.3...

I don't see any way to fix this other than:
1) Document that it's trash and advise everyone not to use it, or
2) Insert emulation code inside it for when the OS is 2K/XP.
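
As a rough illustration of what option 2 might involve, the sketch below
builds a short name without calling the OS at all.  It assumes the intended
result is simply the long name uppercased and truncated to 8.3, with spaces
and extra periods dropped and the characters + , ; = [ ] replaced by '_'.
That rule set is a simplifying assumption, not a description of what any
DOS or Windows version actually does.  The name gen_short_fname_sketch is
hypothetical, and the OS check is omitted.

```c
#include <ctype.h>
#include <string.h>

/* Map one character of the long name: 0 means "drop it". */
static char map_char(char c)
{
  if (c == ' ' || c == '.')
    return 0;                           /* assumption: drop these */
  if (strchr("+,;=[]", c))
    return '_';                         /* assumption: replace specials */
  return (char)toupper((unsigned char)c);
}

/* Hypothetical emulation: write an 8.3 name for lfn into sfn, which must
   hold at least 13 bytes (8 + '.' + 3 + NUL).  Returns sfn. */
char *gen_short_fname_sketch(const char *lfn, char *sfn)
{
  const char *dot, *s;
  char *d = sfn;
  char c;
  int n;

  while (*lfn == '.')                   /* skip leading periods */
    lfn++;
  dot = strrchr(lfn, '.');              /* last period starts the extension */

  /* Base name: first 8 kept characters before the last period. */
  for (s = lfn, n = 0; *s && s != dot; s++)
    if (n < 8 && (c = map_char(*s)) != 0) {
      *d++ = c;
      n++;
    }

  /* Extension: first 3 kept characters after the last period. */
  if (dot) {
    char *ext = d;
    *d++ = '.';
    for (s = dot + 1, n = 0; *s && n < 3; s++)
      if ((c = map_char(*s)) != 0) {
        *d++ = c;
        n++;
      }
    if (n == 0)
      d = ext;                          /* empty extension: drop the period */
  }
  *d = '\0';
  return sfn;
}
```

Under these assumptions LICENSE.TXT comes back unchanged, so a test of the
form !strcmp(gen_short_fname_sketch(f, sh), f) would again report it as a
short name.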
