Date: Sat, 13 Oct 2001 19:41:52 +0200
From: "Eli Zaretskii"
Sender: halo1 AT zahav DOT net DOT il
To: sandmann AT clio DOT rice DOT edu
Message-Id: <2957-Sat13Oct2001194151+0200-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
CC: djgpp-workers AT delorie DOT com
In-reply-to: <10110131551.AA12493@clio.rice.edu> (sandmann@clio.rice.edu)
Subject: Re: W2K/XP fncase [was Re: New perl package]
References: <10110131551 DOT AA12493 AT clio DOT rice DOT edu>
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
> Date: Sat, 13 Oct 2001 10:51:08 -0500 (CDT)
>
> > No, _lfn_gen_short_fname is the direct interface to the Windows
> > interrupt. It does not exist merely to downcase file names when we
> > think we should.
>
> That's not what it does - there is no interrupt for this in DOS, and
> _lfn_gen_short_name has code there that converts the string to upper
> case and truncates it to 12 characters. If this is really a wrapper
> for the Windows interrupt it should either fail on regular DOS or
> return the string unchanged.

It simply does on DOS the equivalent of what Windows would have done
in that case. We do similar things in other functions, e.g.,
_get_volume_info.

> It also appears that in each of the 7 places this appears in the
> library it is part of a strcmp with the long name - many of which are
> not directly fncase related. Even more interesting is that in none of
> those 7 places is the short name returned used at all except in
> the string comparison.

All true, but the function is also meant to be used by applications.
Do we really want to go out and check that none does? Why waste our
time? It's well known that once you provide an external function,
there's no way back--the genie is out of the bottle for good.
We _could_ replace the body of _lfn_gen_short_fname with equivalent
code, but that's not the case here.

> So this function is not actually used
> anywhere in the library and each of these 7 places could be replaced
> by an even simpler copy of what I provided - which just returns a
> true or false flag if any characters would be changed.

Whether we do or don't replace the code which calls
_lfn_gen_short_fname in the library is a separate matter. What I was
arguing in this part was that the new code cannot be called
_lfn_gen_short_fname, because it isn't equivalent to what
_lfn_gen_short_fname does now.

> For example, if lfn=n we should always lower case
> the names (a very simple test) instead of needing to generate a
> string we strcmp with, throw away and then duplicate this behavior.

That would make it impossible to see file names on DOS in their
original UPPER case; for example, try "djecho [A-Z]*" on plain DOS.
IIRC, some package (Groff?) depends on that for its build procedure.

> > I'm still puzzled why a global non-trivial change is deemed better
> > than something localized to a specific OS in an otherwise proven
> > function. The current support for LFN-related features took several
> > releases to get right; do we really want to put that at jeopardy for
> > the sake of saving a few cycles?
>
> No, but since it is unreliable, is used in 7 different places, we need
> some way to fix this. I'd like something consistent between the
> operating systems for something as simple (and relatively unimportant)
> as what case short file names are returned in.

I agree with the goal; the argument is about the way to achieve that
goal. This issue is full of hidden gotchas and unintended
consequences, because Microsoft's implementation of case-preservation
is semi-broken, haphazard, and sometimes downright nonsensical. I
have scars from fine-tuning these issues all over my heart, and I'm
too old to see it (my heart) broken again.
We don't even have a test suite that is extensive enough to test the
effect of such changes, so most probably we won't know that something
broke until it's too late.

All I want is that we don't break what took so long to get right.

So maybe the code I wrote is wasteful. I understand that it might bug
you to see a function which issues an RM interrupt, and whose output
is used inefficiently, or even not used at all. But it works; it was
proven by two years of intensive use; and it certainly isn't a
bottleneck in any real-life application.

Therefore, my suggestion is: let's make a local change in
_lfn_gen_short_fname so that it calls 71A8h with DH=1 on W2K and XP.
(We should see that this doesn't break NT with the LFN TSR.) The file
names that come out bogus as a result are very rare, and when they do
happen, all that we'll see is that the file name is not downcased when
it should have been--not a big deal IMHO.

If we really want to get fancy, we could try to repair the result that
W2K returns. But I'm not suggesting that: if the underlying OS call
is buggy, it is perfectly okay to return the messed up name it gives
us.

However, you are doing the work, so eventually it's your call. If you
want to introduce a new function with the body you sent a while ago,
and rewrite the other library functions to call it instead of
_lfn_gen_short_fname, feel free to go ahead and do it.
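
For reference, here is a minimal sketch of the two approaches being
compared. It assumes only the plain-DOS behavior described in the
quoted text (upcase the name and truncate it to 12 characters); the
real library code may well do more than this. The names
dos_like_short_name, name_would_change, and SHORT_NAME_MAX are
hypothetical and are not part of the library, and this is not the code
that was posted earlier in the thread.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumption: 8 name characters + '.' + 3 extension characters. */
#define SHORT_NAME_MAX 12

/* Hypothetical stand-in for the plain-DOS fallback as described in
   the quoted text: upcase and truncate to SHORT_NAME_MAX characters.
   short_name must have room for SHORT_NAME_MAX + 1 bytes.  */
void
dos_like_short_name (const char *long_name, char *short_name)
{
  size_t i;

  for (i = 0; i < SHORT_NAME_MAX && long_name[i] != '\0'; i++)
    short_name[i] = (char) toupper ((unsigned char) long_name[i]);
  short_name[i] = '\0';
}

/* Hypothetical flag-only variant: returns 1 if the conversion above
   would alter the name, 0 otherwise.  It gives the same answer as
   generating the short name and calling strcmp, without building or
   discarding a string.  */
int
name_would_change (const char *long_name)
{
  size_t i;

  for (i = 0; long_name[i] != '\0'; i++)
    {
      unsigned char c = (unsigned char) long_name[i];

      if (i >= SHORT_NAME_MAX)	/* would be truncated */
	return 1;
      if (toupper (c) != c)	/* would be upcased */
	return 1;
    }
  return 0;
}
```

The flag-only variant mirrors only the pure-DOS rule. It says nothing
about what the Windows call would return, which is exactly where the
two approaches can diverge.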