Mailing-List: contact cygwin-apps-help AT sourceware DOT cygnus DOT com; run by ezmlm
list-help:
list-post:
Sender: cygwin-apps-owner AT sourceware DOT cygnus DOT com
Delivered-To: mailing list cygwin-apps AT sourceware DOT cygnus DOT com
From: Chris Faylor
Date: Sun, 25 Jun 2000 22:38:32 -0400
To: cygwin-apps AT sourceware DOT cygnus DOT com
Subject: Re: Pending change to cygwin DLL and binmode/textmode musings
Message-ID: <20000625223831.A2385@cygnus.com>
Reply-To: cygwin-apps AT sourceware DOT cygnus DOT com
Mail-Followup-To: cygwin-apps AT sourceware DOT cygnus DOT com
References: <20000624234013 DOT A29970 AT cygnus DOT com> <005601bfde69$c66fa0c0$f7c723cb AT lifelesswks> <20000625120826 DOT C790 AT cygnus DOT com> <003401bfdf07$3f5196e0$f7c723cb AT lifelesswks>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <003401bfdf07$3f5196e0$f7c723cb@lifelesswks>; from robert.collins@itdomain.com.au on Mon, Jun 26, 2000 at 10:41:16AM +1000

On Mon, Jun 26, 2000 at 10:41:16AM +1000, Robert Collins wrote:
>> On Sun, Jun 25, 2000 at 03:54:01PM +1000, Robert Collins wrote:
>> >I like the idea of a database of files... that would mean less porting
>> >issues, particularly with programs that act on files in common with other
>> >tools.
>>
>> Personally, I don't like the thought of maintaining a list. I think we'll
>> constantly be saying "Update your /etc/filemodes to the newest version".
>
>...instead of please update ? Which is
>easier/more/less error prone?

Assuming that it is relatively easy to update a program, which it sort
of is, then it seems obvious to me that saying "grab the latest awk"
will be better (or at least equivalent to) "grab the latest
/etc/filemodes".

If /etc/filemodes is accidentally deleted or the user resets their /etc
mount or if they download a "new" /etc/filemodes then they will cause
themselves problems.

>> And, as the file grows larger it will take time to parse, slowing down
>> every cygwin application. I guess we could get around that by setting
>> some kind of flag in the executable but if we are going to do that then
>> why not just record the filenames in the executable itself.
>
>a) Well I think what's really under discussion is some sort of new file
>attribute - and with a hash table on the name of the file the lookup would
>stay very quick for reasonable numbers of files. (check list file date, if
>modified rebuild hash table, otherwise just lookup). that wouldn't work if
>you were planning wildcards like
> set_default_open ("/*/*.c", O_BINARY);
>but I sure a reasonably fast implementation could be designed.

I'm not sure why you are using set_default_open as an example since this
has nothing to do with the alternate file-based method of doing things.
Maybe you're advocating that programs that need it could open a file and
load the appropriate tables.

I didn't plan on implementing any kind of pattern matching since that
would slow a program down. It would only slow down a program that used
it, though.

The fact remains that no matter how fast you make the opening and
reading of a file it would slow down every single cygwin operation to
some degree. If you try to avoid opening and parsing the file based on
file dates that means that you have to store YA thing in cygwin's shared
memory. That's not intrinsically bad but it is not currently designed
to grow without bounds. If you don't use cygwin's shared memory then
any program that needs the information has to open the external file.
I don't see any way around this.

I'm not sure what you mean about a file attribute, though. It is
already possible to mount a file as "text" if you want and cygwin should
do the right thing.

If we're only talking about a minimal number of system files, then we
could just provide a default set of mounts:

mount -t c:\cygwin\etc\passwd /etc/passwd
mount -t c:\cygwin\etc\group /etc/group
mount -t c:\cygwin\etc\termcap /etc/termcap

and let people manage this kind of thing through the mount table.

>b) As we are only talking about setting defaults for the common files on the
>system, the number of files in list w/wildcards should not get beyond a
>couple of hundred anyway, and a non-wildcarded database could uses hashes.
>
>> I'd rather have the programs operate correctly without external
>> dependencies.
>
>I agree with this, but no file stands on it's own - look at the amount of
>(excellent) work done to date on cygwin... that is an external dependency. I
>believe minimising the changes needed to move source between platforms is a
>Good Thing.

Sure, the OS is an external dependency. Cygwin1.dll is an external
dependency, Kernel32.dll is an external dependency. Luckily, they are
all pretty much transparent. I don't think that an external text file
would be transparent.

>One benefit of an external database that I missed before is that when folk
>download un-cygwinised source that compiles and runs with only text-mode
>issues, the problem will be caught _by the platform_.. no questions needed
>on the mailing lists. Here is where a wildcard or perhaps a regex style
>definition in an external database could be very useful...ie
>====file starts======
>[O_TEXT]
>/.*\.c
>/.*\.h
>/.*\.cpp
>/.*[Mm]akefile
>[O_BINARY]
>/.*\.o
>/.*\.a
>======file ends=====

I still think that 1) opening and reading a file and 2) doing multiple
regex pattern matching on every opened filename is not a performance hit
that we want to consider.

IMO, all of your examples above are currently non-issues due to the fact
that gcc and make should now be correctly interpreting files with \r\n
line endings.

Now that we can easily update individual patches and get them to the
cygwin community quickly, I think we'll see a drastic cutoff in
complaints about textmode issues as we clean up utilities like make.

What I was trying to do was provide a method to easily and minimally
modify a package to ease the effort in poring over files, looking for
fopens to change.

The external file method is attractive because it allows us to "fix" a
utility quickly but I am not comfortable with the other tradeoffs.

cgf