From: earnie_boyd AT yahoo DOT com (Earnie Boyd)
Subject: Re: vim oddities (this message is somewhat verbose)
25 Jan 1999 21:35:38 -0800
Message-ID: <19990125150231.3297.rocketmail.cygnus.gnu-win32@send104.yahoomail.com>
Reply-To: earnie_boyd AT yahoo DOT com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: Christopher Murray
Cc: cygwin users

---Christopher Murray wrote:
>
> Hi there,

Hi Christopher

8<

> My understanding of the binary and text processing is this (could someone
> please correct me if I am wrong?):

And it is important to understand this. It is not a trivial matter
when trying to support multiple worlds.

>
> 1) All writes to disk by DOS/WINDOWS programs, whether they be binary or
> ascii text files, sticks two characters at the end of each line, \r\n, and

No! When processing in text mode the \r\n is output. When processing
in binary mode only the \n is output.

> these characters are actually stored along with the rest of the file, when
> the file is saved to disk. This happens at the level of the I/O libraries
> in the compiler.
> On the other hand, files generated by UNIX stick just the \n at the end of
> each line, no matter if the file is binary or text.

True, at least for most UNIX'es. UNIX tends to ignore the mode of
processing.

>
> 2) When a C program places a call to fopen, so that the file may either be
> read from or written to, the programmer can opt to pass the "b" flag,
> telling it to treat the file as a binary file (i.e. won't account for a \r
> if it is there). If the "b" is not passed (default), it strips out the \r
> as lines are read, and all is hunky dory. If the \r is not there, then the
> fopen call still behaves the same - it just didn't go that extra step to
> strip out the \r.
>
> If the "b" is passed, then the fopen is essentially being told the file
> WILL be binary (i.e. no \r) and only a \n should be expected.
> So when \r's are there, everything gets screwed up because the fopen
> expects only a \n, but that one extra byte is in there when it shouldn't
> (which is why fseek, etc have trouble - the byte counts are wrong)
>

No! If you pass the "b" option in fopen and the file contains \r\n,
then when the line is read with fread the \r will be contained in the
data, the number of bytes read will be correct, and using fseek
_will_not_ be a cause of problems.

If you don't pass the "b" in fopen, fread will strip the \r from the
\r\n and return 1 less than the number of bytes actually read, because
it reports the number of bytes returned in the line. This _WILL_ cause
you problems when using fseek as you try to match the number of bytes
read against the number of bytes in the file. ALSO, when fread
encounters a C-z (Ctrl-Z) it triggers the EOF (end of file).

NOTE: If the file being read contains \n line endings, there will not
be a problem in either mode. fread will correctly return the number of
bytes read; however, C-z will still trigger EOF.

A rule of thumb: If I know which processing method is to be used, I
specify that processing method. Therefore, if I'm creating a report to
be read by humans, I process in text mode. If I read a file that humans
potentially create with an editor, I process this file in text mode.
All other files I process in binary mode. I always explicitly specify
the mode of operation ("rt", "rb", etc.) because the default mode can
be changed.

Another concern is the mode of stdin and stdout: if the program is used
with piping and redirection, you need to consider changing the mode
with the setmode function.

> 3) Now having said this, an editor like vim would I suspect try to open up
> files without the "b" flag, indicating that in a DOS environment, the \r
> gets stripped by default if it is there. A check of all the
> "open" calls indicates that the binary mode flag is not passed. So what
> gives? Why does it show the ^M's ??

Vim reports the mode it is using when it first opens the file. There
are two possible reasons for seeing the ^M:

1) Vim opened a text mode file in binary mode. Binary mode can be
   requested with the -b switch.
2) The file contains \r\r\n, in which case the extra \r is showing up
   in the file.

>
> I am severely missing something. Not understanding C all that well, I have
> tried modifying those calls to open
> that I think are the ones that open up files given at the command line, but
> I can't get things to work.
>
> Can someone out there who is more knowledgeable than I please help?

I hope I have.

==
-                        \\||//
-------------------o0O0--Earnie--0O0o-------------------
--           earnie_boyd AT yahoo DOT com            --
-- http://www.freeyellow.com/members5/gw32/index.html --
----------------------ooo0O--O0ooo----------------------

PS: Newbies, you should visit my page.
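
For anyone who wants to check the byte counts discussed under 2) and
the \r\r\n case from 3) on their own system, here is a minimal sketch
in C. The helper names (write_raw, count_delivered, count_stray_cr) are
made up for this illustration and are not taken from vim or any
library; only standard stdio calls are used. It assumes the file is
small and that the caller supplies a writable path.

```c
#include <stdio.h>

/* Hypothetical helper: store len bytes at path exactly as given.
 * "wb" asks the C library not to translate \n into \r\n. */
static int write_raw(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");
    size_t written;

    if (fp == NULL)
        return -1;
    written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}

/* Hypothetical helper: count the bytes the C library hands back when
 * path is opened with the given mode ("rb", "r", ...).
 * Returns -1 if the file cannot be opened. */
static long count_delivered(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Hypothetical helper: count occurrences of \r\r\n, reading in binary
 * mode so that no \r is stripped before we can see it. */
static long count_stray_cr(const char *path)
{
    FILE *fp = fopen(path, "rb");
    int prev2 = 0, prev1 = 0, c;
    long n = 0;

    if (fp == NULL)
        return -1;
    while ((c = fgetc(fp)) != EOF) {
        if (prev2 == '\r' && prev1 == '\r' && c == '\n')
            n++;
        prev2 = prev1;
        prev1 = c;
    }
    fclose(fp);
    return n;
}
```

After write_raw(path, "one\r\ntwo\r\r\n", 11), count_delivered(path,
"rb") should report all 11 bytes and count_stray_cr(path) should
report 1. Calling count_delivered(path, "r") on the same file would be
expected to report fewer bytes with a DOS/Windows text mode C library
and the same 11 on most UNIX systems, which is the mismatch that makes
byte counting against fseek unreliable in text mode.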