From: "Wagemans, Peter"
To: "cygwin AT cygwin DOT com"
Date: Fri, 22 Jun 2012 00:31:17 +0200
Subject: RE: RCS file corruption.

Brian Wilson wrote:
> From what I've read in this discussion, I think the issue is that
> the '^M' characters may not be seen by RCS as an EOL.

The problem occurs in a loop that copies one character at a time to move the entire content of the work file into the new RCS file as the latest version. It unexpectedly gets EOF back from getc() after exactly 65536 characters.

Using the Sysinternals tool procmon, one can see what the processes are asking of the Windows OS. This was done for Cygwin rcs-5.8-1, showing the ReadFile operations on the work file:

Time of Day   Process Name  Operation  Result       Detail
13:40:06.563  ci.exe        ReadFile   SUCCESS      Offset: 0, Length: 65,536, Priority: Normal
13:40:06.685  diff.exe      ReadFile   SUCCESS      Offset: 0, Length: 1,593,857
13:40:06.686  diff.exe      ReadFile   END OF FILE  Offset: 1,593,857, Length: 1
13:40:06.732  ci.exe        ReadFile   END OF FILE  Offset: 1,593,857, Length: 65,536

The RCS check-in tool ci.exe reads the start of the work file before it starts diff.exe as a subprocess.
The task of diff.exe is to figure out the difference with the previous version of the file. To do this, diff.exe reads the entire file from stdin (ci.exe supplies the file descriptor of the work file to diff.exe). After that, ci.exe wants to copy the content of the work file to the RCS file. But once it has used up the first 64 kB, which it already has in its buffer, ci.exe asks for the next 64 kB at offset 1,593,857, the end of the file, rather than at offset 65,536 where the first read stopped. So it gets an EOF. This truncates the content of the latest version in the RCS file to 64 kB.

It is not clear to me why ci.exe tries to read the second 64 kB at the end of the work file. Perhaps some (library) code uses or sets an incorrect file position, possibly influenced by the subprocess diff.exe reading the entire file?

A similar procmon trace for Cygwin rcs-5.7-11 shows that this older rcs version chooses to create a memory map of the work file. This other access method apparently avoids the problem.

Regards,
Peter Wagemans
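
For anyone who wants to see how a shared file position could produce this symptom outside RCS, here is a minimal sketch. It is an illustration under assumptions, not a diagnosis of the rcs or Cygwin code. Python's buffered reader stands in for stdio's getc(), and os.dup() stands in for the descriptor that diff.exe receives, on the grounds that duplicated and inherited descriptors share one file offset. The function names, the temporary file, and the explicit 64 kB buffer size are made up for the demonstration.

```python
import os
import tempfile

BUF_SIZE = 65536        # assumed buffer size, chosen to match the 64 kB read in the trace
FILE_SIZE = 1_593_857   # size of the work file in the trace


def bytes_seen_by_buffered_reader(path, buf_size=BUF_SIZE):
    """Count the bytes a buffered reader gets when a second descriptor
    sharing its file offset reads the file to the end in between."""
    with open(path, "rb", buffering=buf_size) as f:
        first = f.read(1)              # fills the buffer from offset 0
        shared = os.dup(f.fileno())    # stand-in for the descriptor given to the child
        try:
            while os.read(shared, 1 << 20):   # the "child" reads to end of file
                pass
        finally:
            os.close(shared)
        rest = f.read()                # rest of the buffer, then the next raw read
        return len(first) + len(rest)


def demo():
    # Hypothetical work file: FILE_SIZE bytes of filler.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * FILE_SIZE)
    try:
        return bytes_seen_by_buffered_reader(tmp.name)
    finally:
        os.unlink(tmp.name)


if __name__ == "__main__":
    print(demo(), "of", FILE_SIZE, "bytes copied")
```

The reader only gets what was already in its buffer, because the read that should continue at offset 65,536 is issued at the shared offset, which the second descriptor has left at the end of the file. A memory map does not depend on the descriptor's file position at all, which would fit with rcs-5.7-11 not showing the problem.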