From: Juan Manuel Guerrero
To: djgpp-workers AT delorie DOT com
Subject: Re: About the syntax of the change file for djtar
Date: Thu, 8 Nov 2007 22:11:18 +0100
Message-Id: <200711082211.19884.juan.guerrero@gmx.de>
Reply-To: djgpp-workers AT delorie DOT com

> We can choose a non-NUL byte if that would help; perhaps 0x7f or 0x01.

That seems to me the easiest and probably the best approach to the issue.
The precedence of the separating chars, from high to low, is:
  0x01
  tab
  space
I have chosen 0x01 as the separating char. To keep backward compatibility,
the code line:
  sscanf(line, "%s %s", from, to);
has been emulated as closely as possible. This means that leading and
trailing white space is stripped.

Regards,
Juan M. Guerrero


2007-11-09  Juan Manuel Guerrero

	* src/utils/utils.tex: Info about the new field separating chars
	in the change file.

	* src/docs/kb/wc204.txi: Info about the new field separating chars
	in the change file.

	* src/utils/djtar/djtar.c (DoNameChanges): Implementation of use of
	tab char and 0x01 as alternate field separating chars.


diff -aprNU3 djgpp.orig/src/docs/kb/wc204.txi djgpp/src/docs/kb/wc204.txi
--- djgpp.orig/src/docs/kb/wc204.txi	2005-05-11 20:06:08 +0000
+++ djgpp/src/docs/kb/wc204.txi	2007-11-09 21:17:38 +0000
@@ -1094,3 +1094,8 @@ formats for @code{"%x"} and @code{"%X"}
 
 @pindex djasm @r{, cr4 register}
 @code{djasm} recognises the fourth control register, @code{cr4}.
+
+@pindex djtar @r{, name change file format}
+To allow for directory and file names that may contain space and
+tab characters the list of recognized file name separating characters
+has been incremented by the tab character and 0x01 byte.
diff -aprNU3 djgpp.orig/src/utils/djtar/djtar.c djgpp/src/utils/djtar/djtar.c
--- djgpp.orig/src/utils/djtar/djtar.c	2002-10-17 23:00:26 +0000
+++ djgpp/src/utils/djtar/djtar.c	2007-11-09 20:59:56 +0000
@@ -5,6 +5,7 @@
 /* Copyright (C) 1996 DJ Delorie, see COPYING.DJ for details */
 /* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
 #include
+#include
 #include
 #include
 #include
@@ -99,6 +100,7 @@ get_entry(char *from)
 static void
 DoNameChanges(char *fname)
 {
+#define IS_WHITESPACE(s) ((s) == ' ' || (s) == '\t')
   struct skip_dir_list * new_entry;
   FILE *f = fopen(fname, "r");
   char from[PATH_MAX], to[PATH_MAX];
@@ -110,13 +112,61 @@ DoNameChanges(char *fname)
   }
   while (1)
   {
+    char *field_separator;
+    size_t length;
+
     fgets(line, sizeof(line), f);
     if (feof(f))
       break;
-    to[0] = 0;
-    sscanf(line, "%s %s", from, to);
-    if (to[0])
+
+    if (*line == '\n')
+      continue;
+    field_separator = strchr(line, '\1');
+    if (!field_separator)
+      field_separator = strrchr(line, '\t');
+    if (!field_separator)
+      field_separator = strrchr(line, ' ');
+    if (field_separator)
+    {
+      /*
+       * Strip white space before and behind the field separator
+       * and check for the existence of the to-string.
+       */
+      size_t len = field_separator - line;
+      length = len;
+      while (IS_WHITESPACE(line[length]) || line[length] == '\1')
+        length--;
+      if (length != len)
+        length++;
+
+      while (IS_WHITESPACE(line[len]) || line[len] == '\1')
+        len++;
+      if (line[len] == '\n' || line[len] == '\0')
+        field_separator = NULL;  /* No to-string. */
+      else
+        field_separator = line + len;  /* Start of to-string. */
+    }
+    else
+    {
+      length = strlen(line);
+      if (line[length - 1] == '\n')
+        length--;
+    }
+    memcpy(from, line, length);
+    from[length] = '\0';
+    if (field_separator)
+    {
+      strcpy(to, field_separator);
+      length = strlen(to);
+      if (to[--length] == '\n')
+        length--;
+      while (IS_WHITESPACE(to[length]))
+        length--;
+      if (!IS_WHITESPACE(to[length]))
+        length++;
+      to[length] = '\0';
       store_entry(from, to);
+    }
     else
     {
       new_entry = xmalloc(sizeof(struct skip_dir_list));
@@ -126,6 +176,7 @@ DoNameChanges(char *fname)
     }
   }
   fclose(f);
+#undef IS_WHITESPACE
 }
 
 /*------------------------------------------------------------------------*/
diff -aprNU3 djgpp.orig/src/utils/utils.tex djgpp/src/utils/utils.tex
--- djgpp.orig/src/utils/utils.tex	2004-01-10 21:55:48 +0000
+++ djgpp/src/utils/utils.tex	2007-11-09 22:45:38 +0000
@@ -285,6 +285,22 @@ The directories must be complete, not re
 must match the complete path in the tar file, and the ``new'' directories
 indicate where the file goes on the DOS disk. If there is no ``new'' directory
 specified, the ``old'' one and all its siblings will be not extracted.
+The space and tab characters and the 0x01 byte will be recognized as separating
+characters between the ``old'' directories and filenames and the ``new''
+directories and filenames in the @file{changeFile} file. The highest precedence
+of the separating characters has the 0x01 byte, followed by the tab character
+and with the lowest precedence the space character. If neither the ``old'' nor
+the ``new'' directories and filenames contain spaces nor tabs, then every
+one of the three separator characters is allowed. If space characters are part
+of the ``old'' and ``new'' directories and filenames, then the tab character or
+the 0x01 byte must be used as separating character. If the space character
+appears only in the ``old'' filename but @emph{neither} in the ``new'' directory
+nor filename, then the space character can still be used as separating character.
+If tab characters or tab and space characters are part of the ``old'' and ``new''
+directories and filenames, then the 0x01 byte must be used as separating character.
+If the space and tab characters appear only in the ``old'' filename but
+@emph{neither} in the ``new'' directory nor filename, then the tab character
+can still be used as separating character apart from the 0x01 byte.
 
 @item -d
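
For readers who want to try the separator precedence outside of djtar, the rule described in the documentation text (the first 0x01 byte wins, otherwise the last tab, otherwise the last space) can be sketched as a small stand-alone function. This is only an illustration and not the code from the patch: the names split_change_line, find_last, copy_field and is_blank are hypothetical, and the return convention (1 = both names found, 0 = only an old name, -1 = empty line or field too long) is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: blanks that may pad a field. */
static int is_blank(char c)
{
  return c == ' ' || c == '\t';
}

/* Hypothetical helper: last occurrence of C in LINE[0..END), or NULL. */
static const char *find_last(const char *line, size_t end, char c)
{
  while (end > 0)
    if (line[--end] == c)
      return line + end;
  return NULL;
}

/* Hypothetical helper: copy LINE[BEGIN..END) into DST as a string.
   Returns -1 if the field does not fit into SIZE bytes. */
static int copy_field(char *dst, size_t size, const char *line,
                      size_t begin, size_t end)
{
  size_t n = end - begin;

  if (n >= size)
    return -1;
  memcpy(dst, line + begin, n);
  dst[n] = '\0';
  return 0;
}

/* Illustrative sketch, not the djtar implementation.
   Splits one change-file line into FROM and TO.
   Returns 1 if both names are present, 0 if only FROM is present,
   -1 if the line holds no name or a field does not fit. */
static int split_change_line(const char *line, char *from, char *to,
                             size_t size)
{
  size_t end = strlen(line);
  size_t from_begin = 0, from_end, to_begin;
  const char *sep;

  /* Ignore the line terminator. */
  while (end > 0 && (line[end - 1] == '\n' || line[end - 1] == '\r'))
    end--;

  /* Precedence: first 0x01, then last tab, then last space. */
  sep = memchr(line, '\1', end);
  if (!sep)
    sep = find_last(line, end, '\t');
  if (!sep)
    sep = find_last(line, end, ' ');

  from_end = sep ? (size_t)(sep - line) : end;
  to_begin = sep ? from_end + 1 : end;

  /* Strip blanks around the old name. */
  while (from_begin < from_end && is_blank(line[from_begin]))
    from_begin++;
  while (from_end > from_begin && is_blank(line[from_end - 1]))
    from_end--;

  /* Strip blanks (and repeated 0x01 bytes) around the new name. */
  while (to_begin < end
         && (is_blank(line[to_begin]) || line[to_begin] == '\1'))
    to_begin++;
  while (end > to_begin && is_blank(line[end - 1]))
    end--;

  if (from_begin == from_end)
    return -1;
  if (copy_field(from, size, line, from_begin, from_end) != 0
      || copy_field(to, size, line, to_begin, end) != 0)
    return -1;
  return to[0] != '\0';
}
```

With this sketch, a line such as "old name<TAB>new name" splits at the tab, while "old dir new" still splits at the last space, which matches the case where only the ``old'' name contains spaces. Unlike the patch, the sketch checks the buffer size before copying; that is a design choice of the example, not a statement about djtar.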