From: "Juan Manuel Guerrero"
Organization: Darmstadt University of Technology
To: JT Williams , Eli Zaretskii
Date: Thu, 26 Jul 2001 22:50:10 +0200
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: ANNOUNCE: DJGPP port of GNU Sed 3.02.80 uploaded
CC: djgpp AT delorie DOT com
X-mailer: Pegasus Mail for Windows (v2.54DE)
Message-ID: <27453F34B00@HRZ1.hrz.tu-darmstadt.de>
Reply-To: djgpp AT delorie DOT com

On Tue, 24 Jul 2001 17:43:24, JT Williams wrote:
> % unzip -l 3.02/sed302b.zip | grep sed.exe
>  98304 04-20-00 02:03 bin/sed.exe
>  97792 04-20-00 02:03 bin/gsed.exe
>
> % unzip -l 3.02.80/sed3028b.zip | grep sed.exe
> 682752 07-22-01 02:03 bin/sed.exe
>
> The NLS support must be the culprit, here.

Yes, this is the case. I will provide a binary without NLS.
Its size will be around 58KB.

> BTW, is GNU regexp lib released/supported independently of sed
> (i.e., could it be built and linked with other applications)?
> That could be useful....

Sed uses GNU regex from Linux (glibc-2.n.n). These files are regex.[ch]
and are located in the /posix subdir. A regex package called
regex-0.12.tar.gz can be downloaded from ftp.gnu.org. It works but it
is old. If you want to compile it, you must run the commands:

  sh autoconf
  sh configure
  make
  make check

It creates a regex.o file that can be linked with the other .o files
of your project. Maybe you can replace the original regex.[ch] with
the new ones from glibc, but I have never tested it.

Now to the sad things.

On Wed, 25 Jul 2001 11:48:10, Eli Zaretskii wrote:
> I'd advise to keep the two executables in the distribution, and to
> make the binary built with libc's regex the default, named sed.exe.

That is ok with me but I am not able to create such a binary. I have
never noticed this before because I have never tried it. The sources
have changed substantially between sed-3.02 and sed-3.02.80.
A new file, sed/regex.c, has been introduced.
The function of this file is to isolate and abstract out the interface
to the regex engine so that less conditional code is required in
compile.c and execute.c. This file now contains functions like
compile_regex(), match_regex(), release_regex(), etc. All these
functions operate on objects of type regex_t. Very unfortunately the
sources *always* assume that regex_t is *identical* to
struct re_pattern_buffer declared in lib/regex-gnu.h (regex-gnu.h is
the regex header from glibc-2.n.n/posix). Every other system that does
not implement regex_t in the regex-gnu.h way loses.

It is really very instructive to look at sed-3.02/sed/compile.c and
compare it with sed-3_02.80/sed/regex.c, **especially** the
compile_regex() function. The old function only allocates the
regex_t *new_regex structure. The new one allocates the structure and
also initialises its members, making the assumption that regex_t is
**identical** to the structure declared in regex-gnu.h, and this kills
my attempt to use DJGPP's implementation of regex. It may be
instructive to see what happens on a cygwin system if the
--with-regex=libc.a option is used. Unfortunately I do not have cygwin
installed.

Unfortunately I see no cure for all this. The only way out seems to be
to write DJGPP specific functions for all functions in regex.c,
compile.c and execute.c that manipulate regex_t objects. At this
point, every intelligent suggestion will be seriously welcome.

I know that this is not of much use, but FYI anyway, this is the
output of the compilation:

Making all in sed
make.exe[2]: Entering directory `d:/sed/2/sed'
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c sed.c
In file included from sed.c:63:
sed.h:149: warning: `struct re_registers' declared inside parameter list
sed.h:149: warning: its scope is only this definition or declaration, which is probably not what you want.
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c compile.c
In file included from compile.c:50:
sed.h:149: warning: `struct re_registers' declared inside parameter list
sed.h:149: warning: its scope is only this definition or declaration, which is probably not what you want.
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c execute.c
In file included from execute.c:68:
sed.h:149: warning: `struct re_registers' declared inside parameter list
sed.h:149: warning: its scope is only this definition or declaration, which is probably not what you want.
execute.c: In function `do_subst':
execute.c:816: storage size of `regs' isn't known
make.exe[2]: *** [execute.o] Error 1
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c regex.c
In file included from regex.c:32:
../lib/regex-gnu.h:253: warning: `REG_EXTENDED' redefined
f:/include/regex.h:33: warning: this is the location of the previous definition
../lib/regex-gnu.h:257: warning: `REG_ICASE' redefined
f:/include/regex.h:34: warning: this is the location of the previous definition
../lib/regex-gnu.h:262: warning: `REG_NEWLINE' redefined
f:/include/regex.h:36: warning: this is the location of the previous definition
../lib/regex-gnu.h:266: warning: `REG_NOSUB' redefined
f:/include/regex.h:35: warning: this is the location of the previous definition
../lib/regex-gnu.h:276: warning: `REG_NOTBOL' redefined
f:/include/regex.h:67: warning: this is the location of the previous definition
../lib/regex-gnu.h:279: warning: `REG_NOTEOL' redefined
f:/include/regex.h:68: warning: this is the location of the previous definition
In file included from regex.c:31:
sed.h:149: warning: `struct re_registers' declared inside parameter list
sed.h:149: warning: its scope is only this definition or declaration, which is probably not what you want.
In file included from regex.c:32:
../lib/regex-gnu.h:291: parse error before `1'
../lib/regex-gnu.h:392: conflicting types for `regex_t'
f:/include/regex.h:23: previous declaration of `regex_t'
../lib/regex-gnu.h:395: warning: redefinition of `regoff_t'
f:/include/regex.h:17: warning: `regoff_t' previously declared here
../lib/regex-gnu.h:423: conflicting types for `regmatch_t'
f:/include/regex.h:27: previous declaration of `regmatch_t'
../lib/regex-gnu.h:543: conflicting types for `regcomp'
f:/include/regex.h:31: previous declaration of `regcomp'
../lib/regex-gnu.h:550: conflicting types for `regexec'
f:/include/regex.h:66: previous declaration of `regexec'
../lib/regex-gnu.h:555: conflicting types for `regerror'
f:/include/regex.h:62: previous declaration of `regerror'
../lib/regex-gnu.h:558: conflicting types for `regfree'
f:/include/regex.h:76: previous declaration of `regfree'
regex.c:53: conflicting types for `compile_regex'
sed.h:146: previous declaration of `compile_regex'
regex.c: In function `match_regex':
regex.c:141: argument `regex' doesn't match prototype
sed.h:149: prototype declaration
regex.c:141: argument `regarray' doesn't match prototype
sed.h:149: prototype declaration
make.exe[2]: *** [regex.o] Error 1
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c utils.c
make.exe[2]: Target `all' not remade because of errors.
make.exe[2]: Leaving directory `d:/sed/2/sed'

sed.h includes and "regex-gnu.h". regex-gnu.h is needed because of
some function prototypes and some structure declarations.
I have replaced lines like:

  typedef int regoff_t;

in regex-gnu.h with blocks like this one:

  #if !defined (HAVE_REGEX_H) && defined (__DJGPP__)
  /* This typedef may conflict with the typedef from system's regex.h. */
  typedef int regoff_t;
  #endif

After having done this I have compiled again. This is the output:

Making all in sed
make.exe[2]: Entering directory `d:/sed/2/sed'
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c execute.c
gcc -DHAVE_CONFIG_H -I. -I. -I..
  -I../lib -I../intl -I../intl -g -O2 -c regex.c
In file included from regex.c:32:
regex.c: In function `compile_regex':
regex.c:88: structure has no member named `buffer'
regex.c:89: structure has no member named `allocated'
regex.c:90: structure has no member named `used'
regex.c:91: structure has no member named `syntax'
regex.c:92: structure has no member named `fastmap'
regex.c:93: structure has no member named `translate'
regex.c:94: structure has no member named `regs_allocated'
regex.c:95: structure has no member named `no_sub'
regex.c:96: structure has no member named `not_bol'
regex.c:97: structure has no member named `not_eol'
regex.c:98: structure has no member named `newline_anchor'
regex.c:111: structure has no member named `translate'
regex.c:122: warning: passing arg 3 of `re_compile_pattern' from incompatible pointer type
regex.c:127: structure has no member named `regs_allocated'
regex.c:128: structure has no member named `newline_anchor'
regex.c: In function `match_regex':
regex.c:147: warning: passing arg 1 of `re_search_2' from incompatible pointer type
make.exe[2]: *** [regex.o] Error 1
make.exe[2]: Leaving directory `d:/sed/2/sed'
make.exe[1]: *** [all-recursive] Error 1
make.exe[1]: Leaving directory `d:/sed/2'
make.exe: *** [all-recursive-am] Error 2

The missing members belong to the GNU regex_t structure, and that
structure does not match the DJGPP declaration of regex_t.

In conclusion:
1) I am not able to supply a sed program compiled with DJGPP's regex.
   Nevertheless the sed program compiled with GNU regex has worked for
   me for two years without *any* difficulties.
2) The reason for the huge size of sed.exe is the NLS support.
   I will fix this.

I will wait a couple of days to see if someone presents a useful idea
for how this DJGPP specific issue can be solved. If no one has a good
idea I will upload new binary and source packages.
The sources will contain a modified config.bat to select GNU regex or
DJGPP regex, defaulting to GNU regex (the only thing that works today).
The binary package will contain two sed programs:

  gsed.exe     using GNU regex, size = 58KB
  nlsgsed.exe  using GNU regex and NLS, size = 667KB

After installing the binary package, the user will still have the old,
fast sed.exe available, alongside the new ones with the new
sed-3.02.80 functionality and the NLS support.

As usual, comments, suggestions, etc. are welcome.

Regards,
Guerrero, Juan M.
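
A minimal sketch of the "DJGPP specific functions" idea mentioned
above, offered as an assumption and not taken from the sed sources:
if the wrappers touch regex_t only through the POSIX calls regcomp(),
regexec(), regerror() and regfree(), they never depend on the
structure's members. The names sketch_compile(), sketch_match() and
sketch_release() are hypothetical. The sketch only handles
NUL-terminated input and does not cover the GNU-only interfaces
(re_compile_pattern, re_search_2) that appear in the compile log.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrappers: the names and signatures are illustrative
   only and are NOT the ones used in sed/regex.c. */

/* Compile a pattern through regcomp() only, so that regex_t stays
   opaque.  Returns NULL on allocation or compilation failure. */
static regex_t *sketch_compile(const char *pattern, int extended, int icase)
{
    regex_t *re;
    int flags = 0;
    int err;

    assert(pattern != NULL);
    re = malloc(sizeof *re);
    if (re == NULL)
        return NULL;

    if (extended)
        flags |= REG_EXTENDED;
    if (icase)
        flags |= REG_ICASE;

    err = regcomp(re, pattern, flags);
    if (err != 0) {
        char msg[128];
        /* regerror() turns the error code into readable text. */
        regerror(err, re, msg, sizeof msg);
        fprintf(stderr, "regcomp: %s\n", msg);
        free(re);
        return NULL;
    }
    return re;
}

/* Return 1 if text matches, 0 otherwise.  On a match the offsets of
   the whole match and of the subexpressions are stored in
   m[0..nmatch-1].  Assumes text is NUL-terminated. */
static int sketch_match(const regex_t *re, const char *text,
                        size_t nmatch, regmatch_t *m)
{
    assert(re != NULL && text != NULL);
    return regexec(re, text, nmatch, m, 0) == 0;
}

/* Free both the compiled pattern and the wrapper allocation. */
static void sketch_release(regex_t *re)
{
    if (re != NULL) {
        regfree(re);
        free(re);
    }
}
```

A caller would compile once with sketch_compile(), call sketch_match()
per input line, and finish with sketch_release(). Whether the real
regex.c, compile.c and execute.c can be restructured this way without
losing GNU-specific behaviour is untested here.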