Date: Thu, 25 Aug 1994 18:37:01 -0400 (EDT)
From: Edwin Phillips
Subject: Re: Catastrophic problems with v. 1.12
To: HANS ELLENBERGER
Cc: djgpp

On Thu, 25 Aug 1994, HANS ELLENBERGER wrote:

> As you might have learned from all my previous mails I would of course
> like to cooperate as much as possible.
>
> Since I am stuck with the current project, I also spent more than 12 hours
> trying to find the malloc bug. Due to very scarce comments in the sources
> I found no solution:
>
> First I tried to rebuild the library with debugging defined in malloc.c,
> but make always crashed before the library was built.
>
> Then I tried to figure out what actions are taken by sbrk when called from
> moreram. Somewhere close to cant_ask_for() I lost track of what your code
> intended to do... :-((
>
> My conclusion after all:
>
> It's just plain luck (some call it Murphy's law) that some useful
> applications can be built, but some subtle bugs still remain to
> be caught.
>
> Since the pointers returned by sbrk/malloc no longer overlap, it is
> difficult to imagine why segment violations always occur at malloc+200,
> regardless of which library function called it. Most often calls of stat
> and spawn trigger it, but other library functions do it too.
>
> In addition to that, sometimes a variable holding the handle of an open
> file, which was < 10 when opened, is magically changed to 26 and this
> triggers an error when calling any file operations with that modified
> handle value.
>
> From these observations I guess that somewhere (in the library?) a
> dangling pointer might overwrite static data used by malloc.
>
> This might be detected by using a watchpoint in gdb, but only after
> having successfully rebuilt the library with debug information.
>

Do you have a simple piece of code that can demonstrate these problems?
Maybe you can narrow it down to, say, "Here the variable t is 5, I call
printf(), and now t is 26".
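To narrow it down that way without a debugger, something along these lines
might help. It is only a rough sketch: check_int() and CHECK_INT() are names
made up for illustration, not anything in the DJGPP library, and it assumes
the variable you are watching is a plain int whose correct value you know.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of any library): report when a watched
   int no longer holds the expected value.  Returns 1 if it still
   matches, 0 if it has been changed. */
static int check_int(const char *name, int actual, int expected,
                     const char *file, int line)
{
    assert(name != NULL && file != NULL);
    if (actual != expected) {
        fprintf(stderr, "%s:%d: %s is %d, expected %d\n",
                file, line, name, actual, expected);
        return 0;
    }
    return 1;
}

/* Convenience wrapper that fills in the variable name and location. */
#define CHECK_INT(var, expected) \
    check_int(#var, (var), (expected), __FILE__, __LINE__)
```

Put a CHECK_INT() on the file handle right before and right after each
suspect call (stat, spawn, and so on). The first check that reports a
mismatch tells you which call the value changed across, which is the kind of
small test case that would be useful to post.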

It is possible that there is a problem with malloc(), but wouldn't
everyone's GCC compiler for every CPU be reporting these kinds of problems?
Can you step through your program and show us the code where this happens?
The main problem with debugging this is that you can mess up the heap and
the program can continue indefinitely before malloc tries to dereference a
bad pointer. I realize that it may not be convenient to step through your
program, but ....

If you can grab the malloc lib that was discussed earlier on this list, you
can put some heap checking at strategic parts of your code, and at least
narrow the search down.

Good luck,

Ed

/****************************************************************************/
/* Ed Phillips  flaregun AT strauss DOT udel DOT edu  University of Delaware */
/* Jr Systems Programmer  (302) 831-6082  IT/Network and Systems Services    */
/****************************************************************************/