Delivered-To: mailing list cygwin-developers AT cygwin DOT com
Message-ID: <004401c22031$7f167620$6132bc3e@BABEL>
From: "Conrad Scott"
References: <001901c21eee$8aadf060$0200a8c0 AT lifelesswks> <034301c21f49$a989f6e0$6132bc3e AT BABEL> <002e01c21fca$f0b19560$1800a8c0 AT LAPTOP>
Subject: Threads and the C++ new and delete operators
Date: Sun, 30 Jun 2002 13:27:26 +0100

Dear list,

I've been suffering from what looks like memory corruption (?) in cygserver and it's been doing my head in. I think I've found the solution, which I thought I'd share with everyone since it doesn't seem to be in the archives.

AFAICT the problem is that the C++ new and delete operators are not thread-safe in the current cygwin g++ release. I knew there was some issue with this release of gcc/g++ not being compiled with --enable-threads, but since threads are being used by the DLL itself, I thought that threads and C++ were basically okay.

What I've been seeing is a segmentation fault in either __builtin_new or __builtin_delete when thrashing the cygserver with continuous shm requests. A common stress test I'm running here uses a hundred processes, each with three threads, all running continuous shmget(2) calls on the same shared memory segment. I can re-create the problem with fewer clients, but it's easier to generate with that sort of load. If the segv doesn't occur, everything works fine; there are no other symptoms.

I've disassembled the __builtin_delete operator, and the address at which the fault appears seems to be in exception handling code that runs just before the function returns.
Since gcc is not compiled with thread support, there is one global exception handler context object. Presumably, if one thread does a new or delete while another thread is doing likewise, the scene is set for bogosity when one of the threads tries to unwind its exception handling state on return.

I've just re-built cygserver without any use of new and delete (I've replaced them with malloc/free, placement new, and explicit calls to destructors). The application is also compiled with -fno-exceptions, so the only exception-handling code linked in is in the built-in new and delete operators, and I'm no longer calling them.

Now I can't get it to fall over. That's not exactly proof of anything for a multi-threaded application, but I was managing to kick it over regularly before. (I've also been testing all of this in a mingw version of cygserver, but I was having exactly the same problem with the cygwin version.)

If my diagnosis is correct, I'm surprised it's not been seen before (like, in the cygwin DLL itself?). Then again, I'm having to stress the program pretty heavily to trip it up.

Now, this might all be academic, what with the looming (?) arrival of gcc 3.1 for cygwin, but I thought I'd share the results of several days' work . . .

Any comments? Better ideas?

// Conrad
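
For anyone wanting to try the same workaround, the replacement pattern mentioned above (malloc/free, placement new, explicit destructor calls) looks roughly like the following. This is only a minimal sketch, not the actual cygserver code: `Segment`, `segment_create` and `segment_destroy` are hypothetical names made up for illustration, and the sketch assumes that malloc and free themselves are safe to call from several threads.

```cpp
#include <cassert>
#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement operator new

// Hypothetical stand-in for a server-side object (NOT a real cygserver class).
struct Segment {
    int key;
    static int live;   // number of objects currently constructed

    explicit Segment(int k) : key(k) { ++live; }
    ~Segment() { --live; }
};

int Segment::live = 0;

// Replacement for `new Segment(key)`: get raw storage from malloc, then
// construct the object in that storage with placement new. The built-in
// allocating operator new is never called.
Segment *segment_create(int key)
{
    void *raw = std::malloc(sizeof(Segment));
    if (raw == 0)
        return 0;                  // report failure as a null pointer, no throw
    return new (raw) Segment(key);
}

// Replacement for `delete s`: run the destructor explicitly, then hand the
// storage back with free. The built-in operator delete is never called.
void segment_destroy(Segment *s)
{
    if (s == 0)
        return;
    s->~Segment();
    std::free(s);
}
```

A caller would write `Segment *s = segment_create(42);` where it previously wrote `new Segment(42)`, and `segment_destroy(s);` in place of `delete s;`. Since allocation failure comes back as a null pointer, each call site has to check for it.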