Mail Archives: cygwin/1999/09/24/16:25:04
The appended patch fixes a race condition in select.cc.
This code creates a Win32 socket ("exitsock") which it uses to let the
master thread notify a slave thread that it should exit a particular
call to Winsock select(). (I know, technically, that threads don't
really have a master/slave relationship; but look at the code and you
will see what I mean.)
The problem here is twofold:
1) Doing select() on something created by socket(), when that socket
has not been connected our bound to anything, does not have
defined behavior in any documentation I can find. The code in
select.cc seems to rely on some weird Winsock behavior; namely,
that calling closesocket() on a newly-created socket will trigger
an "exceptional condition" for any select() waiting on it.
2) There is a race condition anyway. It is possible for the master
thread to call closesocket() before the slave thread even gets to
the select(). The result is to select() on a value which is no
longer a legitimate socket at all. I have seen this race happen
under strace. The result is (usually) for the select() to return
with an error code, which Cygwin pretty much ignores and
continues plowing through. Sometimes the result is a complete
hang on our SMP box. (I do not have details of the failures, but
I suspect some kind of corruption in the internals of Winsock or
Cygwin.)
The enclosed patch uses a socketpair() instead of a single socket.
The slave thread now selects for reading on one member of the pair;
the master can close the other end to notify the slave it should fall
out of select(). This fixes both of the problems: The behavior is now
well-defined, and the race condition is avoided.
After installing this patch, most of the Cygwin hangs we were
experiencing are now gone. At least, I can't reproduce them in 20
seconds like I could before :-).
- Pat
P.S. You will get a compile-time warning because socketpair() is not
declared. I was unable to find it in any .h file or I would have
#included it...
======================================================================
--- cygwin-src/winsup/select.cc Wed Sep 22 23:57:31 1999
+++ cygwin-src-patl/winsup/select.cc Fri Sep 24 15:59:42 1999
@@ -991,7 +991,7 @@ struct socketinf
{
HANDLE thread;
winsock_fd_set readfds, writefds, exceptfds;
- SOCKET exitsock;
+ int exitsockpair[2];
select_record *start;
};
@@ -1090,8 +1090,9 @@ thread_socket (void *arg)
}
}
- if (WINSOCK_FD_ISSET (si->exitsock, &si->exceptfds))
- select_printf ("exitsock got an exception");
+ if (WINSOCK_FD_ISSET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds))
+ select_printf ("exitsockpair was closed");
return 0;
}
@@ -1133,15 +1134,16 @@ start_thread_socket (select_record *me,
}
}
- if ((si->exitsock = socket (PF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET)
+ if (socketpair (PF_INET, SOCK_STREAM, 0, si->exitsockpair) != 0)
{
- __seterrno ();
select_printf ("cannot create socket");
return -1;
}
- select_printf ("exitsock %p", si->exitsock);
- WINSOCK_FD_SET ((HANDLE)si->exitsock, &si->exceptfds);
+ select_printf ("exitsocks %d %d", si->exitsockpair[0],
+ si->exitsockpair[1]);
+ WINSOCK_FD_SET (dtable[si->exitsockpair[1]]->get_handle (),
+ &si->readfds);
stuff->device_specific[FHDEVN(FH_SOCKET)] = (void *) si;
si->start = &stuff->start;
select_printf ("stuff_start %p", &stuff->start);
@@ -1157,8 +1159,9 @@ socket_cleanup (select_record *me, selec
select_printf ("si %p si->thread %p", si, si ? si->thread : NULL);
if (si && si->thread)
{
- closesocket (si->exitsock);
+ _close (si->exitsockpair[0]);
WaitForSingleObject (si->thread, INFINITE);
+ _close (si->exitsockpair[1]);
CloseHandle (si->thread);
stuff->device_specific[FHDEVN(FH_SOCKET)] = NULL;
delete si;
--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe AT sourceware DOT cygnus DOT com
- Raw text -