Ticket #733 (assigned defect )

Opened 4 years ago

Last modified 1 month ago

twisted's SIGCHLD handler breaks popen.

Reported by: jknight Assigned to: glyph (accepted)
Type: defect Priority: high
Milestone: Component: core
Keywords: Cc: glyph, exarkun, spiv, itamarst, jknight, titty, thijs
Branch: Author:
Launchpad Bug:

Description (last modified by exarkun)

Highest priority is for regressions and reviews.

Attachments

Change History

  2004-09-30 06:57:33+00:00 changed by jknight

This is a simple way to demonstrate:
>>> from twisted.internet import reactor
>>> reactor._handleSignals()
>>> f=os.popen("sleep 5; echo 'Foo'"); f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 4] Interrupted system call
The solution is easy: our SIGCHLD handler needs to be given a SA_RESTART flag. This will cause it to
not be handled until after the read call returns, as python does not allow the python sighandler code to
be run in the interrupt context. But that's okay.
There's only one minor problem: the python signal() wrapper doesn't provide the flags parameter that's
necessary in order to specify SA_RESTART. So, to fix this, someone should write a C extension module
that does something like this:
void add_signalflags(int sig, int moreflags)
{
  int res;
  struct sigaction act;
  res = sigaction(sig, NULL, &act);
  if (res < 0)
    RAISE ERROR
  act.sa_flags |= moreflags;
  res = sigaction(sig, &act, NULL);
  if (res < 0)
    RAISE ERROR
}

  2004-09-30 07:18:51+00:00 changed by glyph

OK, no useful feedback, but AWESOME for figuring this out, james.  Thanks.

  2004-09-30 07:19:32+00:00 changed by glyph

OK wait no, I changed my mind.  Will passing this flag still (properly)
interrupt select()?

  2004-09-30 07:42:52+00:00 changed by jknight

According to my documentation it will.
     If a signal is caught during the system calls listed below, the call may
     be forced to terminate with the error EINTR, the call may return with a
     data transfer shorter than requested, or the call may be restarted.
     Restart of pending calls is requested by setting the SA_RESTART bit in
     sa_flags.  The affected system calls include open(2), read(2), write(2),
     sendto(2), recvfrom(2), sendmsg(2) and recvmsg(2) on a communications
     channel or a slow device (such as a terminal, but not a regular file) and
     during a wait(2) or ioctl(2).  However, calls that have already committed
     are not restarted, but instead return a partial success (for example, a
     short read count).
May want to make *sure* by testing, though.

  2004-10-26 22:18:12+00:00 changed by itamarst

On Redhat 7.3 I had a hard time reproducing this, but:
>>> signal.alarm(1); f=os.popen("sleep 5; echo 'Foo'"); f.read()
reproduced it some of the time.

  2006-07-11 12:46:05+00:00 changed by ColinAlston

  • priority changed from high to highest
  • component set to core
  • milestone set to Twisted-2.5

Reproduced thanks to Gentoo Emerge and its crappy signaling

avail = os.popen('emerge -p --nocolor --nospinner %s | grep %s' % (pack, pack)) k = avail.read()

Something like that, put it anywhere you please, and you will intermitantly generate this bug on Twisted 2.0.1

  2006-07-11 13:57:51+00:00 changed by spiv

moshez made an interesting suggestion on IRC -- I wonder if we can implement the missing add_signalflags function with ctypes?

  2006-07-16 02:25:08+00:00 changed by exarkun

  • priority changed from highest to high
  • description deleted
  • milestone deleted

  2006-09-14 17:34:06+00:00 changed by exarkun

See also #791 and #1997

  2006-12-08 16:51:13+00:00 changed by titty

  • cc changed from glyph, exarkun, spiv, itamarst, jknight to glyph, exarkun, spiv, itamarst, jknight, titty

Feel free to use: http://systemexit.de/repo/sigalter It's a c module, which provides a set_SA_RESTART(SIGNUM) method.

Here's the code (in case the above server is unreachable):

#include <Python.h>
#include <signal.h>
static PyObject *
set_SA_RESTART (void *self, PyObject *args)
{
        struct sigaction sa;
        int signum;
        if (!PyArg_ParseTuple(args, "i:set_SA_RESTART", &signum)) {
                return NULL;
        }
        if (sigaction(signum, NULL, &sa)==-1) {
                goto error;
        }
        if (!(sa.sa_flags & SA_RESTART)) {
                sa.sa_flags |= SA_RESTART;
                if (sigaction(signum, &sa, NULL)==-1) {
                        goto error;
                }
        }
        return Py_BuildValue("");
error:
        PyErr_SetFromErrno(PyExc_OSError);
        return NULL;
}
static PyMethodDef module_function[] = {
        {"set_SA_RESTART", (PyCFunction)set_SA_RESTART, METH_VARARGS,
         "set_SA_RESTART(SIGNUM):\n   sets SA_RESTART flag on the signalhandler for signal SIGNUM"},
        { NULL,    NULL }
};
static char module_doc[] = "provide funtions to alter signal handling";
DL_EXPORT(void) initsigalter(void)
{
        Py_InitModule3("sigalter", module_function, module_doc);
}

  2007-01-21 14:53:05+00:00 changed by itamarst

  • owner changed from itamarst to jknight

  2007-10-23 22:44:37+00:00 changed by titty

the ctypes version (for linux):

import ctypes
import signal
libc = ctypes.CDLL("libc.so.6")
libc.siginterrupt(signal.SIGCHLD, 0)

  2008-02-01 17:07:06+00:00 changed by exarkun

  • branch deleted
  • author deleted

#791, #1997, and #2535 have significant overlap with this. Fixing any one of these (except for #2535, perhaps) almost certainly means fixing all of them.

The correct fix for most of them is:

  • Write a SIGCHLD handler in C. All it needs to do is write to an fd.
  • Install the SIGCHLD handler with the SA_RESTART flag. This makes it run when a child exits, but doesn't cause EINTR.
  • Include the fd in the reactor so that whenever it is written to, the reactor will wake up and reap dead children.

With this solution:

  • No more EINTR
  • No more race condition inside ceval.c

However, a custom SIGCHLD handler is still required, so #2535 isn't resolved.

follow-up: ↓ 16   2008-03-25 09:44:23+00:00 changed by titty

http://docs.python.org/dev/library/signal.html:

python 2.6 will have signal.siginterrupt and signal.set_wakeup_fd.

  2008-05-27 20:36:39+00:00 changed by glyph

  • owner changed from jknight to glyph
  • status changed from new to assigned

in reply to: ↑ 14   2008-09-12 00:20:03+00:00 changed by thijs

  • cc changed from glyph, exarkun, spiv, itamarst, jknight, titty to glyph, exarkun, spiv, itamarst, jknight, titty, thijs
  • launchpad_bug deleted

Replying to titty:

http://docs.python.org/dev/library/signal.html: python 2.6 will have signal.siginterrupt and signal.set_wakeup_fd.

Now a Python 2.6 buildslave available http://buildbot.twistedmatrix.com/builders/ubuntu64-python2.6-select/

Note: See TracTickets for help on using tickets.