process hangs (was Re: [Twisted-Python] Re: [Twisted-commits] r11107 - Delete fundamentally broken test.)

Dave Peticolas dave at krondo.com
Thu Jul 29 21:39:14 MDT 2004


On Fri, 2004-07-23 at 07:07, Andrew Bennetts wrote:
> On Fri, Jul 23, 2004 at 09:38:56AM -0400, James Y Knight wrote:
> > Wait, wait, that causes *hangs*? That seems like a bad thing. It 
> > doesn't look like an obviously wrong thing to do to me. Do you know 
> > *why* it's hanging?
> 
> I'm not sure why it's hanging, and I'd be happy for someone to figure out
> why.  Ideally they'd fix the problem too, if there is one.
> 
> My suspicion is that the bug is in that test, not in Twisted, though.  The
> process_pausing.py script itself is far too ugly to have any confidence in.
> It tries to detect that writes to stdout block by looking at times, which is
> extremely fragile.  Worse, detecting that writing to stdout blocks doesn't
> necessarily prove anything anyway: the intention (presumably, the test has
> no comments) is apparently to test that pauseProducing on a transport will
> cause pipes from a child process to be unread and hence let the buffers
> fill.  But the child process could just as easily be finding that the writes
> are blocking because it's simply writing more frequently than the parent is
> reading, e.g. due to scheduling anomalies... 
> 
> I'm also not aware of any real world reports of problems with the process
> code hanging, despite the test being pretty prone to intermittent failure,
> which is also highly suggestive that the test is broken, not the code.

I have a somewhat annoying problem related to the process code,
though possibly not caused by it. I have a script that manages large
numbers of processes (sometimes hundreds, over time), and occasionally a
process exits without Twisted's process code ever getting the waitpid
info for it. Instead, waitpid fails with the ECHILD (no child processes)
system error. In that case, Twisted keeps trying to reap the process and
never figures out that the process is gone.

This is on a Red Hat 7.2 system using Python 2.3 and a newish version
of Twisted. I don't know why the process seems to get lost, but it
would be nice if Twisted would at least notice the ECHILD and signal
process termination (or lost, or something).
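
To make the request concrete, here is a minimal sketch of the kind of
handling I mean. It is not Twisted's actual reaping code: the function
name try_reap and the "running"/"exited"/"lost" labels are made up for
illustration, and it only shows a non-blocking waitpid that treats
ECHILD as a final state instead of retrying forever.

```python
import errno
import os


def try_reap(pid):
    """Poll a child once without blocking.

    Returns a (state, status) tuple, where state is one of the
    hypothetical labels "running", "exited" or "lost".
    """
    try:
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    except OSError as e:
        if e.errno == errno.ECHILD:
            # The kernel no longer knows this pid as our child, so
            # retrying can never succeed. Report it as lost.
            return ("lost", None)
        raise
    if reaped_pid == 0:
        # The child exists but has not changed state yet.
        return ("running", None)
    return ("exited", status)
```

A caller polling a list of children could then drop a pid from the list
on either "exited" or "lost", and report the two cases differently.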

Has anyone else experienced this problem?

thanks,
dave





