Opened 6 years ago

Last modified 5 years ago

#3529 defect new

closing stdout in a child process on cygwin means that process doesn't receive bytes from stdin anymore. I think.

Reported by: zooko Owned by: zooko
Priority: normal Milestone:
Component: core Keywords:
Cc: exarkun Branch:
Author: Launchpad Bug:

Description

When someone runs bin/trial --reactor=select twisted.internet.test.test_process.ProcessTestsBuilder_PollReactor.test_childConnectionLost on cygwin, using the cygwin version of Python, either manually (which I've done on two cygwin installations), or from the twisted buildbot -- http://buildbot.twistedmatrix.com/builders/cygwin-py2.5-select -- then once the child process closes its fd #2, it subsequently receives no further bytes from its fd #1, even though the parent process has written some bytes to that pipe.

I can reproduce this by running the Twisted unit tests, like this:

step 1: get twisted:

svn co svn://svn.twistedmatrix.com/svn/Twisted/trunk twisted

step 2: cd into the top-level directory, and run trial:

cd twisted && python ./bin/trial --reactor=select  twisted.internet.test.test_process.ProcessTestsBuilder_PollReactor.test_childConnectionLost

On other platforms that cygwin (see the Twisted buildbot -- http://buildbot.twistedmatrix.com/supported?branch=trunk -- then this test exits quickly with success. On cygwin, using cygwin python.exe, this test instead enter an infinite loop as the child waits to receive further bytes on its stdin, and the parent waits for the child to respond to those bytes. (This doesn't happen on cygwin with the standard Win32 build of python.exe, of course, anymore than it happens on a Windows system without cygwin installed when using that executable.)

Attachments (4)

demo.py (1.3 KB) - added by exarkun 6 years ago.
attempt at minimal example
demo.2.py (1.5 KB) - added by exarkun 6 years ago.
second attempt to reproduce the problem
demo.3.py (1.6 KB) - added by exarkun 6 years ago.
attempt which includes execvpe
demo.4.py (1.6 KB) - added by exarkun 6 years ago.
switch to stdio calls for talking to stdin in the child process

Download all attachments as: .zip

Change History (24)

comment:1 Changed 6 years ago by zooko

This issue is blocking #2234.

comment:2 Changed 6 years ago by zooko

I've opened a ticket for this on the python issue tracker as well:

http://bugs.python.org/issue4295 # closing stdout in a child process on cygwin means that process doesn't receive bytes from stdin anymore. I think.

comment:3 Changed 6 years ago by zooko

I opened a ticket on the cygwin issue tracker:

http://sourceware.org/bugzilla/show_bug.cgi?id=7017 # closing stdout in a child python process means that process doesn't receive bytes from stdin

comment:4 Changed 6 years ago by exarkun

  • Cc exarkun added

comment:5 Changed 6 years ago by zooko

Corinna Vinschen of cygwin requests a smaller test case:

http://www.cygwin.com/ml/cygwin/2008-11/msg00166.html

comment:6 Changed 6 years ago by exarkun

Can you try the attached on cygwin and report its behavior?

Changed 6 years ago by exarkun

attempt at minimal example

comment:7 Changed 6 years ago by exarkun

  • Owner changed from glyph to zooko

comment:8 Changed 6 years ago by zooko

Unfortunately (?) its behavior is to exit cleanly after emitting:

$ /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo.py 
(400, 0)

Just to make sure I wasn't hallucinating, I ran the unit test again:

$ /usr/bin/python ./bin/trial --reactor=select  twisted.internet.test.test_process.ProcessTestsBuilder_PollReactor.test_childConnectionLost
twisted.internet.test.test_process
  ProcessTestsBuilder_PollReactor
    test_childConnectionLost ... 

Yep, it still hangs.

comment:9 Changed 6 years ago by exarkun

How about this one?

Changed 6 years ago by exarkun

second attempt to reproduce the problem

comment:10 Changed 6 years ago by zooko

$ /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo2.py                 (2992, 0)

Changed 6 years ago by exarkun

attempt which includes execvpe

comment:11 Changed 6 years ago by exarkun

Any luck with version 3?

comment:12 Changed 6 years ago by zooko

Too bad:

$ time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo3.py 
(224, 0)

real    0m0.941s
user    0m0.110s
sys     0m0.350s

buildslave@tahoe-winslave ~/playground/twisted
$ time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo3.py 
(516, 0)

real    0m0.741s
user    0m0.140s
sys     0m0.410s

buildslave@tahoe-winslave ~/playground/twisted
$ time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo3.py 
(2288, 0)

real    0m5.087s
user    0m0.130s
sys     0m0.420s

I added "time" and ran it several times to show you that demo3.py occasionally takes 5 or 6 seconds on this machine. I think demo2.py and demo.py did, too.

comment:13 Changed 6 years ago by exarkun

Hm. Not sure what the delays are about. Might be worth investigation too. Probably not related to this issue, though.

Attaching another version.

Changed 6 years ago by exarkun

switch to stdio calls for talking to stdin in the child process

comment:14 Changed 6 years ago by zooko

Probably totally useless facts:

$ for I in $(seq 0 100) ; do ( time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave                                            /Desktop/demo.py ) 2>&1 | grep ^real | cut -d'm' -f2- ; done | sort -g | tail
0.630s
0.640s
0.640s
0.640s
0.650s
0.650s
0.680s
0.680s
0.721s
0.721s

buildslave@tahoe-winslave ~/playground/twisted
$ for I in $(seq 0 100) ; do ( time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave                                            /Desktop/demo2.py ) 2>&1 | grep ^real | cut -d'm' -f2- ; done | sort -g | tail
0.630s
0.630s
0.640s
0.640s
0.640s
0.640s
0.660s
0.670s
0.670s
0.751s

buildslave@tahoe-winslave ~/playground/twisted
$ for I in $(seq 0 100) ; do ( time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave                                            /Desktop/demo3.py ) 2>&1 | grep ^real | cut -d'm' -f2- ; done | sort -g | tail
0.771s
0.771s
0.771s
0.781s
0.781s
0.781s
0.781s
0.801s
0.801s
0.981s

comment:15 Changed 6 years ago by zooko

$ time /usr/bin/python /cygdrive/c/Documents\ and\ Settings/buildslave/Desktop/demo4.py 
(2268, 0)

comment:16 Changed 6 years ago by zooko

#3398 was a duplicate of this.

comment:17 Changed 6 years ago by zooko

I've updated http://sourceware.org/bugzilla/show_bug.cgi?id=7017 to show that we tried and failed to make a smaller test case which exhibits the same problem. (And to re-iterate that the problem is reproducible using trial and the twisted unit tests, although I doubt that the cygwin maintainers are going to start digging into those in search of this bug.)

comment:18 Changed 6 years ago by zooko

I ran into a similar issue again in the tahoe unit tests (http://allmydata.org/buildbot/builders/cygwin/builds/1735 ), and boiled it down to this test code:

from twisted.internet import reactor
from twisted.internet import utils
import os

executable = "c://cygwin/bin/true"

x = [False]

def test():
    if not x[0]:
        print "_cb hasn't been called yet!?"

def subp():
    d = utils.getProcessOutputAndValue(executable, env=os.environ)
    print "started"
    def _cb(res):
        print "first cb"
        out, err, rc_or_sig = res
        if out != "":
            print "WHAT", out
        if rc_or_sig != 0:
            print "WHAT2", rc_or_sig
        x[0] = True
    d.addCallback(_cb)
    return d

reactor.callLater(0, subp)
reactor.callLater(1, test)
reactor.run()

I would expect this to print "started" and then "first cb", and this is what it does on Mac OS X, but on Cygwin it does this:

$ python ./mintestrunner.py
started
_cb hasn't been called yet!?

And then waits until you hit C-c, at which point it adds:

first cb

comment:19 Changed 6 years ago by exarkun

Unfortunately that's not much of a "boils down". Reproducing this without involving Twisted is probably the only useful next step.

comment:20 Changed 5 years ago by zooko

Daniel Stutzbach tried to reproduce this problem with just the Python subprocess module on Python 2.5 and couldn't reproduce it: http://bugs.python.org/issue4295

Perhaps it is Twisted-specific.

It has been sufficiently long since I last reproduced this issue that a useful next step would be to try reproducing it with the latest versions of Twisted and cygwin.

Unfortunately I do not currently have access to a cygwin machine.

Note: See TracTickets for help on using tickets.