Opened 9 years ago
Closed 9 years ago
#4352 defect closed fixed (fixed)
twisted.test.test_process.ProcessTestCase.testStdio and pollreactor don't get along
Reported by: | Jean-Paul Calderone | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Keywords: | |
Cc: | jknight | Branch: |
branches/poll-stdio-halfclose-4352
branch-diff, diff-cov, branch-cov, buildbot |
Author: | exarkun |
Description
If the helper program used by testStdio
is changed to use pollreactor:
Index: twisted/test/process_twisted.py =================================================================== --- twisted/test/process_twisted.py (revision 28664) +++ twisted/test/process_twisted.py (working copy) @@ -10,6 +10,8 @@ sys.path.insert(0, os.curdir) ### end of preamble +from twisted.internet import pollreactor +pollreactor.install() from twisted.python import log from zope.interface import implements
then the test fails:
=============================================================================== [FAIL]: twisted.test.test_process.ProcessTestCase.testStdio Traceback (most recent call last): File "/home/exarkun/Projects/Twisted/trunk/twisted/test/test_process.py", line 501, in processEnded "%s\n" % (p.outF.getvalue(), p.errF.getvalue())) twisted.trial.unittest.FailTest: Output follows: Error message from process_twisted follows: 2010-03-05 22:46:52-0500 [-] Log opened. 2010-03-05 22:46:52-0500 [-] connection made 2010-03-05 22:46:52-0500 [-] connectionLost [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion. 2010-03-05 22:46:52-0500 [-] ] 2010-03-05 22:46:52-0500 [-] Main loop terminated. not equal: a = '' b = 'hello, worldabc123' -------------------------------------------------------------------------------
The child receives the bytes written to it, but apparently it shuts down the reactor too quickly and the parent process never receives the results.
I'm not sure if this is just a test bug or if it's actually a misbehavior of the reactor. The failing test is blocking #2234, though, which aims to make poll the default reactor on some platforms.
Change History (15)
comment:1 Changed 9 years ago by
Cc: | jknight added |
---|
comment:2 Changed 9 years ago by
Oops. I said Connection.doRead
takes care of turning an empty read into a ConnectionDone
. This is stdio, so actually it's ProcessReader.doRead
, which actually just delegates to twisted.internet.fdesc.readFromFD
. However, that does the same thing as Connection.doRead
, so the conclusion remains the same.
I also wonder why POLLHUP
comes back from the poll(2)
call. The man page says (output only)
for this flag. I suppose pipes are special, as usual.
comment:3 Changed 9 years ago by
Author: | → exarkun |
---|---|
Branch: | → branches/poll-stdio-halfclose-4352 |
(In [28665]) Branching to 'poll-stdio-halfclose-4352'
comment:5 Changed 9 years ago by
comment:6 Changed 9 years ago by
pollreactor, epollreactor, and gtk2reactor all had variations of this bug. I fixed them all in similar ways. Unfortunately there's a lot of duplication. :/
I expect to close #1742 when this branch is merged, too.
comment:7 Changed 9 years ago by
I'm kinda wondering if it's actually reasonable to have generic behavior in *poll reactors for all fd types, given how differently different object types behave.
comment:8 Changed 9 years ago by
That's an interesting line of consideration. It seems to make a lot of sense. If we moved this logic into the IReadDescriptor
and IWriteDescriptor
implementations, then the pipe-based implementations could be responsible for knowing about this quirk.
One challenge with making this change is that IReadDescriptor
and IWriteDescriptor
aren't expressive enough to allow the reactor to tell them about everything going on at the poll(2)
level. I'm also not exactly sure how we would widen this interface, except in the way which exactly accommodates this particular case. Hmm. Although since it works with SelectReactor
, maybe that's not really true. If we just called doRead
when poll told us POLLHUP
then maybe everything would work out okay...
On the other hand, it's clearly a bug in all of the affected reactor implementations that they were passing inRead = False
to _disconnectSelectable
in this case. So that part of the fix should remain.
comment:9 Changed 9 years ago by
Well, all the tests still pass with this change made to the branch:
-
twisted/internet/epollreactor.py
205 205 if event & _POLL_DISCONNECTED and not (event & _epoll.IN): 206 206 if fd in self._reads: 207 207 inRead = True 208 why = CONNECTION_DONE208 why = selectable.doRead() 209 209 else: 210 210 why = CONNECTION_LOST 211 211 else: -
twisted/internet/pollreactor.py
171 171 inRead = False 172 172 if event & POLL_DISCONNECTED and not (event & POLLIN): 173 173 if fd in self._reads: 174 why = main.CONNECTION_DONE175 174 inRead = True 175 why = selectable.doRead() 176 176 else: 177 177 why = main.CONNECTION_LOST 178 178 else: -
twisted/internet/gtk2reactor.py
278 278 inRead = False 279 279 if condition & POLL_DISCONNECTED and not (condition & gobject.IO_IN): 280 280 if source in self._reads: 281 why = main.CONNECTION_DONE282 281 inRead = True 282 why = source.doRead() 283 283 else: 284 284 why = main.CONNECTION_LOST 285 285 else:
but it's not much of a simplification. I might prefer to wait to make the change until there's a concrete benefit to doing so. I dunno.
comment:10 Changed 9 years ago by
Keywords: | review added |
---|---|
Owner: | Glyph deleted |
comment:11 Changed 9 years ago by
Keywords: | review removed |
---|---|
Owner: | set to Jean-Paul Calderone |
- Pyflakes:
twisted/test/stdio_test_halfclose.py:17: 'ConnectionDone' imported but unused
- the test file is also missing a couple of blank lines.
- one thing that the gtk2reactor code does better than the others is only set inRead to True, and keep the default value otherwise. It would be nice to change poll/epoll to do the same.
Otherwise it looks good. The duplication is unfortunate, but I don't want to think about that now. Please merge!
comment:12 Changed 9 years ago by
comment:13 Changed 9 years ago by
Thanks for the review. I fixed the first two points. For the third, I think this may actually be a bug in gtk2reactor. It's hard to tell, since no tests exercise it: but if a selectable is both readable and writeable, and the write signals an error, then probably inRead
should be set back to False
, because a write error must mean the connection is completely gone, not half-closed.
I'm going to leave the relevant code alone for now, but file a ticket for refactoring this duplication and determining whether this is really a bug or if the refactored implementation can do what gtk2reactor currently does.
comment:14 Changed 9 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:15 Changed 8 years ago by
Owner: | Jean-Paul Calderone deleted |
---|
So what the test does is basically this:
Meanwhile the child:
readConnectionLost
it callsloseConnection
on its transportconnectionLost
it callsreactor.stop()
When the child process uses select, this works because when stdin is closed,
SelectReactor._doReadOrWrite
in the child process knows that it is monitoring the file descriptor for read events. It passesTrue
forinRead
to_disconnectSelectable
. This, along with areason
ofConnectionDone
(which tcp.py correctly returns fromConnection.doRead
when it discovers a readable socket with no more bytes to read) results inreadConnectionLost
being called on the protocol, rather than merelyconnectionLost
.When the child process uses poll, this fails because when stdin is closed, it has only
POLLHUP
set andPollReactor._doReadOrWrite
doesn't pay attention to anything else when deciding whether to setinRead
toTrue
orFalse
. So it passesFalse
to_disconnectSelectable
. Plus, it treatsPOLLHUP
as an an error condition and usesConnectionLost
as thereason
, rather thanConnectionDone
. Each of these behaviors by itself is sufficient to cause_disconnectSelectable
to callconnectionLost
rather thanreadConnectionLost
. This results in the reactor being stopped earlier, presumably leading to the bytes being left in the Twisted send buffer rather than delivered to the parent.So this is a real pollreactor bug - stdio halfclose is not properly supported (#1742). I think it is not a test bug, because
transport.write
is supposed to be reliable (as much as possible given by the underlying transport), even if you calltransport.loseConnection
before the I/O has actually taken place.