Opened 7 years ago

Closed 7 years ago

#6796 defect closed wontfix (wontfix)

reactor.addReader(reader) should not expect reader to have a doWrite method (epollreactor)

Reported by: Grindaizer Owned by:
Priority: normal Milestone:
Component: core Keywords: reactor addreader filedescriptor epoll
Cc: Branch:
Author:

Description

According to documentation http://twistedmatrix.com/documents/13.1.0/api/twisted.internet.interfaces.IReactorFDSet.html

addReader method expect an IReadFileDescriptor.

But the epollreactor (for example) still call the "reader" for write event even if it's only added for reader

And in this case it will call the "doWrite" method of the "reader" object which will cause an attribute error exception.

ex

class WorkerConnection():
    #...

    def doRead(self):
        #some stuff


    def fileno(self):
        #...

    def connectionLost(self, reason):
        #...

    def logPrefix(self):
        return "WORKER-{0}: ".format(os.getpid())

reactor.addReader(WorkerConnection())
reactor.run()

for the code shown above in some case we will have the exception

builtins.AttributeError: 'WorkerConnection' object has no attribute 'doWrite'

Attachments (3)

dowrite_error.py (2.0 KB) - added by Grindaizer 7 years ago.
example of a doWrite called for a file descriptor only registered for reading
epollreactor.py (12.4 KB) - added by Grindaizer 7 years ago.
twisted.internet.epollreactor.py that logs fd being registered
epoll_multiprocess.py (3.5 KB) - added by Grindaizer 7 years ago.
example epoll in multiprocessing context

Download all attachments as: .zip

Change History (11)

comment:1 Changed 7 years ago by Grindaizer

tested with python 3.3.2 on twisted 13.1.0

comment:2 Changed 7 years ago by Jean-Paul Calderone

Resolution: worksforme
Status: newclosed

I turned the pseudo-code in the description into this runnable example:

from twisted.internet import epollreactor
epollreactor.install()

from twisted.internet import reactor

class WorkerConnection:
    def doRead(self):
        pass

    def fileno(self):
        return 1

    def connectionLost(self, reason):
        pass

    def logPrefix(self):
        return ""

from twisted.python.log import startLogging
from sys import stdout

startLogging(stdout)

reactor.addReader(WorkerConnection())
reactor.run()

I didn't observe any exceptions.

Please re-open with a complete example if you can reproduce the issue.

Changed 7 years ago by Grindaizer

Attachment: dowrite_error.py added

example of a doWrite called for a file descriptor only registered for reading

Changed 7 years ago by Grindaizer

Attachment: epollreactor.py added

twisted.internet.epollreactor.py that logs fd being registered

comment:3 Changed 7 years ago by Grindaizer

Cc: Grindaizer removed
Resolution: worksforme
Status: closedreopened

Hi again.

I put here attached a complete example that reproduce the problem.

The problem seems to appear in multiprocessing context and to really see what's happening I have added 2 logs into epollreactor.py that print the "fd" being registered or modified.
(the modified epollreactor.py attached here too).

This is the output.

(sanbox3.3) nacim@n4b5:~/Projects/twisted_web_application/app$ python dowrite_error.py 
2013-10-23 10:25:21+0200 [-] Log opened.
2013-10-23 10:25:21+0200 [-] [12165] Start Process Worker
2013-10-23 10:25:21+0200 [-] [12165] Register FD=7 flags=1  (primary={4: 1}, other={}) -- FileDesciptor Class=<class '__main__.WorkerConnection'>
2013-10-23 10:25:21+0200 [-] [12166] Start Process Worker
2013-10-23 10:25:21+0200 [-] [12166] Register FD=10 flags=1  (primary={4: 1}, other={}) -- FileDesciptor Class=<class '__main__.WorkerConnection'>
2013-10-23 10:25:21+0200 [-] [12165] Register FD=10 flags=1  (primary={4: 1, 7: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 10:25:21+0200 [-] [12165] Modify   FD=10 flags=5  (primary={}, other={10: 1, 4: 1, 7: 1}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 10:25:21+0200 [WORKER-12166: ] [12166] XXX this should not be called !! XXX
2013-10-23 10:25:21+0200 [Uninitialized] [12165] Register FD=10 flags=1  (primary={4: 1, 7: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 10:25:21+0200 [-] [12165] create proto (fd=10)


^C2013-10-23 10:25:22+0200 [-] Received SIGINT, shutting down.
2013-10-23 10:25:22+0200 [-] Received SIGINT, shutting down.
2013-10-23 10:25:22+0200 [-] Received SIGINT, shutting down.
2013-10-23 10:25:22+0200 [-] Main loop terminated.
2013-10-23 10:25:22+0200 [-] Main loop terminated.
2013-10-23 10:25:22+0200 [-] Main loop terminated.

And this is what I've observed:

  • process MAIN fork 2 subprocess let's say A and B.
  • B has a fd=10 registred only for reader using reactor.addReader and thus the FileDescriptor attached does not need to have doWrite method).
  • if A starts a connection to do something, and it happens that the OS gives this connection the same fd number (=10) than the one registred in B.
  • A will register then fd=10 for READ(IN) and WRITE(OUT) event.

==> And now for a reason that still opaque for me, when process A write something to fd(=10 here), a write event appears in process B for that same fd(=10) even though in process B this fd was only registered for READ events.

comment:4 Changed 7 years ago by Grindaizer

Cc: Grindaizer added

comment:5 Changed 7 years ago by Grindaizer

Keywords: filedescriptor epoll added; diledescriptor removed

comment:6 Changed 7 years ago by Jean-Paul Calderone

Thanks. I'm still not able to produce the problem - either with the attached epollreactor or with epollreactor from trunk@HEAD. Here's the output I get:

exarkun@top:/tmp$ python dowrite_error.py 
2013-10-23 07:26:10-0400 [-] Log opened.
2013-10-23 07:26:10-0400 [-] [31736] Register FD=10 flags=1  (primary={4: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.posixbase._SIGCHLDWaker'>
2013-10-23 07:26:10-0400 [-] [31737] Start Process Worker
2013-10-23 07:26:10-0400 [-] [31737] Register FD=7 flags=1  (primary={4: 1}, other={}) -- FileDesciptor Class=<class __main__.WorkerConnection at 0x2c5f738>
2013-10-23 07:26:10-0400 [-] [31737] Register FD=10 flags=1  (primary={4: 1, 7: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.posixbase._SIGCHLDWaker'>
2013-10-23 07:26:10-0400 [-] [31738] Start Process Worker
2013-10-23 07:26:10-0400 [-] [31738] Register FD=9 flags=1  (primary={4: 1}, other={}) -- FileDesciptor Class=<class __main__.WorkerConnection at 0x2c5f738>
2013-10-23 07:26:10-0400 [-] [31738] Register FD=11 flags=1  (primary={9: 1, 4: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.posixbase._SIGCHLDWaker'>
2013-10-23 07:26:10-0400 [-] [31737] Register FD=9 flags=1  (primary={10: 1, 4: 1, 7: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 07:26:10-0400 [-] [31737] Modify   FD=9 flags=5  (primary={}, other={9: 1, 10: 1, 4: 1, 7: 1}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 07:26:10-0400 [Uninitialized] [31737] Register FD=9 flags=1  (primary={10: 1, 4: 1, 7: 1}, other={}) -- FileDesciptor Class=<class 'twisted.internet.tcp.Client'>
2013-10-23 07:26:10-0400 [-] [31737] create proto (fd=9)
^C2013-10-23 07:26:12-0400 [-] Received SIGINT, shutting down.
2013-10-23 07:26:12-0400 [-] Main loop terminated.
2013-10-23 07:26:12-0400 [-] Received SIGINT, shutting down.
2013-10-23 07:26:12-0400 [-] Received SIGINT, shutting down.
2013-10-23 07:26:12-0400 [-] Main loop terminated.
2013-10-23 07:26:12-0400 [-] Main loop terminated.
exarkun@top:/tmp$ uname -a
Linux top 3.2.0-4-amd64 #1 SMP Debian 3.2.41-2 x86_64 GNU/Linux
exarkun@top:/tmp$ 

comment:7 Changed 7 years ago by Grindaizer

Cc: Grindaizer removed

Hi,

I admit that it's a bite hard to reproduce even locally for me. The fd number that you get from OS matter, so adding any connection can make the test script to fail.

So can you re run the test script without having the "twisted.internet.posixbase._SIGCHLDWaker", and can you precise your python/twisted version. 13.1.0.

Anyway, I wanted to check if this in really twisted specific so I have written another test that do almost the same thing as my attached example above, but without using twisted. And I still have the problem.

test script without twisted is attached here too, and this is the output

(sanbox3.3) nacim@n4b5:~/Projects/twisted_web_application/app$ python epoll_multiprocess.py 
[ROOT] Register fd=4 flags=5 -- fd <socket.socket object, fd=4, family=1, type=2049, proto=0>
[WORKER 1] Register fd=5 flags=1 -- fd <socket.socket object, fd=5, family=1, type=2049, proto=0>
[ROOT] Register fd=7 flags=5 -- fd <socket.socket object, fd=7, family=1, type=2049, proto=0>
[WORKER 2] XXX WORKER 2 READ ONLY FD=8 XXX
[WORKER 2] Register fd=8 flags=1 -- fd <socket.socket object, fd=8, family=1, type=2049, proto=0>
[WORKER 1] XXX WORKER 1 READWRITE FD=8 XXX
[WORKER 1] Register fd=8 flags=5 -- fd <socket.socket object, fd=8, family=2, type=2049, proto=0>
[WORKER 1] XXX WORKER 1 READWRITE FD=9 XXX
[WORKER 1] Register fd=9 flags=5 -- fd <socket.socket object, fd=9, family=2, type=2049, proto=0>
[WORKER 1] XXX WORKER 1 READWRITE FD=8 XXX
[WORKER 1] Register fd=8 flags=5 -- fd <socket.socket object, fd=8, family=2, type=2049, proto=0>
[WORKER 2] XXX this should not happen (events=[(4, 4), (7, 4), (8, 4)])XXX

In my test, I use the epoll object that was "cloned" from the ROOT process. If I instantiate a new epoll object in each WORKER I won't have the problem.

This make me think that may be the epoll object keep sharing some state even after forking !!

If so this will make the epoll reactor in twisted hard to use in multiprocessing context as twisted instantiate the reactor (and thus the epoll object) only once, making the the forked process use the "cloned" one.!

Changed 7 years ago by Grindaizer

Attachment: epoll_multiprocess.py added

example epoll in multiprocessing context

comment:8 Changed 7 years ago by therve

Resolution: wontfix
Status: reopenedclosed

It's definitely a multiprocessing issue. Indeed, the epoll object is inherited after fork, because of multiprocessing (you share data by default). You can work around the issue maybe a bit by overriding what's in sys.modules['twisted.internet.reactor']. The only way to work around it in Twisted itself would be to not have a global reactor object, but it's not the right place to discuss it, I'd say. It's also a fair amount of work.

In the mean time, I'd suggest rewriting your code to use spawnProcess instead of multiprocessing.

Note: See TracTickets for help on using tickets.