[Twisted-Python] connectionLost never reached after calling loseConnection: stuck in CLOSE_WAIT forever
ste at demaledetti.net
Sat Oct 30 12:48:45 EDT 2010
Hello Jean-Paul, thanks for tracking this down, you rock!
I promise that once I have paid all my debts I'll buy one of
those posters of Exarkun to hang on my wall!
> After a few runs, I managed to reproduce the problem. I instrumented the reactor with some extra logging and test_producer.py with a manhole server.
> The sequence of events appears to be something like this:
> OneA has a producer of OneE
> OneA has a consumer of OneB
> At some point OneB gives up and tells OneA to stopProducing (loseConnection)
> OneA.loseConnection stops the reactor from reading OneA and starts it writing
> OneA.doWrite happens
> it finds the send buffer empty
> it finds a registered producer (OneE) and resumes it
> OneE never produces any more bytes
> OneE loses its connection at some point and unregisters itself from OneA
> OneA takes note that it has no more producer, but does nothing about it
> So the bug is likely that FileDescriptor.unregisterProducer doesn't do anything special when disconnecting=True.
> You should be able to reproduce this very simply by setting up a transport-producer/consumer pair, calling loseConnection on the transport, then unregistering the producer.
> This all sounds somewhat familiar, but I don't see an existing ticket for it, so maybe that's my imagination.
Following your suggestion, I attached a minimal example to a new ticket:
Some additional info:
* the problem occurs only if more than 64KB of data is written to the
transport before its consumer calls stopProducing on it
* the problem occurs only if some time passes before its producer
unregisters itself from the transport
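
For readers following along, here is a rough toy model of the sequence of
events quoted above. It is not Twisted's code and it is not the example
attached to the ticket: ToyTransport, SilentProducer, run_scenario and the
fix_unregister flag are all hypothetical names made up for illustration, and
the "fix" is only one guess at what unregisterProducer could do when
disconnecting is set. The model ignores the 64KB and timing conditions listed
above and only shows why connectionLost is never reached.

```python
class ToyTransport:
    """Simplified stand-in for a transport; NOT Twisted's FileDescriptor."""

    def __init__(self, fix_unregister=False):
        self.buffer = b""
        self.producer = None
        self.disconnecting = False
        self.writing = False
        self.connection_lost_called = False
        # Hypothetical switch: enables the guessed fix in unregisterProducer.
        self.fix_unregister = fix_unregister

    def registerProducer(self, producer):
        self.producer = producer

    def loseConnection(self):
        # Stop reading and start writing, so pending data is flushed first.
        self.disconnecting = True
        self.writing = True

    def doWrite(self):
        if self.buffer:
            self.buffer = b""  # pretend everything was sent
        elif self.producer is not None:
            # Empty buffer, but a producer is registered: resume it and wait.
            self.producer.resumeProducing()
            self.writing = False
        elif self.disconnecting:
            self.connectionLost()

    def unregisterProducer(self):
        self.producer = None
        # Without the (assumed) fix, nothing else happens here, so a
        # disconnecting transport is never woken up again.
        if self.fix_unregister and self.disconnecting and not self.buffer:
            self.connectionLost()

    def connectionLost(self):
        self.writing = False
        self.connection_lost_called = True


class SilentProducer:
    """A producer that is resumed but never writes any more bytes."""

    def __init__(self):
        self.resumed = 0

    def resumeProducing(self):
        self.resumed += 1


def run_scenario(fix_unregister):
    transport = ToyTransport(fix_unregister=fix_unregister)
    producer = SilentProducer()
    transport.registerProducer(producer)
    transport.loseConnection()      # the consumer gives up
    transport.doWrite()             # empty buffer: the producer is resumed
    transport.unregisterProducer()  # the producer goes away later
    return transport.connection_lost_called


if __name__ == "__main__":
    print(run_scenario(False))  # False: the toy transport is stuck
    print(run_scenario(True))   # True: the guessed fix closes it
```

In the toy model the transport without the fix ends up with
disconnecting=True, no producer and nobody scheduled to write, which is
the state that matches the description above.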
Thanks again for your help! :)