[Twisted-Python] connectionLost never reached after calling loseConnection: stuck in CLOSE_WAIT forever

Stefano Debenedetti ste at demaledetti.net
Sat Oct 30 12:48:45 EDT 2010


Hello Jean-Paul, thanks for tracking this down, you rock!

I promise that once I have paid all my debts I'll buy one of
those posters of Exarkun to hang on my wall!

JP wrote:
> After a few runs, I managed to reproduce the problem.  I instrumented the reactor with some extra logging and test_producer.py with a manhole server.
> The sequence of events appears to be something like this:
> 
>  OneA has a producer of OneE
>  OneA has a consumer of OneB
>  At some point OneB gives up and tells OneA to stopProducing (loseConnection)
>  OneA.loseConnection stops the reactor from reading OneA and starts it writing
>  OneA.doWrite happens
>    it finds the send buffer empty
>    it finds a registered producer (OneE) and resumes it
>  OneE never produces any more bytes
>  OneE loses its connection at some point and unregisters itself from OneA
>  OneA takes note that it has no more producer, but does nothing about it
> 
> So the bug is likely that FileDescriptor.unregisterProducer doesn't do anything special when disconnecting=True.
> 
> You should be able to reproduce this very simply by setting up a transport-producer/consumer pair, calling loseConnection on the transport, then unregistering the producer.
> 
> This all sounds somewhat familiar, but I don't see an existing ticket for it, so maybe that's my imagination.
> 
> Jean-Paul


Following your directions, I attached a minimal example to a new ticket:

http://twistedmatrix.com/trac/ticket/4719
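
For readers without the ticket at hand, the sequence JP describes can be
modelled in a few lines of plain Python. This is not the ticket attachment
and not Twisted's code: ToyTransport, SilentProducer, run_scenario and the
close_on_unregister switch are all hypothetical names made up for
illustration, and the "fix" branch is only one guess at what
unregisterProducer could do when disconnecting is set.

```python
class ToyTransport:
    """Toy stand-in for a transport. NOT Twisted's FileDescriptor."""

    def __init__(self, close_on_unregister=False):
        self.buffer = b""
        self.producer = None
        self.disconnecting = False
        self.connected = True
        # Hypothetical switch: finish the close when the producer goes away.
        self.close_on_unregister = close_on_unregister

    def registerProducer(self, producer):
        self.producer = producer

    def loseConnection(self):
        # Graceful close: only mark the transport; doWrite finishes the job.
        self.disconnecting = True

    def doWrite(self):
        # Pretend the whole buffer was sent.
        self.buffer = b""
        if self.producer is not None:
            # Buffer is empty and a producer is registered: ask it for more.
            self.producer.resumeProducing()
        elif self.disconnecting:
            self.connectionLost()

    def unregisterProducer(self):
        self.producer = None
        # Without the switch, nothing else happens here (the suspected bug).
        if self.close_on_unregister and self.disconnecting and not self.buffer:
            self.connectionLost()

    def connectionLost(self):
        self.connected = False


class SilentProducer:
    """Producer that is resumed but never writes anything (like OneE)."""

    def __init__(self):
        self.resumed = 0

    def resumeProducing(self):
        self.resumed += 1


def run_scenario(close_on_unregister):
    """Replay the sequence; return True if the transport is still open."""
    transport = ToyTransport(close_on_unregister)
    transport.registerProducer(SilentProducer())
    transport.loseConnection()      # consumer gave up
    transport.doWrite()             # empty buffer, producer resumed
    transport.unregisterProducer()  # producer lost its own connection
    return transport.connected
```

In this model run_scenario(False) leaves the transport open with nothing
left to trigger connectionLost, which matches the stuck behaviour, while
run_scenario(True) closes it as soon as the producer is unregistered.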

Some additional info:

* the problem occurs only if more than 64KB of data is written to the
transport before its consumer calls stopProducing on it

* the problem occurs only if some time passes before its producer
unregisters itself from the transport

Thanks again for your help! :)
ste




