[Twisted-Python] help with ssl timeout and not reconnecting client factory

Andrea Arcangeli andrea at cpushare.com
Thu Mar 17 07:40:56 EST 2005


From a client I'm getting this error:

2005/03/16 10:42 CET [cpushare_protocol,client] 'limit sell'
2005/03/17 06:31 CET [cpushare_protocol,client] Traceback (most recent call last):
          File "/usr/lib64/python2.3/site-packages/twisted/python/log.py", line 65, in callWithLogger
            callWithContext({"system": lp}, func, *args, **kw)
          File "/usr/lib64/python2.3/site-packages/twisted/python/log.py", line 52, in callWithContext
            return context.call({ILogContext: newCtx}, func, *args, **kw)
          File "/usr/lib64/python2.3/site-packages/twisted/python/context.py", line 43, in callWithContext
            return func(*args,**kw)
          File "/usr/lib64/python2.3/site-packages/twisted/internet/pollreactor.py", line 160, in _doReadOrWrite
            why = selectable.doRead()
        --- <exception caught here> ---
          File "/usr/lib64/python2.3/site-packages/twisted/internet/tcp.py", line 98, in doRead
            return Connection.doRead(self)
          File "/usr/lib64/python2.3/site-packages/twisted/internet/tcp.py", line 239, in doRead
            data = self.socket.recv(self.bufferSize)
        OpenSSL.SSL.SysCallError: (110, 'Connection timed out')
        
2005/03/17 06:31 CET [cpushare_protocol,client] <twisted.internet.ssl.Connector instance at 0x2aaaac28d950> will retry in 2 seconds
2005/03/17 06:31 CET [cpushare_protocol,client] Stopping factory <cpushare.proto.cpushare_factory instance at 0x2aaaad414290>
2005/03/17 06:31 CET [-] Starting factory <cpushare.proto.cpushare_factory instance at 0x2aaaad414290>
2005/03/17 06:31 CET [-] Connection failed. Reason: [Failure instance: Traceback: twisted.internet.error.TimeoutError, User timeout caused connection failure.
2005/03/17 06:31 CET [-] ]
2005/03/17 06:31 CET [-] Stopping factory <cpushare.proto.cpushare_factory instance at 0x2aaaad414290>

This is a reconnecting client factory; the Python version is 2.3.4 and the
Twisted version is 1.3.0a. The socket should sit in an idle state. No
communication will happen over that socket (the app is under development), but
it should not time out unless the connection to the server ends and the
keepalive events trigger a disconnect (I enabled keepalive at the TCP level).
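
For context, on Linux a connection whose keepalive probes go unanswered is
reported to the application as ETIMEDOUT (errno 110), which is the same code
shown in the SysCallError above. Below is a minimal sketch of enabling
keepalive on a plain socket with the standard library. The function name
enable_keepalive and the timer values are illustrative assumptions, not taken
from the cpushare source, and the three TCP_KEEP* options are Linux-specific,
so they are applied only where the platform defines them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive and, where the platform allows, tighten its timers.

    The idle/interval/probes defaults are illustrative, not from the post.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options; skip any that this platform does not define.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With only SO_KEEPALIVE set, the system-wide timers apply. The usual Linux
defaults are 7200 s of idle time, then 9 probes 75 s apart, so a dead peer is
noticed only after more than two hours.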

Even if it times out, it must try to reconnect immediately, whereas it seems
to hang after the "Stopping factory".


Earlier, when I got the connection timed out event (for no apparent good
reason), at least it immediately tried to reconnect:

2005/03/14 05:30 CET [cpushare_protocol,client] 'limit sell'
2005/03/14 17:59 CET [cpushare_protocol,client] Traceback (most recent call last):
          File "/usr/lib64/python2.3/site-packages/twisted/python/log.py", line 65, in callWithLogger
            callWithContext({"system": lp}, func, *args, **kw)
          File "/usr/lib64/python2.3/site-packages/twisted/python/log.py", line 52, in callWithContext
            return context.call({ILogContext: newCtx}, func, *args, **kw)
          File "/usr/lib64/python2.3/site-packages/twisted/python/context.py", line 43, in callWithContext
            return func(*args,**kw)
          File "/usr/lib64/python2.3/site-packages/twisted/internet/pollreactor.py", line 160, in _doReadOrWrite
            why = selectable.doRead()
        --- <exception caught here> ---
          File "/usr/lib64/python2.3/site-packages/twisted/internet/tcp.py", line 98, in doRead
            return Connection.doRead(self)
          File "/usr/lib64/python2.3/site-packages/twisted/internet/tcp.py", line 239, in doRead
            data = self.socket.recv(self.bufferSize)
        OpenSSL.SSL.SysCallError: (110, 'Connection timed out')
        
2005/03/14 17:59 CET [cpushare_protocol,client] <twisted.internet.ssl.Connector instance at 0x2aaaac28d950> will retry in 2 seconds
2005/03/14 17:59 CET [cpushare_protocol,client] Stopping factory <cpushare.proto.cpushare_factory instance at 0x2aaaad414290>
2005/03/14 17:59 CET [-] Starting factory <cpushare.proto.cpushare_factory instance at 0x2aaaad414290>
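
The "will retry in 2 seconds" message is consistent with an exponential
backoff that starts at 1 second and multiplies by e on each retry. The sketch
below reproduces that arithmetic without jitter. The initial delay and factor
are assumptions based on later Twisted releases and may not match 1.3.0a.
The 600 s cap is the maxDelay override used by cpushare_factory, and
retry_delays is a hypothetical helper, not a Twisted function.

```python
import math

# Assumed defaults from later Twisted releases; 1.3.0a may differ.
INITIAL_DELAY = 1.0
FACTOR = math.e
MAX_DELAY = 600  # the maxDelay override from cpushare_factory


def retry_delays(attempts, initial=INITIAL_DELAY, factor=FACTOR,
                 max_delay=MAX_DELAY):
    """Return the nominal delay before each consecutive retry, ignoring jitter."""
    delays = []
    delay = initial
    for _ in range(attempts):
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, max_delay)
        delays.append(delay)
    return delays


delays = retry_delays(8)
for attempt, delay in enumerate(delays, start=1):
    print("retry %d after about %d s" % (attempt, delay))
```

Under these assumptions the first retry comes after roughly 2.7 s, and the
600 s cap is reached on the seventh consecutive failure. Calling resetDelay()
after a successful connection starts the sequence over.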


So my first priority is to understand why it stopped trying to reconnect (which
is the major bug), and the second priority is to understand why it was timing
out in the first place. (I can't exclude that there was a temporary network
disruption that caused the keepalive to trigger the disconnect.)

Could this be a bug in 1.3.0a? I expect the client will mostly be run with
1.3.0a; only on the server side do I use SVN + pending fixes.

This is the reconnecting code:

class cpushare_factory(ReconnectingClientFactory):
	maxDelay = 600 # limit the maximum delay to 10 min

	protocol = cpushare_protocol

	def buildProtocol(self, addr):
		self.resetDelay()
		protocol = self.protocol()
		assert not hasattr(protocol, 'factory')
		protocol.factory = self
		return protocol

	def clientConnectionFailed(self, connector, reason):
		print 'Connection failed. Reason:', reason
		ReconnectingClientFactory.clientConnectionFailed(self, connector, reason)

	def connectionMade(self):
		self.transport.setTcpKeepAlive(1)

Is the above correct? It works fine when the connection failure reason is
"ConnectionRefusedError" instead of TimeoutError.


What else should I do to prevent this error from leaving the factory stopped?

2005/03/17 06:31 CET [-] Connection failed. Reason: [Failure instance: Traceback: twisted.internet.error.TimeoutError, User timeout caused connection failure.

Where does the "twisted.internet.error.TimeoutError" come from?


The full source is LGPL and downloadable here:

	https://www.cpushare.com/downloads/cpushare-0.11.tar.bz2

The server-side logs are not interesting at all: all I get is a
connectionLost event without any apparent error (the server side is SVN trunk).

Any help is greatly appreciated, thanks!



