Opened 11 years ago

Closed 11 years ago

#4846 defect closed fixed (fixed)

In ReconnectingClientFactory calling stopTrying will sometimes trigger a retry on clientConnectionFailed

Reported by: rlotun Owned by:
Priority: normal Milestone:
Component: core Keywords:
Cc: Branch: branches/rcf-retry-4846
branch-diff, diff-cov, branch-cov, buildbot
Author: rlotun, exarkun

Description

In our buildbot we noticed the following traceback when running our unittests:

  factory.stopTrying()
      File "/usr/lib/python2.6/dist-packages/twisted/internet/protocol.py", line 394, in stopTrying
    self.connector.stopConnecting()
      File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1031, in stopConnecting
    self.transport.failIfNotConnected(error.UserError())
      File "/usr/lib/python2.6/dist-packages/twisted/internet/tcp.py", line 582, in failIfNotConnected
    self.connector.connectionFailed(failure.Failure(err))
      File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1051, in connectionFailed
    self.factory.clientConnectionFailed(self, reason)
      File "/home/ubuntu/buildbot/slave-ec2-XX-XXX-XX-XX.compute-1.amazonaws.com/rlotun-api-tdweb-removal/build/tdapi/redis.py", line 32, in clientConnectionFailed
    RedisClientFactory.clientConnectionFailed(self, connector, reason)
      File "/usr/lib/python2.6/dist-packages/twisted/internet/protocol.py", line 334, in clientConnectionFailed
    self.retry()
      File "/usr/lib/python2.6/dist-packages/twisted/internet/protocol.py", line 379, in retry
    self._callID = self.clock.callLater(self.delay, reconnector)

As part of our tearDown we were calling stopTrying on a ReconnectingClientFactory. After examining the code below it seems that if a connector is active stopConnecting will be called, which would trigger a clientConnectionFailed which would then trigger a retry while continueTrying is still set to 1. A simple fix of moving the self.continueTrying = 0 line up before calling stopConnecting seems to do the trick.

Offending code block:

    def stopTrying(self):
        """
        Put a stop to any attempt to reconnect in progress.
        """
        # ??? Is this function really stopFactory?
        if self._callID:
            self._callID.cancel()
            self._callID = None
        if self.connector:
            # Hopefully this doesn't just make clientConnectionFailed
            # retry again.
            try:
                self.connector.stopConnecting()
            except error.NotConnectingError:
                pass
        self.continueTrying = 0

Attached is a patch which includes a test to reproduce the issue.

Attachments (1)

rcf_stoptrying.diff (2.3 KB) - added by rlotun 11 years ago.
fix and unit test

Download all attachments as: .zip

Change History (12)

Changed 11 years ago by rlotun

Attachment: rcf_stoptrying.diff added

fix and unit test

comment:1 Changed 11 years ago by Jean-Paul Calderone

Owner: set to Jean-Paul Calderone
Status: newassigned

comment:2 Changed 11 years ago by Jean-Paul Calderone

Keywords: review removed

Thanks! This one was tricky :/ The fix looks right, but here are some comments mainly about the tests:

  1. There should be a news fragment.
  2. Copyright date on modified files should be updated to include 2011.
  3. Methods should be separated by each other by two blank lines
  4. attempted_retry should be attemptedRetry to follow the variable naming convention
  5. assertEquals(x, 0) is better than assertTrue(x == 0) since it will display the value of x if the assertion fails
  6. The test could avoid overriding any ReconnectingClientFactory methods if it overrode connect on the fake connector instead. This would also ensure that the connect method isn't called by any other part of ReconnectingClientFactory.

These are mostly simple, so I'll implement them. The last one is pretty easy too (now that I've actually managed to understand what's going on with RFC ;), so I can implement that too, but involved enough that I'm going to ask someone else to review it too.

comment:3 Changed 11 years ago by Jean-Paul Calderone

Author: exarkun
Branch: branches/rcf-retry-4846

(In [30602]) Branching to 'rcf-retry-4846'

comment:4 Changed 11 years ago by Jean-Paul Calderone

(In [30603]) Apply rcf_stoptrying.diff

refs #4846

comment:5 Changed 11 years ago by Jean-Paul Calderone

(In [30604]) Adjust test, removing ReconnectingClientFactory subclass and fixing some other style issues

refs #4846

comment:6 Changed 11 years ago by Jean-Paul Calderone

(In [30607]) Add some docstrings to make the test a little easier to understand; also delete a Hopeful but naive comment which is no longer relevant.

refs #4846

comment:7 Changed 11 years ago by Jean-Paul Calderone

Keywords: review added
Owner: Jean-Paul Calderone deleted
Status: assignednew

I did those things I said. Build results

comment:8 Changed 11 years ago by Jean-Paul Calderone

Author: exarkunrlotun, exarkun

comment:9 Changed 11 years ago by Glyph

Keywords: review removed
Owner: set to Jean-Paul Calderone

The branch looks good to me. Please land.

comment:10 Changed 11 years ago by Jean-Paul Calderone

Resolution: fixed
Status: newclosed

(In [30609]) Merge rcf-retry-4846

Author: rlotun, exarkun Reviewer: exarkun, glyph Fixes: #4846

Fix a bug in ReconnectingClientFactory which would cause it to sometimes re-try a connection attempt after its stopTrying method was called.

comment:11 Changed 11 years ago by <automation>

Owner: Jean-Paul Calderone deleted
Note: See TracTickets for help on using tickets.