Opened 5 years ago

Last modified 4 months ago

#4735 enhancement new

Implement something like ReconnectingClientFactory for endpoints

Reported by: glyph Owned by: hodgestar
Priority: high Milestone:
Component: core Keywords:
Cc: brandl, keturn@…, free@…, twistedmatrix.com@…, hodgestar, hs@…, mithrandi@…, adi@… Branch: branches/persistent-client-service-4735-4
(diff, github, buildbot, log)
Author: hodgestar, dreid, tomprince, glyph Launchpad Bug:

Description

It's pretty easy to write a retry loop which will keep attempting an endpoint connection until it succeeds; this is mostly a documentation issue. However, another use-case for ReconnectingClientFactory is to keep a connection up after disconnections even if the initial connection succeeds just fine and only disconnects later; in other words, to react to clientConnectionLost vs. clientConnectionFailed. Implementing this would be easier if it were a little easier to cheaply dynamically proxy a protocol object to capture the connectionLost notification, but it's definitely possible to do this today with some care. We should provide such a utility so that people using the new endpoint APIs have a convenient mechanism available. (As we implement this, we should take care not to make the same naming and documentation mistakes that ReconnectingClientFactory does.)

Change History (63)

comment:1 follow-up: Changed 5 years ago by exarkun

it's definitely possible to do this today with some care

Mind sketching out the API you have in mind?

comment:2 in reply to: ↑ 1 Changed 5 years ago by glyph

Replying to exarkun:

it's definitely possible to do this today with some care

Mind sketching out the API you have in mind?

The thing I meant that it's "possible to do" is simply to write a wrapper for IProtocol which reliably captures connectionLost but doesn't actually get in the way of operation.

The top-level API I have in mind really just looks like this:

from twisted.application.internet import ClientKeeperUpper
ClientKeeperUpper(someClientEndpoint, someFactory[, reactor]).startService()

Possibly with some extra keyword arguments to control reconnection delays etc. stopService would loseConnection on the active connection, return a Deferred that fires when connectionLost gets called for that connection, and stop attempting to reconnect. Also, hopefully obviously, name subject to change.

comment:3 follow-up: Changed 5 years ago by exarkun

A couple considerations:

  1. Services are great, but you don't always have an active application/service hierarchy when you want to do some reconnection. Not that this stops you from calling startService/stopService yourself.
  2. There needs to be a way to reset the delay after an application-defined "success" event - the equivalent of "self.factory.resetDelay()" in a ReconnectingClientFactory-aware protocol.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 5 years ago by glyph

Replying to exarkun:

A couple considerations:

Services are great, but you don't always have an active application/service hierarchy when you want to do some reconnection. Not that this stops you from calling startService/stopService yourself.

Indeed not. And we should support these APIs a bit better. Maybe the reactor itself should implement IServiceCollection? That would be a neat way to propagate reactor parameters around, better than importing.

There needs to be a way to reset the delay after an application-defined "success" event - the equivalent of "self.factory.resetDelay()" in a ReconnectingClientFactory-aware protocol.

Yes. But, also, It seems like 'reset delay after successful connect' is a common enough use-case that it would be a good thing to be able to specify just as a boolean flag, so you don't need to write the callback hook yourself every time.

comment:5 Changed 4 years ago by <automation>

  • Owner glyph deleted

comment:3 follow-up: Changed 3 years ago by jerub

  • Priority changed from normal to highest

Current thinking is:

def nextDelay(now, last_success, last_failure):
    return delay_to_next_connection_attempt
pcs = PersistentClientService(endpoint, factory, reactor=reactor, nextDelay=nextDelay)
pcs.setServiceParent(rootservice)
pcs.connectedProtocol() # returns a deferred which will fire with a connected protocol.
client = MyClient(pcs) # MyClient may call pcs.connectedProtocol()
answer = client.do_something(argument) # call connectedProtocol.addCallback(lambda x:x.do_something(argument))

comment:4 follow-up: Changed 3 years ago by dreid

exarkun pointed out in an offline conversation that this probably isn't actually an IService I don't remember his reasons exactly but I can think of a couple.

  1. You don't always have a service hierarchy.
  2. You don't always have access to the service hierarchy when you know what you should connect to.

comment:5 in reply to: ↑ 4 Changed 3 years ago by glyph

Replying to dreid:

exarkun pointed out in an offline conversation that this probably isn't actually an IService.

I definitely believe this should be an IService. It's a thing that can be started and stopped. There's no reason to invent a new, incompatible interface for that. (c.f. the suffering caused by the fact that threadpools aren't IService providers.)

I don't remember his reasons exactly but I can think of a couple.

These are pretty crummy reasons.

  1. You don't always have a service hierarchy.

You often do have a service hierarchy. If you don't, get one. If you can't get one, just call startService and stopService whenever you feel like starting or stopping your particular client connection.

  1. You don't always have access to the service hierarchy when you know what you should connect to.

If you just want to hook up to a shutdown notification, there's always addSystemEventTrigger. But keeping a reference to the appropriate contact point in the appropriate service hierarchy is no more difficult than, say, passing a reference to the appropriate reactor to use.

comment:6 follow-up: Changed 3 years ago by exarkun

I'm somewhat convinced by the argument that it is a thing that can be started and stopped and that there is no reason to invent a new incompatible interface. However, then I can think of four reasons:

  • IService.setName
  • IService.setServiceParent
  • IService.disownServiceParent
  • IService.privilegedStartService

What do the implementations of these methods look like on the endpoint reconnecting IService implementation?

comment:7 in reply to: ↑ 6 Changed 3 years ago by glyph

Replying to exarkun:

I'm somewhat convinced by the argument that it is a thing that can be started and stopped and that there is no reason to invent a new incompatible interface. However, then I can think of four reasons:

  • IService.setName
  • IService.setServiceParent
  • IService.disownServiceParent
  • IService.privilegedStartService

What do the implementations of these methods look like on the endpoint reconnecting IService implementation?

Presumably these would be inherited from Service as is traditional for most IService implementors? I don't see why this one would be special.

comment:11 Changed 3 years ago by dreid

  • Author set to dreid
  • Branch set to branches/persistent-client-service-4735

(In [36948]) Branching to 'persistent-client-service-4735'

comment:12 Changed 2 years ago by acapnotic

  • Cc keturn@… added

comment:13 Changed 2 years ago by acapnotic

Okay, I wrote enough of this to be interesting, I think. Check it: https://github.com/keturn/twisted/compare/trunk...persistent-client-service-4735

Do you have a better way to intercept that connectionLost message than my gross monkeypatching?

It seems like the caller might want to know when connection attempts and failures happen. Should the interface include any way to get notified of those? Or should we throw in some logging? Or can the caller pretty much get that information by customizing their Protocol and Factory methods? (Maybe you'd see connectionMade and connectionLost with that, but not ConnectionErrors?)

comment:14 Changed 2 years ago by exarkun

Do you have a better way to intercept that connectionLost message than my gross monkeypatching?

With a wrapper - that way you have control over all events being sent to the protocol, since they're sent to your wrapper first.

It seems like the caller might want to know when connection attempts and failures happen.

The use case that seems most obvious to me is a way for the caller to know when a new connection has been established. This is not the connectedProtocol API, since that API only helps if the application uses it before each connection that is made.

Perhaps knowing when attempts are started and when failures happen is also important. This ticket is missing some use cases to help drive the specification.

Or can the caller pretty much get that information by customizing their Protocol and Factory methods?

They probably can, but this makes the API a lot less convenient.

comment:15 follow-up: Changed 2 years ago by acapnotic

Okay, using a wrapper:

https://github.com/keturn/twisted/commit/615ae5eefc3f90ccc92ec168023605fcbe053c0a

I wrote that because I think that monkey-patching someone else's object would be terribly surprising and hard to debug, so I'd like to avoid that by any means necessary. However, I still find these means here to be pretty gross, as users may end up wondering "Hey, why did the Endpoint get passed this thing that's not actually my Factory? And why does it not actually return my Protocol?"

Trying to slip in a proxy on their Protocol like this still leaves me uneasy. How complete does the proxy have to be? Is it enough to expose only the things in the relevant Interface? I'd say no, given that protocol.transport and .factory are de-facto standard but not part of IProtocol. Which means it needs setattr as well. Do we need to be worried enough about attribute collision to do extreme things like use double-underscore?

A wrapper like this will catch calls to connectionLost from the outside, and the call from the Connection should be the one that counts, but it won't catch the protocol calling self.connectionLost on itself. (Which it really oughtn't do, but could be a surprising discrepancy.)

In summary, I think that if we throw in setattr on here it'll work in practice, but it doesn't leave me feeling great about saying "Twisted Endpoints are a great abstraction for all your internet needs!" (but no, I haven't come up with a better idea.)

comment:16 in reply to: ↑ 15 Changed 2 years ago by glyph

Replying to acapnotic:

Okay, using a wrapper:

https://github.com/keturn/twisted/commit/615ae5eefc3f90ccc92ec168023605fcbe053c0a

Not to generalize prematurely, but we already have so many implementations of this (some within endpoints itself!) that it seems like it might be time to make it easier to create a protocol that you're delegating to.

I wrote that because I think that monkey-patching someone else's object would be terribly surprising and hard to debug, so I'd like to avoid that by any means necessary. However, I still find these means here to be pretty gross, as users may end up wondering "Hey, why did the Endpoint get passed this thing that's not actually my Factory? And why does it not actually return my Protocol?"

How would anyone get concerned about that? Isn't the whole point of this that the user of this API will leave calling connect up to the Service object? When you hand off a factory and endpoint to the PersistentClientService, you aren't allowed to have any particular expectations about what it will do with them beyond that it will do something to keep the protocol connected :).

The whole point of having a method like connectedProtocol is to allow it to expose the "right" object - which is not necessarily the result of endpoint.connect directly. You can un-wrap the wrapped result and return the wrapped protocol directly to clients of connectedProtocol.

Then nobody needs to see the wrapper - except perhaps in log messages. So maybe we need to bikeshed on the repr a little bit :).

Trying to slip in a proxy on their Protocol like this still leaves me uneasy. How complete does the proxy have to be?

It needs to be about as complete as the _WrappingProtocol that I linked to above. (Which means I think there ought to be some code sharing here.) There are other interfaces that the reactor might care about, related to junk like half-close, and we need to be careful to relay those reliably. But it doesn't need __setattr__.

A wrapper like this will catch calls to connectionLost from the outside, and the call from the Connection should be the one that counts, but it won't catch the protocol calling self.connectionLost on itself. (Which it really oughtn't do, but could be a surprising discrepancy.)

Do you have any example of code that actually does this for some reason? It's not clear to me what the expectations would be in that case, since if you called connectionLost on self, you'd already be having to deal with the duplicate notification from the transport when the connection was actually lost. If you tried to do something weird like taking the transport out of the reactor first, you'd still potentially leak a file descriptor or some other nasty side-effect.

comment:17 follow-up: Changed 2 years ago by acapnotic

When you hand off a factory and endpoint to the PersistentClientService, you aren't allowed to have any particular expectations about what it will do with them beyond that it will do something to keep the protocol connected :).

Ideally! But in my experience, when stuff breaks, you have cause to start tracing your call tree and your object graph, trying to figure out what's going on. When encountering things like the fact that dataReceived was not invoked on the Protocol you provided but on a _RestartableProtocolProxy, your wtfs-per-minute goes up and leads to statements like "that's why it's called Twisted."

More objects is more complexity, is all.

You can un-wrap the wrapped result and return the wrapped protocol directly to clients of connectedProtocol.

Oh, that's a good point, I hadn't thought of that. I do like the fact that would mean that the result of connectedProtocol is actually the object your ProtocolFactory returns. And that certainly reduces the need for proxying setattr.

(I was worried about the rather gauche practice Factory.buildProtocol indulges in, i.e. "protocol.factory = self", but since the user's factory does that to its own protocol, not the proxy, I think that's okay. And .transport is set through protocol.makeConnection, not as a direct attribute write.)

I can sort of imagine a situation somewhere where someone is doing introspection along the lines of filter(lambda connection: connection.protocol == myProtocol, connections), but I'm not actually sure if there's any such list of connections in a public API anywhere.

we already have so many implementations of this (some within endpoints itself!)

Yay code sharing! Also, apparently I get to stop worrying so much about whether this is a terrible thing to do, because it's already been done. (This does mean all these protocols will have wrappers from *both* the ClientEndpoint and the PersistentClientService, right?)

comment:18 in reply to: ↑ 17 Changed 2 years ago by glyph

Replying to acapnotic:

When you hand off a factory and endpoint to the PersistentClientService, you aren't allowed to have any particular expectations about what it will do with them beyond that it will do something to keep the protocol connected :).

Ideally! But in my experience, when stuff breaks, you have cause to start tracing your call tree and your object graph, trying to figure out what's going on. When encountering things like the fact that dataReceived was not invoked on the Protocol you provided but on a _RestartableProtocolProxy, your wtfs-per-minute goes up and leads to statements like "that's why it's called Twisted."

Certainly we should keep an eye towards this, but "oh, there's another object in the middle" is a much less (ahem) "twisted" thing (in the pejorative sense) to do than to start introducing magic dynamical method-replacement. Keep in mind, also, that (at least for now) these stack traces are entirely straightforward and go from the reactor through the wrapper into their code; no roundabout call-traces where Deferreds cut the stack in half.

More objects is more complexity, is all.

Not necessarily. Having more explicit objects can often be less complexity than, for example, additional inheritance or monkey-patching or other forms of indirection. In this case we have a nice, explicit delegation, with every object simply adhering to the interfaces in the spots where it's expected.

You can un-wrap the wrapped result and return the wrapped protocol directly to clients of connectedProtocol.

Oh, that's a good point, I hadn't thought of that. I do like the fact that would mean that the result of connectedProtocol is actually the object your ProtocolFactory returns. And that certainly reduces the need for proxying setattr.

Reducing is good; eliminating would be better :). Are there remaining use-cases for proxying setattr that you can think of?

(I was worried about the rather gauche practice Factory.buildProtocol indulges in, i.e. "protocol.factory = self", but since the user's factory does that to its own protocol, not the proxy, I think that's okay. And .transport is set through protocol.makeConnection, not as a direct attribute write.)

Exactly. And the proxy doesn't need to use Factory.buildProtocol at all.

I can sort of imagine a situation somewhere where someone is doing introspection along the lines of filter(lambda connection: connection.protocol == myProtocol, connections), but I'm not actually sure if there's any such list of connections in a public API anywhere.

This is more an argument for propagating adaptation through the wrapper rather than getattr/setattr. If we had such an API, the right way to expresss its intention would be to say "I would like a list of every connected protocol that provides IFoo, because I'd like to do IFoo-ish things to it". For people who don't like zope interface, this could be expressed with an abstract base class instead of an interface; but the general idea is that you shouldn't call some random internal-guts-inspection API and expect to get objects that do what you want out of it; you need to declare what kind of expectations you have of the result so the innards of the API can sensibly filter and convert things for you.

If the API is for monitoring or metrics, then IProtocol is all you need; if it's for something more application specific, then you have to say which application. You can't have it both ways ;-).

we already have so many implementations of this (some within endpoints itself!)

Yay code sharing! Also, apparently I get to stop worrying so much about whether this is a terrible thing to do, because it's already been done. (This does mean all these protocols will have wrappers from *both* the ClientEndpoint and the PersistentClientService, right?)

That's true, and that's another reason that I really want to make sure we're calling into the same code here. We should (A) figure out a nice API to make this wrapping public, since other folks might want to do it, and (B) add some optimizations to collapse multiple wrappers into a single one, so you don't have to pay for stuff you're not using. In other words, in order to make this fast, it's likely that it will need to get a lot grosser, and I want to centralize and very thoroughly test any grossness that arises from a legitimate performance need, with benchmarks and everything; we shouldn't go straight to ad-hoc attribute setting because it "might be slow" to call through several layers of objects. Who knows: maybe PyPy's JIT will give the extra layers of method calls to us for free if they're fairly static and part of a loop, so it might even be a waste of time to add said grossness.

comment:19 Changed 2 years ago by dreid

I'd propose that PersistentClientService should also accept a keyword argument whenConnected.

When whenConnected is not None it should be a single argument callable that will be called with the new protocol and that may return a deferred.

If it returns a deferred then subscribers should _not_ receive notification of the new protocol until that deferred fires.

Take for example the case of an IMAP4 backup utility.

It must:
1) connect
2) authenticate
3) select a mailbox
4) and then download messages in batches.

It would be easy to break down downloading messages in batches as.

  • get connected protocol
  • fetch N messages

Without a whenConnected notification and no public API to tell what the state of the protocol returned by connectedProtocol it is impossible to know if you have a previously connected protocol that is still viable, or a newly connected protocol that is in the incorrect state.

You could try your desired operation (such as fetch N messages) and handle failure as below:

  • get connected protocol
  • try fetch N messages and handle failure by:
    • getting connected protocol
    • authenticate
    • select mailbox
    • try fetch N messages again

However this means you'd always have to try the operation at least once before you could do your initial setup, or maintain your own explicit state machine.

It seems cleaner to specify the set of operations that are done explicitly at new connection time prior to allowing connectedProtocol to return the protocol instance.

Another use case for whenConnected is something like IMAP4 IDLE and other server-push protocols (MQTT, AMQP, etc…) where (as exarkun pointed out) your application may never have the need or opportunity to call connectedProtocol.

comment:20 Changed 2 years ago by glyph

Thanks for the thorough case-analysis, dreid.

comment:21 Changed 2 years ago by free.ekanayaka

Can't the use cases mentioned by dreid be handled by implementing appropriate connectionLost() and connectionMade() on your protocol class, maybe with some state information on the factory?

comment:21 Changed 2 years ago by free.ekanayaka

Can't the use cases mentioned by dreid be handled by implementing appropriate connectionLost() and connectionMade() on your protocol class, maybe with some state information on the factory?

comment:22 Changed 2 years ago by free.ekanayaka

  • Cc free@… added

comment:23 follow-up: Changed 2 years ago by glyph

Replying to free.ekanayaka:

Can't the use cases mentioned by dreid be handled by implementing appropriate connectionLost() and connectionMade() on your protocol class, maybe with some state information on the factory?

The problem is that it's a bit of a pain to stack and compose protocol objects, so the go-to way to override connectionLost on your own protocol is to simply inherit from the protocol you're trying to do something with. This is sub-optimal and it'd be great to discourage people from inheriting from protocols so much.

At least, I think that's what he's getting at; I'll let him speak for himself.

comment:24 in reply to: ↑ 23 ; follow-up: Changed 2 years ago by free.ekanayaka

Replying to glyph:

Replying to free.ekanayaka:

Can't the use cases mentioned by dreid be handled by implementing appropriate connectionLost() and connectionMade() on your protocol class, maybe with some state information on the factory?

The problem is that it's a bit of a pain to stack and compose protocol objects, so the go-to way to override connectionLost on your own protocol is to simply inherit from the protocol you're trying to do something with. This is sub-optimal and it'd be great to discourage people from inheriting from protocols so much.

At least, I think that's what he's getting at; I'll let him speak for himself.

Yeah I can see the problem, and have gone through this myself a few times. It seems that most use cases are around the ability of being notified of connectionFailed() and connectionLost() (or IHalfCloseableProtocol events).

Of course adding a whenConnected parameter to PersistentClientService would solve some use cases in practice, but I'm curious if you think a more general solution would be worth. From what I've read, I gather that such solution would along the lines of #4700? (BTW the point that acapnotic makes in that ticket about pub/sub vs stacking/composition is interesting).

comment:25 in reply to: ↑ 24 Changed 2 years ago by dreid

Replying to free.ekanayaka:

Replying to glyph:

Replying to free.ekanayaka:

Can't the use cases mentioned by dreid be handled by implementing appropriate connectionLost() and connectionMade() on your protocol class, maybe with some state information on the factory?

The problem is that it's a bit of a pain to stack and compose protocol objects, so the go-to way to override connectionLost on your own protocol is to simply inherit from the protocol you're trying to do something with. This is sub-optimal and it'd be great to discourage people from inheriting from protocols so much.

At least, I think that's what he's getting at; I'll let him speak for himself.

Yeah I can see the problem, and have gone through this myself a few times. It seems that most use cases are around the ability of being notified of connectionFailed() and connectionLost() (or IHalfCloseableProtocol events).

Of course adding a whenConnected parameter to PersistentClientService would solve some use cases in practice, but I'm curious if you think a more general solution would be worth. From what I've read, I gather that such solution would along the lines of #4700? (BTW the point that acapnotic makes in that ticket about pub/sub vs stacking/composition is interesting).

The key feature of whenConnected (and something which is difficult to achieve with connectionMade) is that it may return a Deferred and in doing so "block" access to the connected protocol resource until it has finished doing all of the necessary setup.

comment:26 Changed 2 years ago by free.ekanayaka

Thanks for explaining. I had missed the relationship between the deferred returned by whenConnected() and the consequent "blocking" behavior with respect to connectedProtocol(). I can know see how it covers very well the typical use cases (including server-push protocols, as pointed).

comment:27 Changed 2 years ago by free.ekanayaka

  • Cc free@… removed

comment:28 Changed 2 years ago by free.ekanayaka

  • Cc free@… added

comment:29 Changed 2 years ago by ralphm

  • Cc twistedmatrix.com@… added

I am pretty interested in this effort. Basically all XMPP client code is built on top of ReconnectingClientFactory as twisted.words.xish.xmlstream.XmlStreamFactory derives from it. This has been a sore point for quite a while now, and the move to endpoints makes that even more obvious. It would also make it a lot easier to do replace SRVConnector with a SRV-aware endpoint.

I like dreid's suggestion of whenConnected, as it might replace the current practice of adding an observer for STREAM_AUTHD_EVENT to do something similar.

acapnotic, what are your plans with this branch?

PS. I noticed that there's a typo in the actual class name.

comment:31 Changed 2 years ago by tomprince

  • Author changed from dreid to dreid, tomprince
  • Branch changed from branches/persistent-client-service-4735 to branches/persistent-client-service-4735-2

(In [38680]) Branching to persistent-client-service-4735-2.

comment:32 Changed 2 years ago by hodgestar

  • Author changed from dreid, tomprince to hodgestar, dreid, tomprince
  • Branch changed from branches/persistent-client-service-4735-2 to branches/persistent-client-service-4735-3

(In [38687]) Create branch for alternative implementation of kreturn's solution to #4735 that includes features from ReconnectingClientFactory.

comment:33 Changed 2 years ago by hodgestar

I added a branch (branches/persistent-client-service-4735-3) which currently includes an implementation and tests. I plan to document what's there in the next couple of days (and maybe tweak the parameters to the ReconnectingClientService constructor) but feedback on the implementation as it is now would already be useful.

Note that the implementation doesn't include whenConnected or connectedProtocol functions. If people are happy with the rest I can add those but I'm not entirely convinced those are part of what ReconnectingClientFactory used to provide (although they're definitely useful).

comment:34 Changed 2 years ago by hodgestar

  • Cc hodgestar added
  • Keywords review added
  • Milestone set to regular-releases
  • Priority changed from highest to high

comment:35 Changed 2 years ago by glyph

  • Milestone regular-releases deleted

The 'regular releases' milestone is for release management tools, not Twisted's code itself.

comment:36 Changed 2 years ago by tom.prince

Not a real review, but a couple of quick comments:

  1. _RestartableProtocolProxy: This should probably use proxyForInterface. (same for the other proxy)
    • I'm somewhat unhappy with the pair of proxies, but I guess there is already #4700, for dealing with that. Although, the factory proxy could perhaps be
      Factory.forProtocol(lambda: _RestartableProtocolProxy(protocolFactory.buildProtocol(), clientService))
      
      On the other hand, objects are good.
  2. Do clientConnect* need to be public? They seem to be implementation details.

comment:37 Changed 2 years ago by hynek

  • Cc hs@… added

comment:38 Changed 2 years ago by hodgestar

@Glyph: Thanks for fixing the milestones. Regarding the questions:

  1. I'm a little worried that doing something more constrained will make things harder for people who have written bad factory / protocol pairs that rely on calling methods that aren't part of the interfaces (this is particularly easy because factories and protocols come in pairs). If you still think it'd be better to use Factory.forProtocol and proxyForInterface I can make the change though.
  1. I tried making the clientConnect* methods private but that left me stubbing out lots of underscore methods in tests and I'm worried that people writing libraries on top of Twisted will end up doing the same or similar.

comment:39 Changed 2 years ago by tom.prince

Some more quick comments.

  1. It is discussed earlier in the ticket somewhat, but the wrapped protocols should really be exposed to the user anyway. The client protocol will get a factory attribute from the client factory, so that isn't an issue. The line of code I included is perhaps a bit too clever, though (one issue with it, is that it doesn't forward doStop/doStop).
  2. People using this should need to test the behavior, so I'm not sure why people would need to mock them. In any case, it is easy to make things public, but harder to make the private, so my inclination is to make things private by default.
  1. I've only briefly glanced through the tests, but
    1. I noticed many of the methods lack docstrings (see #6535 for some hints on what to put there).
    2. I'd be inclined to try to avoid mocks, if possible.
    3. A lot of the helpers used look generic, or close to it.
      • DummyProtocol can just be Protocol
      • DummyTransport can probably just be StringTransport
      • I don't know if there is a LogCatcher, but there should be.
        • start/stop are the conventional names.
        • It would be better to make assertions about the event dict, rather than squashing things to text (especially in light of the recent discussions about structured logging)
      • I'm not sure if there is an existing fake endpoint, but using a real endpoint with MemoryReactor might be appropriate.
    4. camelCase

comment:40 Changed 2 years ago by hodgestar

At this point what I'd really like is to get some sort of +1 on the general structure of the solution or not so I know whether it's likely to land ever.

Is #6535 the right ticket? Struggling to see how it relates to test docstrings.

I don't think Twisted has anything like LogCatcher yet, although it is very useful.

Assuming I get some sort of "yes, this approach is cool and should go in" from some, I can hop through all the formatting and stylistic issues when I get a chance.

comment:41 Changed 2 years ago by tom.prince

At this point what I'd really like is to get some sort of +1 on the general structure of the solution or not so I know whether it's likely to land ever.

Yes. This looks like a reasonable approach.

comment:42 Changed 2 years ago by exarkun

Thank you for updating the howto, but I'd still like to see some examples before declaring that this is the correct API. Lots of use cases have been discussed, has anyone demonstrated that they are satisfied by this API? The code in the howto doesn't really help, since it doesn't actually implement anything in particular.

comment:43 Changed 2 years ago by hodgestar

For me the import use case is really just to have a something that will repeatedly create protocol instances whenever a connection is established. The only methods relevant to this use case are init, startService and stopService. See, e.g., the three lines starting at:

https://github.com/praekelt/vumi/blob/develop/vumi/transports/smpp/transport.py#L178

and the line

https://github.com/praekelt/vumi/blob/develop/vumi/transports/smpp/transport.py#L185

Maybe I should make all the other methods start with underscore?

Or should I push in the opposite direction and add some features from dreid and kreturn's suggestions like whenConnected and connectedProtocol? I'm not claiming any special insight into their use cases though. :)

comment:44 Changed 2 years ago by exarkun

Sorry, to be clear, I think the branch for this ticket needs to include more documentation, including examples. If you could reduce the vumi code to a manageable piece of documentation, that would be wonderful. I don't expect you to write any examples for anyone else's use-cases, they can do that.

comment:45 Changed 2 years ago by tom.prince

Or should I push in the opposite direction and add some features from dreid and kreturn's suggestions like whenConnected and connectedProtocol?

This seems like something that could be added later, so should be done in a separate ticket.

comment:46 Changed 2 years ago by tom.prince

This isn't a complete review, but I'm done for the day, so I'll leave this here and pick it up again in the morning:

  1. _RestartableProtocolProxy:
    1. As mentioned on #4700, this should proxy interfaces implemented by the protocol it is proxying for. At least IHalfCloseableProtocol and IFileDescriptorReceiver.
    2. (optional) It seems __repr__ typically doesn't seem to typically include the module.
    3. The __repr__ should probably be a noun phrase, rather than a sentnce (i.e "wrapping" instead of "wraps").
  2. ReconnectingClientService:
    1. logging: There was a long dicussion about logging recently.
      1. noisy: There are a number of places in twisted that do use .noisy, but we don't really want to add more.
      2. Log messages should pass a format string and keyword arguments:
        log.msg(format="Abandoing %(endpoint)s after %(retries)s retries.",
                endpoint=self.endpoint, retries=self.retries)
        
        I would also be inclined to include the factory in keywords (and possibly the message?).
    2. The docstring says that clients should call resetDelay, but the servcie does it when it connects with a comment about allowing a hook for the protocol itself to do it.
      • It seems like one or the other should happen.
    3. Any todos that remain when this is merged will need to reference a ticket.
    4. The clientConnection* methods should
    5. What happens if retry is called by user code? It seems likely that it will mess up the internal state.
      • Either it should handle being called randomly, or it should be private. (Probably the later)
  1. ReconnectingClientServiceTestCase:
    1. Instead of patching the class everywhere, you should be able to just override the method on the instance you have.
    2. Method names should be camelCase.

comment:47 Changed 2 years ago by tom.prince

  • Owner set to hodgestar
  1. Tests:
    1. Instead of patching the class everywhere, you should be able to just override the method on the instance you have.
      • This perhaps means that you can just assing a MockRecorder to the attribute, rather than wrapping that in a function?
    2. I also notice a number of attributes that are under_score_seperated. Please change them.
    3. LogCatcher currently looks at the message key of the event dicts. If it wanted to look at just the message, it should use log.textFromEventDict. But, really, it should make assertions about the dictionary itself.
    4. All your test methods should have docstrings. (See #6301)
    5. In test_stopServiceWhileConnecting: You probably want to use self.failureResultOf(s._connectingDeferred, CancelledError)
    6. DummyTransport can probably be replaced by twisted.test.proto_helpers.StringTransport, and DummyProtocol by Protocol.
    7. There are no tests for the proxies. I realized this when I noticed that all the test that might interact with them mock them out. I'm also somewhat sceptical of all the mocking, but I don't have any immediate suggestions. Perhaps writing test docstrings will make things clearer.
      • One thing that occurs to me, as a possibility, is factoring out the calculation of the delay from retry.
  2. documentation
    1. ReconnectingClientService doesn't accept ClientFactorys, it accepts IFactoryFactorys. That is, the additional methods of ClientFactory don't get called. In fact, I think this whole paragraph can be removed, since ReconnectingClientFactory is discussed elsewhere in the documentation.
    2. The clients howto also talks about reconnecting clients, and specficaly mentions that reconnecing is supported by endpoints. That howto should be updated. (Note: This means adding documentation about the new code, but leaving the old code (perhaps with a comment about it being the old way). See the discussion of connectProtocol vs. reactor.connect* there for an example.

comment:48 Changed 2 years ago by mithrandi

  • Cc mithrandi@… added

comment:49 Changed 23 months ago by rwall

  • Owner changed from hodgestar to tom.prince

comment:50 Changed 23 months ago by tom.prince

  • Keywords review removed

comment:51 Changed 23 months ago by tom.prince

  • Owner changed from tom.prince to hodgestar

comment:53 Changed 12 months ago by hodgestar

I moved my patch to https://github.com/twisted/twisted/pull/40 and updated it to apply cleanly on trunk.

comment:54 Changed 12 months ago by hodgestar

Fair amount of updating done:

  1. _RestartableProtocolProxy:
    • All addressed, I think.
  1. ReconnectingClientService:
    • Mostly addressed but,
    • Could someone suggest what noisy should be replaced with please.
    • I'll handle TODOs once its clear what should happen with noisy.
    • I didn't understand what "The clientConnection* methods should" means.
  1. Tests:
    • Mostly addressed. Would appreciate some feedback on what else remains to be cleaned up.
    • I worked on #6677 a bit to replace LogCatcher with. If that looks like it'll land I'll update this ticket.
    • Note: The proxies aren't stubbed out (only the underlying protocol and factory).
  1. Documentation:
    • Will update once it looks like we're happy with the rest of this.

See https://github.com/twisted/twisted/pull/40 for code.

Last edited 12 months ago by hodgestar (previous) (diff)

comment:55 Changed 12 months ago by glyph

Thanks hodgestar.

I've closed the pull request but please don't interpret that as a rejection of the change. I would just prefer it if people did not use the pull request UI until we actually have some way of integrating with github's workflow.

comment:56 Changed 7 months ago by hynek

It seems the review keyword hasn’t been set and this ticket went into hibernation because of that? I understand it’s ready for review?

comment:57 Changed 7 months ago by glyph

  • Keywords review added

Seems that's what hodgestar's last comment implies.

Let's see what happens.

comment:58 Changed 6 months ago by glyph

  • Author changed from hodgestar, dreid, tomprince to hodgestar, dreid, tomprince, glyph
  • Branch changed from branches/persistent-client-service-4735-3 to branches/persistent-client-service-4735-4

(In [43701]) Branching to persistent-client-service-4735-4.

comment:59 Changed 6 months ago by glyph

  • Keywords review removed

Thanks for continuing this, Hodgestar. Sorry this ticket got stuck for a while.

I committed it to a proper Twisted branch again, forced a build, and things look like they're in okay, but not quite adequate shape.

Here's some feedback that needs addressing:

  1. When you wrote this code, I think twistedchecker was in considerably worse shape. Sadly for you, it now works well enough that we actually require all new code to be warnings-clean ;). you can see all the new twistedchecker warnings here.
  2. The jokes with the constants were funny at the time ReconnectingClientFactory was created, but not any more.
  3. All new classes should be new-style classes. Service, unfortunately, is old-style. You should inherit from object as the last thing in your bases list when you create new classes that inherit from legacy old-style Twisted stuff.
  4. The new test mocks are probably mostly unnecessary.
  5. I guess not all of Tom's feedback actually got dealt with? Particularly I notice the docs still make reference to ClientFactory.
  6. I would get rid of the NIST joke. Especially since Dual_EC_DRBG sort of trashed NIST's reputation, it's not as funny. It's extra weird to have the joke in a comment and the thing saying "don't pay attention to the joke" actually be in the public docstring. Basically: it's a new decade, get a new joke :). (Also, I think on today's internet, much smaller values would make sense for defaults?)
  7. "noisy" is not a good flag to have. It's a design error just about everywhere it appears in Twisted. Just always log the failures. We can address log noisiness in a more sensible way when #6750 lands again. (It allows for arbitrary log filtering.)
  8. The test_noisy versions of tests just repeat the exact same docstring. If you do want to have some kind log-control feature or parameter, the tests should say what they're testing, which is to say, the log messages. (And they could probably repeat less code by calling the existing tests.)
  9. DummyProtocol is entirely unnecessary; just use Protocol.
  10. The decision not to use mocks in Twisted is a conscious one. We regard auto-mocks as a testing antipattern. Please remove MockRecorder. Luckily, the things that you are doing with it are easily supported by the testing strategy that Twisted does pursue: just use a fake endpoint and a task.Clock and you can easily verify that the right things are happening; task.Clock has a getDelayedCalls method, for example,
  11. LogCatcher._gather_logs is an unnecessary method. You can just use "self.logs.append" as a bound method as the observer.
  12. Rather than making your own DummyTransport, why not use the available mock in twisted.test.proto_helpers, StringTransport?
  13. There's a StringEndpoint in twisted.web.test.test_agent.test_getUsesQuiescentCallback which we could probably refactor into something in proto_helpers as well, rather than having ClientTestEndpoint be a duplicate of that. Alternately, you could do what most of the other tests in twisted.web.client do, and just construct a "real" endpoint wrapped around a MemoryReactor.
  14. Rather than setting a special attribute after the fact, test_parametrizedClock should probably just pass the parameter to ReconnectingClientFactory's constructor. Actually passing it is a bit awkward in the command-line service instantiation code, so it still needs to be a default parameter, probably, but at least we can make it look as normal as possible.
    1. also on that test, it's "parameterized" ;-)
  15. We don't use double-underscore private variables; we decided a long time ago the name mangling is more trouble than it's worth. A single underscore will do for _RestartableProtocolProxy.__protocol and .__clientService.
  16. Pretty much all the attributes on ReconnectingClientService should be private; this much new public API surface is unnecessary, especially given that changing the attributes won't necessarily take effect right away and there's no way to communicate that clearly to the caller.
  17. I think for similar reasons retry should be private; it seems like an outside caller initiating a retry would corrupt the internal state here.

Other observations:

  1. Nice job on the structured log messages. Since I mentioned #6750 I should mention that it's too bad that the format is going to change - but retrofitting this code is going to be a lot easier than fixing most of the rest of Twisted to log sensibly :).
  2. As Jean-Paul pointed out, maybe having this be a Service isn't the greatest idea. I mean: there should be a service that does this, but there would be less coupling overall if there were simply a ReconnectingClient object that just had simple start and stop methods, and then a Service subclass that wrapped said methods, and added all the subclass-of-Service junk like setName et. al.. This would also introduce another interesting axis of customization: I imagine that many users of this object would actually prefer that, when stopped, it simply call abortConnection rather than waiting for all connections to stop. This could be represented in the "simple object" layer as a separate waitForConnectionsToExit method, rather than coupling that to the Service.
  3. Naming: Reconnecting certainly describes what this is doing, but PersistentClientConnectionService more describes what it is. AlwaysConnectedClient, maybe? Just a thought.

Oof. Sorry for the huge review; I think this looks like a lot more feedback than it actually is, and I'm hopeful that you can resubmit this quickly.

Thanks for sticking with it!

comment:60 Changed 6 months ago by glyph

Oops, forgot to reassign!

comment:61 Changed 4 months ago by adiroiban

  • Cc adi@… added

comment:62 Changed 4 months ago by TheAnarcat

is it me or are there conflict markers in the diff here?

https://twistedmatrix.com/~diffresource.twistd/4735

i would assume this isn't something one would like in a patch... ;)

comment:63 Changed 4 months ago by TheAnarcat

it looks like there are merge conflicts in that patch on github, but the patch actually rebases fine on twisted 15, so disregard my last comment.

i am able to use it as well, it's pretty neat. whereas i used to do:

        endpoints.clientFromString(reactor, s).endpoint.connect(factory)

I now do:

        endpoint = endpoints.clientFromString(reactor, s)
        defer = endpoint.connect(factory)
        if ReconnectingClientService is None:
            logging.warning('ReconnectingClientService is not available, connexion failures will require a restart, see README')
        else:
            ReconnectingClientService(endpoint, factory, noisy=True).startService()

Note the backwards-compatibility glue... It does feel a little weird to start playing with services whereas I didn't have to do that before, but it beats what was there before (which was nothing, and I couldn't figure out a way to make it work with factories in #7840).

I like this, thanks! I hope this makes it in Twisted 16.

Note: See TracTickets for help on using tickets.