Opened 4 years ago

Closed 14 months ago

Last modified 14 months ago

#4859 enhancement closed fixed (fixed)

client endpoint: super-smart name-based TCP connection algorithm

Reported by: glyph Owned by: ashfall
Priority: normal Milestone:
Component: core Keywords: endpoint ipv6
Cc: twistedmatrix.com@… Branch: branches/gai-endpoint-4859-6
(diff, github, buildbot, log)
Author: ashfall Launchpad Bug:

Description (last modified by glyph)

The Goal

The user wants to enter the name of a particular host, and connect as quickly as
possible. They may also want to enter a port number or service name.

The application developer wants to let the user do that, and just use a
simple-to-construct endpoint to do all the work involved in that.

The Problems

Name resolution and routing are not always sensibly connected. In particular,
it is very common for networks to automatically configure their
clients with local IPv6 addresses and happily resolve remote IPv6 addresses, but be
misconfigured in such a way as to not route IPv6 past the border gateway. It
isn't even that unusual for the network, or a particular host on it, to publish
an internal IPv6 address that, for whatever reason, won't even respond to IPv6
locally.

The fact that it doesn't route IPv6 at all means that you don't get any feedback
that your connection attempt isn't working besides the eventual timeout from
your first SYN packet.

Of course, IPv6 isn't the only reason a network or nameserver may be
misconfigured in this way. Un-connectable hosts happen all the time; it's just
that this is a particularly common problem that one hits when talking about
switching from a naive IPv4 configuration connection to a more sophisticated
multi-address-family approach. Really though, even if you're doing IPv4
correctly, you'll hit it sometimes.

The Solution

We should follow the relevant specification
and resolve all possible connectable addresses under the given host name /
service name combination using getaddrinfo. (While we should not rule out a
truly asynchronous version of getaddrinfo, this involves trying to parse a lot
of platform-specific policy and it would be best to keep that work separate.)

Then, as said specification suggests, we should attempt to connect to them in
the order in which they are returned, as that is the preferred order. However,
as some addresses may not respond promptly enough, we should initiate several
attempts in parallel.

If everything's working properly, the first attempt will complete quickly and we
won't even make the second one. If there's a little bit of lag, the first
attempt should still have an advantage over the second by virtue of the fact
that it initiated faster and lag should affect them equally, it'll complete
first, and we will cancel the second one.

In the case that one or more of the addresses is going to time out for some
reason, the user won't have to wait for every one to time out in turn; they'll
be timing out in parallel.

In order to conserve resources, and to avoid bugs where user code gets invoked
twice, once one connection attempt has succeded, we should cancel all the
outstanding ones.

It would be useful to represent this internally as one unit which converts the
hostname/service pair into a list of endpoints, and then a separate unit which
implements connecting in parallel to a list of endpoints. It may be useful in
the future to expand the name-resolution portion of this to generate endpoints
which do something custom (for example: resolve "hostnames" by looking at an
OpenSSH format ssh_config file with Host lines in it, then doing the process
recursively to resolve the real underlying hostnames and using conch to
actually connect).

Change History (33)

comment:1 Changed 4 years ago by philmayers

Circumventing or re-implementing getaddrinfo is the wrong choice.

Systems permit (are mandated by RFCs, in fact) a site-local policy reordering of the address tuples returned from getaddrinfo - see for example "man gai.conf" on a Linux system. The default sorting policy is mandated in the RFC, and may change in future and (presumably) be retrofitted by system patches. Sites may have deployed by policy changes to this. If you attempt to circumvent getaddrinfo, you will destroy this. At the very least, Twisted applications will have behavior inconsistent with the rest of the system.

Possibly the RFC you are referring to is RFC 3484, which defines address selection rules.

I understand the desire to implement clever things like SRV lookups, but please - don't get clever here. Honour the standards and do things the right way. If you have an interface that takes hostnames, use getaddrinfo.

I don't know what systems getaddrinfo is "lame" on. Perhaps you could go into more detail?

comment:2 Changed 4 years ago by jknight

The order in which you're supposed to try connecting is the order they're returned by getaddrinfo, because it sorts them by various criteria (like being on your local network).

It might be interesting to not wait for the entire connect() timeout before trying another host, but certainly trying all in parallel would be a bad idea. It would be a very not nice thing to do to the servers.

And yea, just use getaddrinfo, by default. SRV lookup needs to be a separate API or at least a default-off flag, because you're not allowed to use SRV records for most protocols (but of course are required to use it for a few).

comment:3 Changed 4 years ago by exarkun

Some discussion on dns-operations included a suggestion to try them in staggered parallel. Wait ~RTT before trying the next address, but then try it even if the previous attempt hasn't failed yet. (Calculating the RTT for any arbitrary address is left as an exercise for the reader.)

comment:4 Changed 4 years ago by <automation>

comment:5 Changed 3 years ago by jknight

  • Keywords ipv6 added

comment:6 Changed 3 years ago by ragzilla

Is this a duplicate of #4362 ? All the suggestions here (except the connection behavior changes) are related to resolver API, and the connection behavior changes would perhaps better be addressed in a ticket specifically for implementing behavior documented in draft-ietf-v6ops-happy-eyeballs-07

comment:7 Changed 3 years ago by exarkun

The ticket description does seem to mix two related but orthogonal things together.

#4362 will provide the mechanism for doing getaddrinfo lookups. This ticket can use the APIs implemented for that ticket to implement an actual client endpoint to do the connection.

The happy-eyeballs draft seems to largely cover what this ticket is trying to accomplish, leaving all of the address resolution to a getaddrinfo implementation. So perhaps this is the ticket for implementing that behavior.

comment:8 Changed 3 years ago by glyph

I kinda want to do this first; specifically, I'd like to implement it in advance of #4362. I think that we can have something that just does callInThread(getaddrinfo, ...) internal to the endpoint, and then later, smoothly upgrade to the version that uses a smarter, fancier, reactor name-resolution interface. This could trivially even be made compatible with older third-party reactors by doing IReactorReallyFancyResolver.providedBy(self.reactor).

Thoughts?

comment:9 Changed 2 years ago by glyph

I implemented this for Calendar Server, although it is definitely missing some test coverage and probably missing a bunch of necessary knobs and customizations. It may be useful as an initial point of reference, however.

comment:10 Changed 2 years ago by ashfall

  • Owner set to ashfall

comment:11 Changed 2 years ago by ralphm

  • Cc twistedmatrix.com@… added

comment:12 follow-up: Changed 2 years ago by jknight

A delay of 0 sec between connection attempts (like the above patch uses) is not great.

Think of people doing traditional DNS round-robin.

The delay can be a tunable number, but should default to something less aggressive. Either infinity (wait for each connect to timeout before trying next one), or at the smallest, something large enough that you could expect most reasonable networks to have connected by then. Say, 1 second.

comment:13 in reply to: ↑ 12 ; follow-up: Changed 2 years ago by glyph

Replying to jknight:

A delay of 0 sec between connection attempts (like the above patch uses) is not great.

Oh wow, yes, that is terrible. I think I had changed that for debugging and I just never set it back. Thanks for pointing it out.

Think of people doing traditional DNS round-robin.

The delay can be a tunable number, but should default to something less aggressive. Either infinity (wait for each connect to timeout before trying next one), or at the smallest, something large enough that you could expect most reasonable networks to have connected by then. Say, 1 second.

99.9% agreement.

The only thing I would add is that I wish I could find some kind of official document explaining that 1 second is the One True Value for this, or explaining some sort of backoff algorithm, along with a rationale for why it works well and doesn't promote congestion and bufferbloat and all the ills of the world.

comment:14 in reply to: ↑ 13 Changed 2 years ago by glyph

Replying to glyph:

The only thing I would add is that I wish I could find some kind of official document explaining that 1 second is the One True Value for this, or explaining some sort of backoff algorithm, along with a rationale for why it works well and doesn't promote congestion and bufferbloat and all the ills of the world.

It looks like the draft mentioned above is now an RFC: RFC 6555, "Happy Eyeballs: Success with Dual-Stack Hosts". I haven't had time to thoroughly evaluate it but it sounds authoritative, if not necessarily useful :).

comment:15 Changed 2 years ago by glyph

  • Description modified (diff)

Updated the description with a more expansive spec. Hopefully this will be useful to ashfall. Others should feel free to expand upon it.

comment:16 Changed 2 years ago by ashfall

  • Author set to ashfall
  • Branch set to branches/gai-endpoint-4859

(In [35028]) Branching to 'gai-endpoint-4859'

comment:17 Changed 2 years ago by ashfall

  • Branch changed from branches/gai-endpoint-4859 to branches/gai-endpoint-4859-2

(In [35048]) Branching to 'gai-endpoint-4859-2'

comment:18 Changed 20 months ago by ashfall

  • Branch changed from branches/gai-endpoint-4859-2 to branches/gai-endpoint-4859-3

(In [37360]) Branching to 'gai-endpoint-4859-3'

comment:19 Changed 20 months ago by ashfall

  • Keywords review added
  • Owner ashfall deleted

So the branch passes most builds, and I think is ready for a review. Build Results

A few points to consider while reviewing:

  • In HostnameEndpoint.connect._endpoint, what should be the default return value when an address is neither IPv4 nor IPv6?
  • The names of the various attributes of HostnameEndpoint, should you have a better naming suggestion.
  • Does HostnameEndpoint go in __all3__? (Note that python-3.3-tests fail, not sure how to fix that.)
  • twisted.internet.test.test_endpoints is now importing MemoryReactorWithConnectorsAndTime as MemoryReactor, because HostnameEndpoint requires clock and schedules timed calls.

comment:20 follow-up: Changed 20 months ago by tom.prince

  • Keywords review removed
  • Owner set to ashfall

(Note that python-3.3-tests fail, not sure how to fix that.)

You fix that by moving the object to the end of the base class list. (It should always be at the end)

In HostnameEndpoint.connect._endpoint, what should be the default return value when an address is neither IPv4 nor IPv6?

I don't think any current implementation of getaddrinfo returns anything else, and if it did, we don't know what to do with it, so we can just ignore it.

  1. getaddrinfo
    1. An endpoint only supports connecting to stream transports, not datagram transports, so we should tell getaddrinfo that.
    2. getaddrinfo will resolve service names, so instead of passing port directly to to connectTCP, we should pass it to getaddrinfo and use the resolved port when trying to connect.
  2. Instead of
    receivedFailures = []
    d.addErrback(receivedFailures.append)
    ...
    self.assertEqual(len(receivedFailures), 1)
    failure = receivedFailures[0]
    
    you can instead just do
    ...
    failure = self.failureResultOf(d)
    
  3. You don't need to duplicate test_nameResolution. The second one isn't exercising any more code.
  4. The DNSLookupError should contain the hostname (and given 1.2, the service name).
  5. HostnameEndpoint.__init__:
    1. host doesn't need to be something that has an IPv6 address. The point of this is to have a way of connecting that doesn't care about address families. Something like "host name to connect to"? If you want to be more detailed, it needs to be something suitable to pass to getaddrinfo.
    2. Given 1.2, there should be a service instead of port argument, which will need documentation.
    3. timeout probably needs to be documented here, since it is the timeout for each individual connection, rather than some total timeout.
  6. (optional) Is there any reason that _canceller is method rather than a local function in connect like all the other callbacks? (In particular, _pending could then be made local to connect as well)
  7. (minor) almostDone would be better named maybeDone or checkDone.

Please resubmit for review after addressing the above points.

comment:21 Changed 20 months ago by tom.prince

(Also, there is a pyflakes error for implements)

comment:22 in reply to: ↑ 20 ; follow-up: Changed 18 months ago by ashfall

  • Keywords review added
  • Owner ashfall deleted

Thanks for the review! New build results

Addressed all points, except 5.2.

Replying to tom.prince:

  1. Given 1.2, there should be a service instead of port argument, which will need documentation.

Following a discussion on IRC:

<exarkun> I think the review comment is out of scope, the endpoint should be added with integer port support, and someone can figure out a coherent specification for an additional feature in a new followup ticket.

I'll file the said followup ticket once this is ready to merge. Let me know if you feel otherwise.

comment:23 in reply to: ↑ 22 Changed 18 months ago by glyph

Replying to ashfall:

Thanks for the review! New build results

Addressed all points, except 5.2.

Replying to tom.prince:

  1. Given 1.2, there should be a service instead of port argument, which will need documentation.

Following a discussion on IRC:

<exarkun> I think the review comment is out of scope, the endpoint should be added with integer port support, and someone can figure out a coherent specification for an additional feature in a new followup ticket.

Just wanted to apologize for piling on in a way which suggested I thought this needed to be implemented on IRC: exarkun is totally right here, we can add service-name resolution later. (And since, as discussed above, getaddrinfo does not do more interesting stuff like SRV lookups, we will need to add more features later anyway.)

I'll file the said followup ticket once this is ready to merge. Let me know if you feel otherwise.

This sounds like the right decision to me as well.

comment:24 Changed 18 months ago by exarkun

  • Owner set to exarkun
  • Status changed from new to assigned

comment:25 Changed 18 months ago by ashfall

  • Branch changed from branches/gai-endpoint-4859-3 to branches/gai-endpoint-4859-4

(In [38389]) Branching to 'gai-endpoint-4859-4'

comment:26 follow-up: Changed 17 months ago by exarkun

  • Keywords review removed
  • Owner changed from exarkun to ashfall
  • Status changed from assigned to new

Thanks! Sorry about the long review delay.

  1. twisted/internet/address.py
    1. HostnameAddress needs to decide if its hostname attribute is a byte string or a text string. New code mostly needs to avoid str and "" and instead go with either bytes and b"" or unicode and u"". I suspect hostname is a byte string, but I haven't thought about it too hard yet.
    2. Also, you can and should use L{} to refer to built-in types now, rather than C{}
    3. compareAttributes is missing some whitespace between its elements
    4. HostnameAddress appears in __all3__ but its tests haven't been added to admin/_twistedpython3.py so they won't run on Python 3. This isn't allowed, the tests must run if we're claiming the code is ported.
  2. twisted/internet/test/test_address.py
    1. Once you decide on a type for HostnameEndpoint.hostname, don't forget to update the string literals in HostnameAddressTestCase.
    2. Also, FWIW, I like the naming convention HostnameAddressTests rather than HostnameAddressTestCase. But Twisted has no official standard. :)
  3. twisted/internet/endpoints.py
    1. Please move the new socket import up to the stdlib import section.
    2. HostnameEndpoint has the same decision to make about its host parameter (and must come to the same conclusion, of course)
    3. @type for host in HostnameEndpoint.__init__ could use some L{} markup
    4. It's nice to indent second and subsequent lines of a @param item - the timeout markup currently has an unindented second line.
    5. Sadly, the definition/use of HostnameEndpoint._deferToThread breaks HostnameEndpoint for real world usage. The tests are happy because they patch a HostnameEndpoint *instance*. However, when HostnameEndpoint is used without having this attribute patched in that way, the free-function gets wrapped up in a bound method and gets the automatic first self argument passed to it. This fails of course, since deferToThread does not need or want a self argument there. This is easily fixed in a number of ways - for example, staticmethod(deferToThread) inhibits the creation of a bound method. The tricky part is writing a unit test to prove it. :)
  4. twisted/internet/test/test_endpoints.py
    1. MemoryReactorWithConnectorsAndTime no longer exists. I guess you might want twisted.test.proto_helpers.MemoryReactorClock now.
    2. RaisingMemoryReactorWithClock should call RaisingMemoryReactor.__init__ instead of setting those private attributes itself.
    3. HostnameEndpointsOneIPv4TestCase.createClientEndpoint needs a docstring
    4. HostnameEndpointsOneIPv6TestCase.createClientEndpoint` needs one too
  5. twisted/internet/endpoints.py
    1. There is no test coverage for the not pending condition in checkDone
    2. There is no test coverage for the not successful condition in checkDone
    3. There is no test coverage for address families other than AF_INET and AF_INET6 coming back from getaddrinfo in _endpoints (something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior)
    4. There's no test coverage for lc.running being False in afterConnectionAttempt
    5. There's no test coverage for handling of an exception out of the lines between 774 and 778. (I suspect no relevant exceptions can come from those lines, so there may be no reason to have the exception handling there at all.)
    6. Overall, I suspect connect could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. At this point, perhaps it would be a better idea to finish this version of the code and then come back and refactor it later, though.
  6. twisted/topfiles/4859.feature
    1. I might try to say something about selecting the address for the hostname for which a connection can be set up in the least time, but maybe it's hard to get all that information into a little fragment and what's there now is fine. Probably HostnameEndpoint.__doc__ could be made a bit more clear in either case, though. For example, a hostname with only IPv4 addresses is a perfectly good use-case for this endpoint - the faster of the two will still be selected, regardless of the fact that no IPv6 is ever involved.

Thanks again!

comment:27 Changed 15 months ago by ashfall

  • Branch changed from branches/gai-endpoint-4859-4 to branches/gai-endpoint-4859-5

(In [39245]) Branching to 'gai-endpoint-4859-5'

comment:28 in reply to: ↑ 26 ; follow-up: Changed 15 months ago by ashfall

  • Keywords review added
  • Owner ashfall deleted

Replying to exarkun:

Thanks for the review. Forced builds.

  1. twisted/internet/address.py
    1. HostnameAddress needs to decide if its hostname attribute is a byte string or a text string. New code mostly needs to avoid str and "" and instead go with either bytes and b"" or unicode and u"". I suspect hostname is a byte string, but I haven't thought about it too hard yet.

I went with bytes and b"".

  1. twisted/internet/endpoints.py
    1. There is no test coverage for the not pending condition in checkDone
    2. There is no test coverage for the not successful condition in checkDone
    3. There is no test coverage for address families other than AF_INET and AF_INET6 coming back from getaddrinfo in _endpoints (something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior)
    4. There's no test coverage for lc.running being False in afterConnectionAttempt

5.1, 5.2, and 5.3 are addressed in test_endpointConnectFailureAfterIteration and test_endpointConnectSuccessAfterIteration. It was suggested that some refactoring could be done to isolate the tests for a single aspect of the behavior, avoid refactoring, and strive for one-assertion-per-test.

A few questions about that: Would a separate test case for these tests look better? If not, then should I add the same tests to different test cases, even though they would be testing pretty much the same thing again?

  1. Overall, I suspect connect could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. At this point, perhaps it would be a better idea to finish this version of the code and then come back and refactor it later, though.

I'll file a ticket once this is ready to merge.

comment:29 in reply to: ↑ 28 Changed 15 months ago by ashfall

  1. twisted/internet/endpoints.py
    1. There is no test coverage for the not pending condition in checkDone
    2. There is no test coverage for the not successful condition in checkDone
    3. There is no test coverage for address families other than AF_INET and AF_INET6 coming back from getaddrinfo in _endpoints (something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior)
    4. There's no test coverage for lc.running being False in afterConnectionAttempt

5.1, 5.2, and 5.3 are addressed in test_endpointConnectFailureAfterIteration and test_endpointConnectSuccessAfterIteration. It was suggested that some refactoring could be done to isolate the tests for a single aspect of the behavior, avoid refactoring, and strive for one-assertion-per-test.

Sorry, that should be 5.1, 5.2, and 5.4.

comment:30 Changed 14 months ago by exarkun

  • Keywords review removed
  • Owner set to ashfall

Thanks for pressing on with this. :)

  1. There are some merge conflicts to resolve. :(
  2. Please add this endpoint to the endpoints documentation
  3. For the docstring of HostnameEndpoint.connect, instead of saying "a connection that is the fastest" I would say something like "the connection which is established first" (to avoid implying anything about the data rate of the resulting connection).
  4. In attemptConnection, in the nested function checkDone, please use () to split the expression across multiple lines, rather than \.
  5. In the test suite, I think there should be a test similar to test_freeFunctionDeferToThread for the getaddrinfo attribute (though I expect it will pass without changes to the implementation).

Otherwise, I think this is in good shape. Please address these points and then merge.

comment:31 Changed 14 months ago by ashfall

  • Branch changed from branches/gai-endpoint-4859-5 to branches/gai-endpoint-4859-6

(In [39679]) Branching to 'gai-endpoint-4859-6'

comment:32 Changed 14 months ago by ashfall

  • Resolution set to fixed
  • Status changed from new to closed

(In [39702]) Merge gai-endpoint-4859-6: Add HostnameEndpoint, a TCP client client endpoint that connects to a hostname as quickly as possible

Author: ashfall
Reviewers: exarkun, tomprince
Fixes: #4859

comment:33 Changed 14 months ago by ashfall

Filed follow up tickets - #6697 and #6698

Note: See TracTickets for help on using tickets.