#4859 enhancement closed fixed (fixed)
client endpoint: super-smart name-based TCP connection algorithm
Reported by: | Glyph | Owned by: | ashfall |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Keywords: | endpoint ipv6 |
Cc: | ralphm | Branch: |
branches/gai-endpoint-4859-6
branch-diff, diff-cov, branch-cov, buildbot |
Author: | ashfall |
Description (last modified by )
The Goal
The user wants to enter the name of a particular host, and connect as quickly as possible. They may also want to enter a port number or service name.
The application developer wants to let the user do that, and just use a simple-to-construct endpoint to do all the work involved in that.
The Problems
Name resolution and routing are not always sensibly connected. In particular, it is very common for networks to automatically configure their clients with local IPv6 addresses and happily resolve remote IPv6 addresses, but be misconfigured in such a way as to not route IPv6 past the border gateway. It isn't even that unusual for the network, or a particular host on it, to publish an internal IPv6 address that, for whatever reason, won't even respond to IPv6 locally.
The fact that it doesn't route IPv6 at all means that you don't get any feedback that your connection attempt isn't working besides the eventual timeout from your first SYN packet.
Of course, IPv6 isn't the only reason a network or nameserver may be misconfigured in this way. Un-connectable hosts happen all the time; it's just that this is a particularly common problem that one hits when talking about switching from a naive IPv4 configuration connection to a more sophisticated multi-address-family approach. Really though, even if you're doing IPv4 correctly, you'll hit it sometimes.
The Solution
We should follow the relevant specification
and resolve all possible connectable addresses under the given host name /
service name combination using getaddrinfo
. (While we should not rule out a
truly asynchronous version of getaddrinfo
, this involves trying to parse a lot
of platform-specific policy and it would be best to keep that work separate.)
Then, as said specification suggests, we should attempt to connect to them in the order in which they are returned, as that is the preferred order. However, as some addresses may not respond promptly enough, we should initiate several attempts in parallel.
If everything's working properly, the first attempt will complete quickly and we won't even make the second one. If there's a little bit of lag, the first attempt should still have an advantage over the second by virtue of the fact that it initiated faster and lag should affect them equally, it'll complete first, and we will cancel the second one.
In the case that one or more of the addresses is going to time out for some reason, the user won't have to wait for every one to time out in turn; they'll be timing out in parallel.
In order to conserve resources, and to avoid bugs where user code gets invoked twice, once one connection attempt has succeded, we should cancel all the outstanding ones.
It would be useful to represent this internally as one unit which converts the
hostname/service pair into a list of endpoints, and then a separate unit which
implements connecting in parallel to a list of endpoints. It may be useful in
the future to expand the name-resolution portion of this to generate endpoints
which do something custom (for example: resolve "hostnames" by looking at an
OpenSSH format ssh_config
file with Host
lines in it, then doing the process
recursively to resolve the real underlying hostnames and using conch
to
actually connect).
Change History (33)
comment:1 Changed 8 years ago by
comment:2 Changed 8 years ago by
The order in which you're supposed to try connecting is the order they're returned by getaddrinfo, because it sorts them by various criteria (like being on your local network).
It might be interesting to not wait for the entire connect() timeout before trying another host, but certainly trying all in parallel would be a bad idea. It would be a very not nice thing to do to the servers.
And yea, just use getaddrinfo, by default. SRV lookup needs to be a separate API or at least a default-off flag, because you're not allowed to use SRV records for most protocols (but of course are required to use it for a few).
comment:3 Changed 8 years ago by
Some discussion on dns-operations included a suggestion to try them in staggered parallel. Wait ~RTT before trying the next address, but then try it even if the previous attempt hasn't failed yet. (Calculating the RTT for any arbitrary address is left as an exercise for the reader.)
comment:4 Changed 8 years ago by
comment:5 Changed 8 years ago by
Keywords: | ipv6 added |
---|
comment:6 Changed 7 years ago by
Is this a duplicate of #4362 ? All the suggestions here (except the connection behavior changes) are related to resolver API, and the connection behavior changes would perhaps better be addressed in a ticket specifically for implementing behavior documented in draft-ietf-v6ops-happy-eyeballs-07
comment:7 Changed 7 years ago by
The ticket description does seem to mix two related but orthogonal things together.
#4362 will provide the mechanism for doing getaddrinfo
lookups. This ticket can use the APIs implemented for that ticket to implement an actual client endpoint to do the connection.
The happy-eyeballs draft seems to largely cover what this ticket is trying to accomplish, leaving all of the address resolution to a getaddrinfo
implementation. So perhaps this is the ticket for implementing that behavior.
comment:8 Changed 7 years ago by
I kinda want to do this first; specifically, I'd like to implement it in advance of #4362. I think that we can have something that just does callInThread(getaddrinfo, ...)
internal to the endpoint, and then later, smoothly upgrade to the version that uses a smarter, fancier, reactor name-resolution interface. This could trivially even be made compatible with older third-party reactors by doing IReactorReallyFancyResolver.providedBy(self.reactor)
.
Thoughts?
comment:9 Changed 7 years ago by
I implemented this for Calendar Server, although it is definitely missing some test coverage and probably missing a bunch of necessary knobs and customizations. It may be useful as an initial point of reference, however.
comment:10 Changed 7 years ago by
Owner: | set to ashfall |
---|
comment:11 Changed 7 years ago by
Cc: | ralphm added |
---|
comment:12 follow-up: 13 Changed 7 years ago by
A delay of 0 sec between connection attempts (like the above patch uses) is not great.
Think of people doing traditional DNS round-robin.
The delay can be a tunable number, but should default to something less aggressive. Either infinity (wait for each connect to timeout before trying next one), or at the smallest, something large enough that you could expect most reasonable networks to have connected by then. Say, 1 second.
comment:13 follow-up: 14 Changed 7 years ago by
Replying to jknight:
A delay of 0 sec between connection attempts (like the above patch uses) is not great.
Oh wow, yes, that is terrible. I think I had changed that for debugging and I just never set it back. Thanks for pointing it out.
Think of people doing traditional DNS round-robin.
The delay can be a tunable number, but should default to something less aggressive. Either infinity (wait for each connect to timeout before trying next one), or at the smallest, something large enough that you could expect most reasonable networks to have connected by then. Say, 1 second.
99.9% agreement.
The only thing I would add is that I wish I could find some kind of official document explaining that 1 second is the One True Value for this, or explaining some sort of backoff algorithm, along with a rationale for why it works well and doesn't promote congestion and bufferbloat and all the ills of the world.
comment:14 Changed 7 years ago by
Replying to glyph:
The only thing I would add is that I wish I could find some kind of official document explaining that 1 second is the One True Value for this, or explaining some sort of backoff algorithm, along with a rationale for why it works well and doesn't promote congestion and bufferbloat and all the ills of the world.
It looks like the draft mentioned above is now an RFC: RFC 6555, "Happy Eyeballs: Success with Dual-Stack Hosts". I haven't had time to thoroughly evaluate it but it sounds authoritative, if not necessarily useful :).
comment:15 Changed 7 years ago by
Description: | modified (diff) |
---|
Updated the description with a more expansive spec. Hopefully this will be useful to ashfall. Others should feel free to expand upon it.
comment:16 Changed 7 years ago by
Author: | → ashfall |
---|---|
Branch: | → branches/gai-endpoint-4859 |
(In [35028]) Branching to 'gai-endpoint-4859'
comment:17 Changed 7 years ago by
Branch: | branches/gai-endpoint-4859 → branches/gai-endpoint-4859-2 |
---|
(In [35048]) Branching to 'gai-endpoint-4859-2'
comment:18 Changed 6 years ago by
Branch: | branches/gai-endpoint-4859-2 → branches/gai-endpoint-4859-3 |
---|
(In [37360]) Branching to 'gai-endpoint-4859-3'
comment:19 Changed 6 years ago by
Keywords: | review added |
---|---|
Owner: | ashfall deleted |
So the branch passes most builds, and I think is ready for a review. Build Results
A few points to consider while reviewing:
- In
HostnameEndpoint.connect._endpoint
, what should be the default return value when an address is neither IPv4 nor IPv6?
- The names of the various attributes of
HostnameEndpoint
, should you have a better naming suggestion.
- Does
HostnameEndpoint
go in__all3__
? (Note that python-3.3-tests fail, not sure how to fix that.)
twisted.internet.test.test_endpoints
is now importingMemoryReactorWithConnectorsAndTime
asMemoryReactor
, becauseHostnameEndpoint
requires clock and schedules timed calls.
comment:20 follow-up: 22 Changed 6 years ago by
Keywords: | review removed |
---|---|
Owner: | set to ashfall |
(Note that python-3.3-tests fail, not sure how to fix that.)
You fix that by moving the object
to the end of the base class list. (It should always be at the end)
In HostnameEndpoint.connect._endpoint, what should be the default return value when an address is neither IPv4 nor IPv6?
I don't think any current implementation of getaddrinfo
returns anything else, and if it did, we don't know what to do with it, so we can just ignore it.
getaddrinfo
- An endpoint only supports connecting to stream transports, not datagram transports, so we should tell
getaddrinfo
that. getaddrinfo
will resolve service names, so instead of passingport
directly to toconnectTCP
, we should pass it togetaddrinfo
and use the resolved port when trying to connect.
- An endpoint only supports connecting to stream transports, not datagram transports, so we should tell
- Instead of
you can instead just do
receivedFailures = [] d.addErrback(receivedFailures.append) ... self.assertEqual(len(receivedFailures), 1) failure = receivedFailures[0]
... failure = self.failureResultOf(d)
- You don't need to duplicate
test_nameResolution
. The second one isn't exercising any more code. - The
DNSLookupError
should contain the hostname (and given 1.2, the service name). HostnameEndpoint.__init__
:host
doesn't need to be something that has an IPv6 address. The point of this is to have a way of connecting that doesn't care about address families. Something like "host name to connect to"? If you want to be more detailed, it needs to be something suitable to pass togetaddrinfo
.- Given 1.2, there should be a
service
instead ofport
argument, which will need documentation. timeout
probably needs to be documented here, since it is the timeout for each individual connection, rather than some total timeout.
- (optional) Is there any reason that
_canceller
is method rather than a local function inconnect
like all the other callbacks? (In particular,_pending
could then be made local toconnect
as well) - (minor)
almostDone
would be better namedmaybeDone
orcheckDone
.
Please resubmit for review after addressing the above points.
comment:22 follow-up: 23 Changed 6 years ago by
Keywords: | review added |
---|---|
Owner: | ashfall deleted |
Thanks for the review! New build results
Addressed all points, except 5.2.
Replying to tom.prince:
- Given 1.2, there should be a
service
instead ofport
argument, which will need documentation.
Following a discussion on IRC:
<exarkun> I think the review comment is out of scope, the endpoint should be added with integer port support, and someone can figure out a coherent specification for an additional feature in a new followup ticket.
I'll file the said followup ticket once this is ready to merge. Let me know if you feel otherwise.
comment:23 Changed 6 years ago by
Replying to ashfall:
Thanks for the review! New build results
Addressed all points, except 5.2.
Replying to tom.prince:
- Given 1.2, there should be a
service
instead ofport
argument, which will need documentation.Following a discussion on IRC:
<exarkun> I think the review comment is out of scope, the endpoint should be added with integer port support, and someone can figure out a coherent specification for an additional feature in a new followup ticket.
Just wanted to apologize for piling on in a way which suggested I thought this needed to be implemented on IRC: exarkun is totally right here, we can add service-name resolution later. (And since, as discussed above, getaddrinfo
does not do more interesting stuff like SRV lookups, we will need to add more features later anyway.)
I'll file the said followup ticket once this is ready to merge. Let me know if you feel otherwise.
This sounds like the right decision to me as well.
comment:24 Changed 6 years ago by
Owner: | set to Jean-Paul Calderone |
---|---|
Status: | new → assigned |
comment:25 Changed 6 years ago by
Branch: | branches/gai-endpoint-4859-3 → branches/gai-endpoint-4859-4 |
---|
(In [38389]) Branching to 'gai-endpoint-4859-4'
comment:26 follow-up: 28 Changed 6 years ago by
Keywords: | review removed |
---|---|
Owner: | changed from Jean-Paul Calderone to ashfall |
Status: | assigned → new |
Thanks! Sorry about the long review delay.
- twisted/internet/address.py
HostnameAddress
needs to decide if itshostname
attribute is a byte string or a text string. New code mostly needs to avoidstr
and""
and instead go with eitherbytes
andb""
orunicode
andu""
. I suspecthostname
is a byte string, but I haven't thought about it too hard yet.- Also, you can and should use
L{}
to refer to built-in types now, rather thanC{}
compareAttributes
is missing some whitespace between its elementsHostnameAddress
appears in__all3__
but its tests haven't been added toadmin/_twistedpython3.py
so they won't run on Python 3. This isn't allowed, the tests must run if we're claiming the code is ported.
- twisted/internet/test/test_address.py
- Once you decide on a type for
HostnameEndpoint.hostname
, don't forget to update the string literals inHostnameAddressTestCase
. - Also, FWIW, I like the naming convention
HostnameAddressTests
rather thanHostnameAddressTestCase
. But Twisted has no official standard. :)
- Once you decide on a type for
- twisted/internet/endpoints.py
- Please move the new
socket
import up to the stdlib import section. HostnameEndpoint
has the same decision to make about itshost
parameter (and must come to the same conclusion, of course)@type
forhost
inHostnameEndpoint.__init__
could use someL{}
markup- It's nice to indent second and subsequent lines of a
@param
item - thetimeout
markup currently has an unindented second line. - Sadly, the definition/use of
HostnameEndpoint._deferToThread
breaksHostnameEndpoint
for real world usage. The tests are happy because they patch aHostnameEndpoint
*instance*. However, whenHostnameEndpoint
is used without having this attribute patched in that way, the free-function gets wrapped up in a bound method and gets the automatic firstself
argument passed to it. This fails of course, sincedeferToThread
does not need or want aself
argument there. This is easily fixed in a number of ways - for example,staticmethod(deferToThread)
inhibits the creation of a bound method. The tricky part is writing a unit test to prove it. :)
- Please move the new
- twisted/internet/test/test_endpoints.py
MemoryReactorWithConnectorsAndTime
no longer exists. I guess you might wanttwisted.test.proto_helpers.MemoryReactorClock
now.RaisingMemoryReactorWithClock
should callRaisingMemoryReactor.__init__
instead of setting those private attributes itself.HostnameEndpointsOneIPv4TestCase.createClientEndpoint
needs a docstringHostnameEndpointsOneIPv6TestCase
.createClientEndpoint` needs one too
- twisted/internet/endpoints.py
- There is no test coverage for the
not pending
condition incheckDone
- There is no test coverage for the
not successful
condition incheckDone
- There is no test coverage for address families other than
AF_INET
andAF_INET6
coming back fromgetaddrinfo
in_endpoints
(something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior) - There's no test coverage for
lc.running
beingFalse
inafterConnectionAttempt
- There's no test coverage for handling of an exception out of the lines between 774 and 778. (I suspect no relevant exceptions can come from those lines, so there may be no reason to have the exception handling there at all.)
- Overall, I suspect
connect
could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. At this point, perhaps it would be a better idea to finish this version of the code and then come back and refactor it later, though.
- There is no test coverage for the
- twisted/topfiles/4859.feature
- I might try to say something about selecting the address for the hostname for which a connection can be set up in the least time, but maybe it's hard to get all that information into a little fragment and what's there now is fine. Probably
HostnameEndpoint.__doc__
could be made a bit more clear in either case, though. For example, a hostname with only IPv4 addresses is a perfectly good use-case for this endpoint - the faster of the two will still be selected, regardless of the fact that no IPv6 is ever involved.
- I might try to say something about selecting the address for the hostname for which a connection can be set up in the least time, but maybe it's hard to get all that information into a little fragment and what's there now is fine. Probably
Thanks again!
comment:27 Changed 6 years ago by
Branch: | branches/gai-endpoint-4859-4 → branches/gai-endpoint-4859-5 |
---|
(In [39245]) Branching to 'gai-endpoint-4859-5'
comment:28 follow-up: 29 Changed 6 years ago by
Keywords: | review added |
---|---|
Owner: | ashfall deleted |
Replying to exarkun:
Thanks for the review. Forced builds.
- twisted/internet/address.py
HostnameAddress
needs to decide if itshostname
attribute is a byte string or a text string. New code mostly needs to avoidstr
and""
and instead go with eitherbytes
andb""
orunicode
andu""
. I suspecthostname
is a byte string, but I haven't thought about it too hard yet.
I went with bytes
and b""
.
- twisted/internet/endpoints.py
- There is no test coverage for the
not pending
condition incheckDone
- There is no test coverage for the
not successful
condition incheckDone
- There is no test coverage for address families other than
AF_INET
andAF_INET6
coming back fromgetaddrinfo
in_endpoints
(something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior)- There's no test coverage for
lc.running
beingFalse
inafterConnectionAttempt
5.1, 5.2, and 5.3 are addressed in test_endpointConnectFailureAfterIteration
and test_endpointConnectSuccessAfterIteration
. It was suggested that some refactoring could be done to isolate the tests for a single aspect of the behavior, avoid refactoring, and strive for one-assertion-per-test.
A few questions about that: Would a separate test case for these tests look better? If not, then should I add the same tests to different test cases, even though they would be testing pretty much the same thing again?
- Overall, I suspect
connect
could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. At this point, perhaps it would be a better idea to finish this version of the code and then come back and refactor it later, though.
I'll file a ticket once this is ready to merge.
comment:29 Changed 6 years ago by
- twisted/internet/endpoints.py
- There is no test coverage for the
not pending
condition incheckDone
- There is no test coverage for the
not successful
condition incheckDone
- There is no test coverage for address families other than
AF_INET
andAF_INET6
coming back fromgetaddrinfo
in_endpoints
(something that probably won't happen very often in practice, but it should be easy to test and skipping such things is somewhat important behavior)- There's no test coverage for
lc.running
beingFalse
inafterConnectionAttempt
5.1, 5.2, and 5.3 are addressed in
test_endpointConnectFailureAfterIteration
andtest_endpointConnectSuccessAfterIteration
. It was suggested that some refactoring could be done to isolate the tests for a single aspect of the behavior, avoid refactoring, and strive for one-assertion-per-test.
Sorry, that should be 5.1, 5.2, and 5.4.
comment:30 Changed 6 years ago by
Keywords: | review removed |
---|---|
Owner: | set to ashfall |
Thanks for pressing on with this. :)
- There are some merge conflicts to resolve. :(
- Please add this endpoint to the endpoints documentation
- For the docstring of
HostnameEndpoint.connect
, instead of saying "a connection that is the fastest" I would say something like "the connection which is established first" (to avoid implying anything about the data rate of the resulting connection). - In
attemptConnection
, in the nested functioncheckDone
, please use () to split the expression across multiple lines, rather than \. - In the test suite, I think there should be a test similar to
test_freeFunctionDeferToThread
for thegetaddrinfo
attribute (though I expect it will pass without changes to the implementation).
Otherwise, I think this is in good shape. Please address these points and then merge.
comment:31 Changed 6 years ago by
Branch: | branches/gai-endpoint-4859-5 → branches/gai-endpoint-4859-6 |
---|
(In [39679]) Branching to 'gai-endpoint-4859-6'
comment:32 Changed 6 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Circumventing or re-implementing getaddrinfo is the wrong choice.
Systems permit (are mandated by RFCs, in fact) a site-local policy reordering of the address tuples returned from getaddrinfo - see for example "man gai.conf" on a Linux system. The default sorting policy is mandated in the RFC, and may change in future and (presumably) be retrofitted by system patches. Sites may have deployed by policy changes to this. If you attempt to circumvent getaddrinfo, you will destroy this. At the very least, Twisted applications will have behavior inconsistent with the rest of the system.
Possibly the RFC you are referring to is RFC 3484, which defines address selection rules.
I understand the desire to implement clever things like SRV lookups, but please - don't get clever here. Honour the standards and do things the right way. If you have an interface that takes hostnames, use getaddrinfo.
I don't know what systems getaddrinfo is "lame" on. Perhaps you could go into more detail?