id	summary	reporter	owner	description	type	status	priority	milestone	component	resolution	keywords	cc	branch	branch_author	launchpad_bug
4859	client endpoint: super-smart name-based TCP connection algorithm	glyph	exarkun	"= The Goal =

The user wants to enter the name of a particular host, and connect as quickly as
possible. They may also want to enter a port number or service name.

The application developer wants to let the user do that, and just use a
simple-to-construct endpoint to do all the work involved in that.

= The Problems =

Name resolution and routing are not always sensibly connected.  In particular,
it is very common for a network to automatically configure its clients with
local IPv6 addresses and happily resolve remote IPv6 addresses, yet be
misconfigured so that IPv6 is not routed past the border gateway.  It isn't
even that unusual for the network, or a particular host on it, to publish an
internal IPv6 address that, for whatever reason, won't even respond to IPv6
locally.

Because such a network doesn't route IPv6 at all, you get no feedback that your
connection attempt isn't working other than the eventual timeout of your first
SYN packet.

Of course, IPv6 isn't the only reason a network or nameserver may be
misconfigured in this way.  Un-connectable hosts happen all the time; it's just
that this is a particularly common problem that one hits when switching from a
naive IPv4-only connection to a more sophisticated multi-address-family
approach.  Really though, even if you're doing IPv4 correctly, you'll hit it
sometimes.

= The Solution =

We should follow [http://tools.ietf.org/html/rfc3493 the relevant specification]
and resolve all possible connectable addresses under the given host name /
service name combination using `getaddrinfo`.  (While we should not rule out a
truly asynchronous version of `getaddrinfo`, this involves trying to parse a lot
of platform-specific policy and it would be best to keep that work separate.)
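
As a rough illustration of the resolution step, here is a minimal sketch using
only the standard library's blocking `socket.getaddrinfo`.  The helper name
`resolve_candidates` is hypothetical, and an endpoint would presumably have to
run the blocking call off the reactor thread; none of this is meant as the
eventual implementation.

```python
import socket


def resolve_candidates(host, port):
    '''Return (family, sockaddr) pairs in the order getaddrinfo prefers.

    port may be a number or a service name such as 'http'.
    '''
    infos = socket.getaddrinfo(
        host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    seen = set()
    candidates = []
    for family, socktype, proto, canonname, sockaddr in infos:
        key = (family, sockaddr)
        # Drop exact duplicates but keep the resolver's ordering.
        if key not in seen:
            seen.add(key)
            candidates.append(key)
    return candidates
```

Passing `AF_UNSPEC` asks for every address family the platform knows about, so
IPv4 and IPv6 results arrive in one list, already sorted by the platform's
policy.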

Then, as said specification suggests, we should attempt to connect to them in
the order in which they are returned, as that is the preferred order.  However,
as some addresses may not respond promptly enough, we should initiate several
attempts in parallel.

If everything's working properly, the first attempt will complete quickly and we
won't even make the second one.  If there's a little bit of lag, the first
attempt should still have an advantage over the second: it was initiated
earlier, and lag should affect both equally, so it will complete first and we
will cancel the second one.

In the case that one or more of the addresses is going to time out for some
reason, the user won't have to wait for every one to time out in turn; they'll
be timing out in parallel.

In order to conserve resources, and to avoid bugs where user code gets invoked
twice, once one connection attempt has succeeded, we should cancel all the
outstanding ones.
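
The connection strategy described above could be sketched roughly as follows.
To stay self-contained this uses `asyncio` from the standard library rather
than Twisted, so it only illustrates the control flow.  The name
`connect_first`, the `stagger` delay, and the shape of `attempts` (zero-argument
callables that each return an awaitable) are all assumptions made for the
example; the delay in particular is one reading of the remark that a healthy
first attempt means the second is never made.

```python
import asyncio


async def connect_first(attempts, stagger=0.3):
    '''Try attempts in order and return the first successful result.

    A new attempt starts when the previous one fails, or after stagger
    seconds if nothing has finished yet.
    '''
    remaining = list(attempts)
    pending = set()
    errors = []
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.ensure_future(remaining.pop(0)()))
            # Only use a timeout while there is another attempt to start.
            timeout = stagger if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    # A real endpoint would also close any other attempt
                    # that happened to succeed at the same moment.
                    return task.result()
                errors.append(task.exception())
        raise OSError('all connection attempts failed: %r' % (errors,))
    finally:
        # Cancel whatever is still outstanding, on success or failure.
        for task in pending:
            task.cancel()
```

Each attempt might be something like `lambda: asyncio.open_connection(addr,
port)` for one resolved address.  Because the cancellation lives in the
`finally` clause, the caller sees exactly one result and no attempt outlives
the call.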

It would be useful to represent this internally as one unit which converts the
hostname/service pair into a list of endpoints, and then a separate unit which
implements connecting in parallel to a list of endpoints.  It may be useful in
the future to expand the name-resolution portion of this to generate endpoints
which do something custom (for example: resolve ""hostnames"" by looking at an
OpenSSH format `ssh_config` file with `Host` lines in it, then doing the process
recursively to resolve the real underlying hostnames and using `conch` to
actually connect).
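
To make the `ssh_config` idea a little more concrete, here is a small sketch of
the lookup such a custom name-resolution unit might start from.  The function
names are hypothetical, and the parsing is deliberately simplified: it ignores
wildcard patterns, the `Keyword=value` syntax, and everything except `Host` and
`HostName`.

```python
def parse_ssh_hosts(config_text):
    '''Map each Host alias to the HostName given in its block, if any.'''
    aliases = {}
    current = []
    for line in config_text.splitlines():
        parts = line.split(None, 1)
        # Skip blank lines, comments, and keywords without a value.
        if len(parts) != 2 or parts[0].startswith('#'):
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == 'host':
            current = value.split()
        elif keyword == 'hostname':
            for alias in current:
                # The first value seen for an alias wins.
                aliases.setdefault(alias, value)
    return aliases


def real_hostname(name, aliases):
    '''Return the underlying hostname for name, or name itself.'''
    return aliases.get(name, name)
```

The result of `real_hostname` would then go through the ordinary
hostname/service resolution, which is the recursive step mentioned above.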
"	enhancement	assigned	normal		core		endpoint ipv6 review	twistedmatrix.com@…	branches/gai-endpoint-4859-4	ashfall	
