Changes between Initial Version and Version 15 of Ticket #4859


Timestamp:
07/19/2012 12:34:36 PM (2 years ago)
Author:
glyph
Comment:

Updated the description with a more expansive spec. Hopefully this will be useful to ashfall. Others should feel free to expand upon it.

  • Ticket #4859

    • Property Keywords ipv6 added
    • Property Owner set to ashfall
    • Property Cc twistedmatrix.com@… added
  • Ticket #4859 – Description

    initial v15  
    1 When a user enters a hostname, and possibly also a port number and / or service description, either explicitly or implicitly, they just want the connection to work.  They probably don't care about IPv4/v6 distinctions, or round-robin DNS, or how the network is set up; they just have a vague idea of where the service is and it should work. 
     1= The Goal = 
    22 
    3 The right way to do this would be to do the following: 
     3The user wants to enter the name of a particular host, and connect as quickly as 
     4possible. They may also want to enter a port number or service name. 
    45 
    5   - use getaddrinfo to resolve the name/service all the way, or, if we want a "native" implementation of this, use DNS (which I think is basically the same): 
    6     - maybe first do an SRV lookup to determine where a given service is hosted.  (The service type for this lookup might be information implicitly provided by the application rather than the user, but for a NetCat-style tool the user might want to enter it too.) 
    7     - do a DNS lookup to get all A/AAAA/CNAME records for the name in question. 
    8     - maybe some mDNS too! 
    9   - Simultaneously fire off connection attempts for all of the unique resulting IP/port combinations.  (For some protocols or some networks, I suppose you might want to do this in serial instead, but doing it in parallel is likely to be faster.)  When the first one comes back, that's your winner; deliver the connection notification to the application. 
    10   - If they all fail, report a failure when the last one fails. 
     6The application developer wants to let the user do that, and just use a 
     7simple-to-construct endpoint to do all the work involved in that. 
    118 
    12 I could swear there's an RFC (or possibly several) which describes this, but I can't find it.  The `getaddrinfo` vaguely describes a similar approach but provides a lame implementation. 
     9= The Problems = 
     10 
     11Name resolution and routing are not always sensibly connected.  In particular, 
     12it is very common for networks to automatically configure their 
     13clients with local IPv6 addresses and happily resolve remote IPv6 addresses, but be 
     14misconfigured in such a way as to not route IPv6 past the border gateway.  It 
     15isn't even that unusual for the network, or a particular host on it, to publish 
     16an internal IPv6 address that, for whatever reason, won't even respond to IPv6 
     17locally. 
     18 
     19The fact that it doesn't route IPv6 at all means that you don't get any feedback 
     20that your connection attempt isn't working besides the eventual timeout from 
     21your first SYN packet. 
     22 
     23Of course, IPv6 isn't the only reason a network or nameserver may be 
     24misconfigured in this way.  Un-connectable hosts happen all the time; it's just 
     25that this is a particularly common problem that one hits when talking about 
     26switching from a naive IPv4-only connection to a more sophisticated 
     27multi-address-family approach.  Really though, even if you're doing IPv4 
     28correctly, you'll hit it sometimes. 
     29 
     30= The Solution = 
     31 
     32We should follow [http://tools.ietf.org/html/rfc3493 the relevant specification] 
     33and resolve all possible connectable addresses under the given host name / 
     34service name combination using `getaddrinfo`.  (While we should not rule out a 
     35truly asynchronous version of `getaddrinfo`, this involves trying to parse a lot 
     36of platform-specific policy and it would be best to keep that work separate.) 
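
As a rough illustration of the resolution step, the sketch below uses Python's standard `socket.getaddrinfo` (a blocking call, not a Twisted API) to turn a host/service pair into an ordered list of unique candidate addresses. The function name `resolve_candidates` and the stream-only filter are assumptions made for this example.

```python
import socket


def resolve_candidates(host, service):
    """Return unique (family, sockaddr) pairs, preserving getaddrinfo order.

    `resolve_candidates` is a hypothetical helper name, not part of Twisted.
    """
    # Ask only for stream sockets so each address is not repeated per socket type.
    infos = socket.getaddrinfo(host, service, type=socket.SOCK_STREAM)
    seen = set()
    candidates = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        key = (family, sockaddr)
        if key not in seen:
            # Keep the first occurrence so the resolver's preference order survives.
            seen.add(key)
            candidates.append(key)
    return candidates
```

Because the call blocks, a real endpoint would presumably run it in a thread or replace it with an asynchronous resolver; that choice is outside this sketch.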
     37 
     38Then, as said specification suggests, we should attempt to connect to them in 
     39the order in which they are returned, as that is the preferred order.  However, 
     40as some addresses may not respond promptly enough, we should initiate several 
     41attempts in parallel. 
     42 
     43If everything's working properly, the first attempt will complete quickly and we 
     44won't even make the second one.  If there's a little bit of lag, the first 
     45attempt should still have an advantage over the second: it was initiated 
     46earlier and lag should affect both equally, so it will complete first and we 
     47will cancel the second one. 
     48 
     49In the case that one or more of the addresses is going to time out for some 
     50reason, the user won't have to wait for every one to time out in turn; they'll 
     51be timing out in parallel. 
     52 
     53In order to conserve resources, and to avoid bugs where user code gets invoked 
     54twice, once one connection attempt has succeeded, we should cancel all the 
     55outstanding ones. 
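
A minimal sketch of the parallel-connect behaviour described above, written against the standard library's `asyncio` rather than Twisted, so it is not the endpoint implementation this ticket asks for. `connect_first`, the `stagger` delay, and the zero-argument attempt callables are all hypothetical names and choices; the ticket does not specify how long to wait before starting the next attempt.

```python
import asyncio


async def connect_first(attempts, stagger=0.25):
    """Run connection attempts in preference order and return the first success.

    `attempts` is a list of zero-argument callables, each returning an
    awaitable (hypothetical interface).  A new attempt is started every
    `stagger` seconds, or immediately when an earlier attempt fails.
    """
    remaining = list(attempts)
    pending = set()
    errors = []
    try:
        while remaining or pending:
            if remaining:
                # Start the next-preferred attempt.
                pending.add(asyncio.ensure_future(remaining.pop(0)()))
            # Wait for a result; give up waiting after `stagger` seconds
            # only if there is another attempt left to start.
            timeout = stagger if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
                errors.append(task.exception())
        # Every attempt has failed; report once, after the last failure.
        raise ConnectionError("all attempts failed: %r" % (errors,))
    finally:
        # Cancel whatever is still outstanding so user code is not invoked twice.
        for task in pending:
            task.cancel()
```

Each callable could wrap something like `asyncio.open_connection(host, port)`. If two attempts happen to succeed in the same wakeup, this sketch returns one and does not close the other; a real implementation would need to handle that.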
     56 
     57It would be useful to represent this internally as one unit which converts the 
     58hostname/service pair into a list of endpoints, and then a separate unit which 
     59implements connecting in parallel to a list of endpoints.  It may be useful in 
     60the future to expand the name-resolution portion of this to generate endpoints 
     61which do something custom (for example: resolve "hostnames" by looking at an 
     62OpenSSH format `ssh_config` file with `Host` lines in it, then doing the process 
     63recursively to resolve the real underlying hostnames and using `conch` to 
     64actually connect).
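
To make the recursive-resolution idea concrete, here is a small sketch of a name-resolution unit that consults an alias table before delegating to an underlying resolver. The `aliases` mapping stands in for parsed `Host` entries; `resolve_with_aliases` and its `resolve` callback are hypothetical names, and no real `ssh_config` parsing or `conch` connection is attempted.

```python
def resolve_with_aliases(name, aliases, resolve, _seen=None):
    """Follow aliases until a real hostname is reached, then resolve it.

    `aliases` maps alias -> underlying name (a stand-in for parsed config);
    `resolve` is any callable turning a hostname into a list of endpoints.
    """
    seen = set() if _seen is None else _seen
    if name in seen:
        # Guard against alias loops such as a -> b -> a.
        raise ValueError("alias loop detected at %r" % (name,))
    seen.add(name)
    if name in aliases:
        # Recurse on the underlying name the alias points to.
        return resolve_with_aliases(aliases[name], aliases, resolve, seen)
    return resolve(name)
```

The list this returns would be handed to the separate unit that connects in parallel, which keeps the two concerns independent.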