Changes between Initial Version and Version 15 of Ticket #4859

07/19/2012 12:34:36 PM (4 years ago)

Updated the description with a more expansive spec. Hopefully this will be useful to ashfall. Others should feel free to expand upon it.


  • Ticket #4859

    • Property Keywords ipv6 added
    • Property Owner set to ashfall
    • Property Cc ralphm added
  • Ticket #4859 – Description

    initial v15  
    1 When a user enters a hostname, and possibly also a port number and / or service description, either implicitly or explicitly, they just want the connection to work.  They probably don't care about IPv4/v6 distinctions, or round-robin DNS, or how the network is set up; they just have a vague idea of where the service is and it should work.
     1= The Goal =
    3 The right way to do this would be to do the following:
     3The user wants to enter the name of a particular host, and connect as quickly as
     4possible. They may also want to enter a port number or service name.
    5   - use getaddrinfo to resolve the name/service all the way, or, if we want a "native" implementation of this, use DNS (which I think is basically the same):
    6     - maybe first do an SRV lookup to determine where a given service is hosted.  (The service type for this lookup might be information implicitly provided by the application rather than the user, but for a NetCat-style tool the user might want to enter it too.)
    7     - do a DNS lookup to get all A/AAAA/CNAME records for the name in question.
    8     - maybe some mDNS too!
    9   - Simultaneously fire off connection attempts for all of the unique resulting IP/port combinations.  (For some protocols or some networks, I suppose you might want to do this in serial instead, but doing it in parallel is likely to be faster.)  When the first one comes back, that's your winner; deliver the connection notification to the application.
    10   - If they all fail, report a failure when the last one fails.
     6The application developer wants to let the user do that, and just use a
     7simple-to-construct endpoint to do all the work involved in that.
    12 I could swear there's an RFC (or possibly several) which describes this, but I can't find it.  The `getaddrinfo` documentation vaguely describes a similar approach but provides a lame implementation.
     9= The Problems =
     11Name resolution and routing are not always sensibly connected.  In particular,
     12it is very common for networks to automatically configure their
     13clients with local IPv6 addresses and happily resolve remote IPv6 addresses, but be
     14misconfigured in such a way as to not route IPv6 past the border gateway.  It
     15isn't even that unusual for the network, or a particular host on it, to publish
     16an internal IPv6 address that, for whatever reason, won't even respond to IPv6
     19The fact that it doesn't route IPv6 at all means that you don't get any feedback
     20that your connection attempt isn't working besides the eventual timeout from
     21your first SYN packet.
     23Of course, IPv6 isn't the only reason a network or nameserver may be
     24misconfigured in this way.  Un-connectable hosts happen all the time; it's just
     25that this is a particularly common problem that one hits when talking about
     26switching from a naive IPv4-only connection setup to a more sophisticated
     27multi-address-family approach.  Really though, even if you're doing IPv4
     28correctly, you'll hit it sometimes.
     30= The Solution =
     32We should follow [ the relevant specification]
     33and resolve all possible connectable addresses under the given host name /
     34service name combination using `getaddrinfo`.  (While we should not rule out a
     35truly asynchronous version of `getaddrinfo`, this involves trying to parse a lot
     36of platform-specific policy and it would be best to keep that work separate.)
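A minimal sketch of the resolution step, assuming the standard library's blocking `socket.getaddrinfo` is acceptable for illustration. The name `resolve_candidates` is hypothetical and not part of any existing API. It keeps the order in which the resolver returned the addresses and drops duplicates.

```python
import socket


def resolve_candidates(host, service):
    """Return unique (family, sockaddr) pairs in the resolver's preferred order.

    Hypothetical helper for illustration only. getaddrinfo blocks, so an
    event-driven caller would have to run it off the main loop.
    """
    infos = socket.getaddrinfo(host, service, type=socket.SOCK_STREAM)
    seen = set()
    candidates = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        key = (family, sockaddr)
        if key not in seen:
            # First occurrence wins, so the resolver's ordering is preserved.
            seen.add(key)
            candidates.append(key)
    return candidates
```

A call such as `resolve_candidates("example.com", "http")` may return both IPv4 and IPv6 entries, depending on the host's resolver configuration.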
     38Then, as said specification suggests, we should attempt to connect to them in
     39the order in which they are returned, as that is the preferred order.  However,
     40as some addresses may not respond promptly enough, we should initiate several
     41attempts in parallel.
     43If everything's working properly, the first attempt will complete quickly and we
     44won't even make the second one.  If there's a little bit of lag, the first
     45attempt should still have an advantage over the second, because it
     46started earlier and lag should affect both equally; it will complete
     47first, and we will cancel the second one.
     49In the case that one or more of the addresses is going to time out for some
     50reason, the user won't have to wait for every one to time out in turn; they'll
     51be timing out in parallel.
     53In order to conserve resources, and to avoid bugs where user code gets invoked
     54twice, once one connection attempt has succeeded, we should cancel all the
     55outstanding ones.
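The connection strategy described above can be sketched as follows. This uses `asyncio` from the standard library as a stand-in event loop; it is not Twisted's API, and `connect_first` and its `stagger` parameter are hypothetical names. Each element of `factories` is assumed to be a zero-argument callable returning a coroutine that performs one connection attempt.

```python
import asyncio


async def connect_first(factories, stagger=0.25):
    """Start attempts in preference order and return the first success.

    A new attempt starts when the previous one fails or after `stagger`
    seconds, whichever comes first. Outstanding attempts are cancelled
    as soon as one succeeds.
    """
    remaining = list(factories)
    pending = set()
    errors = []
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.ensure_future(remaining.pop(0)()))
            # Only bound the wait while there are more candidates to start.
            timeout = stagger if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
                errors.append(task.exception())
        # Every attempt failed; report once the last one has failed.
        raise OSError("all connection attempts failed: %r" % (errors,))
    finally:
        # Cancel whatever is still in flight so user code is notified once.
        for task in pending:
            task.cancel()
```

If two attempts happen to succeed in the same loop iteration, this sketch returns one and leaves the other connection for the caller to close; a real implementation would need to handle that case.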
     57It would be useful to represent this internally as one unit which converts the
     58hostname/service pair into a list of endpoints, and then a separate unit which
     59implements connecting in parallel to a list of endpoints.  It may be useful in
     60the future to expand the name-resolution portion of this to generate endpoints
     61which do something custom (for example: resolve "hostnames" by looking at an
     62OpenSSH format `ssh_config` file with `Host` lines in it, then doing the process
     63recursively to resolve the real underlying hostnames and using `conch` to
     64actually connect).
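One way to picture the split is to treat the name-resolution unit as a plain callable that can be wrapped. The sketch below is an assumption-laden illustration: `dns_resolver` and `alias_resolver` are hypothetical names, and the `aliases` dictionary merely stands in for entries parsed from an `ssh_config`-style file (no real parsing is done here). The list a resolver returns would then be handed to whatever unit connects to a list of endpoints in parallel.

```python
import socket


def dns_resolver(host, port):
    """Default resolver: name -> ordered list of (address, port) pairs."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(info[4][0], info[4][1]) for info in infos]


def alias_resolver(aliases, fallback, max_depth=8):
    """Wrap `fallback` so aliased names are resolved recursively first.

    `aliases` maps a short name to the real underlying hostname, which may
    itself be an alias. `max_depth` guards against alias loops.
    """
    def resolve(host, port, depth=0):
        if host in aliases:
            if depth >= max_depth:
                raise ValueError("alias chain too deep for %r" % (host,))
            return resolve(aliases[host], port, depth + 1)
        # Not an alias: let the underlying resolver do the real lookup.
        return fallback(host, port)
    return resolve
```

For example, `alias_resolver({"work": "gateway.example.com"}, dns_resolver)("work", 22)` would look up `gateway.example.com` rather than `work`.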