Ticket #4362 enhancement new

Opened 3 years ago

Last modified 4 months ago

Introduce a new resolver interface (for use by the reactor) which supports multiple addresses and multiple address families

Reported by: exarkun Owned by:
Priority: normal Milestone:
Component: core Keywords: ipv6
Cc: jknight, tom.most@… Branch: branches/getaddrinfo-4362-2
(diff, github, buildbot, log)
Author: exarkun, rwall Launchpad Bug:

Description

The current resolver interface, IResolverSimple.getHostByName, is modeled on gethostbyname(3) and inherits the limitations of that API:

  • Only one address can be returned
  • The desired address family cannot be specified

IResolverSimple tries to account for the latter by indicating that IPv6 addresses will be returned in preference to IPv4 addresses, but this leads to bugs such as #4212, since the reactor can't actually do anything with an IPv6 address.

getaddrinfo(3) was introduced to address the shortcomings of gethostbyname(3). It allows multiple addresses to be returned, and includes information about the address family of each (although this may be redundant with the address itself, since an ipv4 literal and an ipv6 literal can be distinguished from each other anyway).

It also mixes in getservbyname(3) functionality, accepting a service name and including a port number as an attribute of each element in the result.

We should introduce an interface like getaddrinfo(3) and switch the reactor to prefer it. We can register an adapter from the old interface to continue to support users of it. The new interface will eventually provide better IPv6 functionality, though, and if used will avoid issues like #4212.

Here's a proposal for the actual interface:

class INameResolver(Interface):
    def getAddressInformation(name, service, family=None, socktype=None, proto=None, flags=None):
        """
        Look up the address information associated with the given name.

        @param name: A hostname to resolve.
        @type name: C{str}

        @param service: A port number or the name of the service for which to find address
            information.  For example, C{22} or C{"ssh"}.
        @type service: C{int} or C{str}

        @param family: If specified, limit results to addresses from this family.  Must be one
            of the address family constants from the socket module.  For example,
            L{socket.AF_INET}.

        @param socktype: If specified, limit results to addresses for this socket type.  Must be
            one of the socket type constants from the socket module.  For example,
            L{socket.SOCK_STREAM}.

        @param proto: If specified, limit results to addresses for this socket protocol.  Must
            be one of the protocol constants from the socket module.  For example,
            L{socket.IPPROTO_TCP}.

        @param flags: A bitvector specifying zero or more of the following::

            - Yea right.  Go read the `getaddrinfo(3)` man page.

        @raise ValueError: If one of the specified flags is not supported by the
            implementation.  All flags are optional.

        @return: A L{Deferred} which will fire when the resolution completes.  If resolution is
            successful, the result will be a list of objects with the following attributes:

            - C{family}: the family of this address
            - C{type}: the type of this address
            - C{protocol}: the protocol of this address
            - C{canonicalName}: the canonical name associated with this address, or an empty
                                string.
            - C{address}: The actual address itself, including a port number.  This is suitable
                          to be passed directly to L{socket.socket.connect}.
        """

Attachments

getaddrinfo-4362-3-1.patch Download (14.4 KB) - added by rwall 4 months ago.
Merged forward and resolved conflicts with trunk - contains broken tests

Change History

1

  Changed 3 years ago by exarkun

  • branch set to branches/getaddrinfo-4362
  • branch_author set to exarkun

(In [28686]) Branching to 'getaddrinfo-4362'

2

  Changed 3 years ago by exarkun

Got a reasonable start on this. Some things I know are left to do:

  1. Finish testing all the cases of _ResolverComplexifier
  2. Add something like ThreadedResolver for INameResolver
  3. Make reactor.resolver be an IResolverSimple again
  4. Add an adapter from INameResolver to IResolverSimple to preserve backwards compatibility of reactor.resolver
  5. Think about a new reactor name resolution interface (and think about not providing one).
  6. Deprecate IReactorCore.resolve
  7. Deprecate reactor.resolver

3

  Changed 2 years ago by <automation>

  • owner exarkun deleted

4

  Changed 2 years ago by exarkun

#4912 was a duplicate of this.

5

  Changed 2 years ago by jknight

  • keywords ipv6 added

6

  Changed 15 months ago by glyph

I think a Deferred is the wrong type for this new API to return.

One of the deficiencies of getaddrinfo is that it blocks. Obviously a Deferred resolves that. But another deficiency of getaddrinfo is that it doesn't complete (blocking or not) until all of its results are ready; certain types of requests (for example: mdns) may take significantly longer than others, and it would be great to get a successful result quickly if that is possible. Platform-specific interfaces such as  getaddrinfo_a and  CFHostStartInfoResolution can invoke callbacks for each name-resolution attempt, whether it succeeded or failed.

7

follow-up: ↓ 8   Changed 15 months ago by jknight

I disagree. I don't think it makes sense or is useful to return partial results for one name look up. Getaddrinfo_a doesn't. It takes a *list* of hostnames to resolve. This API doesn't and shouldn't do that, so a Deferred is sufficient.

8

in reply to: ↑ 7   Changed 15 months ago by glyph

Replying to jknight:

I disagree. I don't think it makes sense or is useful to return partial results for one name look up. Getaddrinfo_a doesn't. It takes a *list* of hostnames to resolve. This API doesn't and shouldn't do that, so a Deferred is sufficient.

OK. Thanks for pointing out that detail; looking at the definition of struct gaicb now it is obvious, of course. I guess a Deferred that fires a list is the right structure after all.

9

  Changed 15 months ago by exarkun

Asynchronous stuff is hard in C. It's easier in Twisted, I hope. "glibc can't do incremental asynchronous resolution" sounds like a bad justification for "Twisted shouldn't do incremental asynchronous resolution".

Maybe there's another reason, though?

Changed 4 months ago by rwall

Merged forward and resolved conflicts with trunk - contains broken tests

10

  Changed 4 months ago by rwall

  • status changed from new to assigned
  • owner set to rwall

11

  Changed 4 months ago by rwall

  • branch changed from branches/getaddrinfo-4362 to branches/getaddrinfo-4362-2
  • branch_author changed from exarkun to exarkun, rwall

(In [37217]) Branching to 'getaddrinfo-4362-2'

12

  Changed 4 months ago by rwall

(In [37218]) merge forward by applying getaddrinfo-4362-3-1.patch. refs #4362

13

  Changed 4 months ago by rwall

14

  Changed 4 months ago by rwall

(In [37251]) Made test_failure pass. Instead of passing an explicit failure to FakeResolver, just request an unknown name.

Also some smaller changes...

  • Various small docstring improvements
  • Fix a twistedchecker comment init caps warning.
  • Add doc strings for FakeResolver
  • Fix long line
  • Add docstrings for MemoryNameResolver
  • More complete docstrings for _ResolverComplexifier
  • Add docstrings for test_base.py
  • Add docstrings for AddressInformation
  • Added a placeholder ThreadedNameResolver to make the tests happier.

refs #4362

15

  Changed 4 months ago by rwall

  • status changed from assigned to new
  • owner rwall deleted

Hi exarkun,

I've done a bit more work on this ticket but now I need some guidance.

First here's what I've done on top of you original branch (in log:branches/getaddrinfo-4362-2)

  1. Merged forward resolving some conflicts
  2. Added ThreadedNameResolver + tests
  3. Added a timeout parameter to INameResolver.getAddressInformation
  4. Registered an IResolverSimple to INameResolver adapter.

I think that covers points 1 and 2 in your comment:2 (above)

I have some questions about your other points.

Make reactor.resolver be an IResolverSimple again

Is this for backwards compatibility? So instead of adapting the resolver in reactor.installResolver... we would do the adaption inside reactor.resolve everytime it's called. or not at all.

Add an adapter from INameResolver to IResolverSimple to preserve backwards compatibility of reactor.resolver

So maybe you meant

def installResolver(self, resolver):
    self._resolver = INameResolver(resolver)

reactor.resolver then becomes a descriptor which issues a deprecation warning and returns IResolverSimple(reactor._resolver)

Think about a new reactor name resolution interface (and think about not providing one).

So reactor.resolve must continue to return an IP address (as per IReactorCore)

In which case why bother changing any of the reactor code in this branch.

Why can't this branch simply be about adding INameResolver, ThreadedNameReactor, and optionally the deprecations below...

Then decide on the reactor interface based on the usage of getaddrinfo in #4859.

  • Deprecate IReactorCore.resolve
  • Deprecate reactor.resolver

At which point I think I understand why you paused work on this branch in the first place :)

Anyway. It's been interesting to learn a little more about the gai etc.

If you have any time to comment I'd be interested hear what you think.

-RichardW.

16

  Changed 4 months ago by twm

  • cc tom.most@… added
Note: See TracTickets for help on using tickets.