Opened 5 years ago

Last modified 8 months ago

#4362 enhancement new

Introduce a new resolver interface (for use by the reactor) which supports multiple addresses and multiple address families

Reported by: exarkun Owned by: rwall
Priority: normal Milestone:
Component: core Keywords: ipv6
Cc: jknight, tom.most@…, thijs Branch: branches/getaddrinfo-4362-2
(diff, github, buildbot, log)
Author: exarkun, rwall Launchpad Bug:

Description

The current resolver interface, IResolverSimple.getHostByName, is modeled on gethostbyname(3) and inherits the limitations of that API:

  • Only one address can be returned
  • The desired address family cannot be specified

IResolverSimple tries to account for the latter by indicating that IPv6 addresses will be returned in preference to IPv4 addresses, but this leads to bugs such as #4212, since the reactor can't actually do anything with an IPv6 address.

getaddrinfo(3) was introduced to address the shortcomings of gethostbyname(3). It allows multiple addresses to be returned, and includes information about the address family of each (although this may be redundant with the address itself, since an ipv4 literal and an ipv6 literal can be distinguished from each other anyway).

It also mixes in getservbyname(3) functionality, accepting a service name and including a port number as an attribute of each element in the result.

We should introduce an interface like getaddrinfo(3) and switch the reactor to prefer it. We can register an adapter from the old interface to continue to support users of it. The new interface will eventually provide better IPv6 functionality, though, and if used will avoid issues like #4212.

Here's a proposal for the actual interface:

class INameResolver(Interface):
    def getAddressInformation(name, service, family=None, socktype=None, proto=None, flags=None):
        """
        Look up the address information associated with the given name.

        @param name: A hostname to resolve.
        @type name: C{str}

        @param service: A port number or the name of the service for which to find address
            information.  For example, C{22} or C{"ssh"}.
        @type service: C{int} or C{str}

        @param family: If specified, limit results to addresses from this family.  Must be one
            of the address family constants from the socket module.  For example,
            L{socket.AF_INET}.

        @param socktype: If specified, limit results to addresses for this socket type.  Must be
            one of the socket type constants from the socket module.  For example,
            L{socket.SOCK_STREAM}.

        @param proto: If specified, limit results to addresses for this socket protocol.  Must
            be one of the protocol constants from the socket module.  For example,
            L{socket.IPPROTO_TCP}.

        @param flags: A bitvector specifying zero or more of the following::

            - Yea right.  Go read the `getaddrinfo(3)` man page.

        @raise ValueError: If one of the specified flags is not supported by the
            implementation.  All flags are optional.

        @return: A L{Deferred} which will fire when the resolution completes.  If resolution is
            successful, the result will be a list of objects with the following attributes:

            - C{family}: the family of this address
            - C{type}: the type of this address
            - C{protocol}: the protocol of this address
            - C{canonicalName}: the canonical name associated with this address, or an empty
                                string.
            - C{address}: The actual address itself, including a port number.  This is suitable
                          to be passed directly to L{socket.socket.connect}.
        """

Attachments (1)

getaddrinfo-4362-3-1.patch (14.4 KB) - added by rwall 20 months ago.
Merged forward and resolved conflicts with trunk - contains broken tests

Download all attachments as: .zip

Change History (19)

comment:1 Changed 5 years ago by exarkun

  • Author set to exarkun
  • Branch set to branches/getaddrinfo-4362

(In [28686]) Branching to 'getaddrinfo-4362'

comment:2 Changed 5 years ago by exarkun

Got a reasonable start on this. Some things I know are left to do:

  1. Finish testing all the cases of _ResolverComplexifier
  2. Add something like ThreadedResolver for INameResolver
  3. Make reactor.resolver be an IResolverSimple again
  4. Add an adapter from INameResolver to IResolverSimple to preserve backwards compatibility of reactor.resolver
  5. Think about a new reactor name resolution interface (and think about not providing one).
  6. Deprecate IReactorCore.resolve
  7. Deprecate reactor.resolver

comment:3 Changed 4 years ago by <automation>

  • Owner exarkun deleted

comment:4 Changed 4 years ago by exarkun

#4912 was a duplicate of this.

comment:5 Changed 3 years ago by jknight

  • Keywords ipv6 added

comment:6 Changed 2 years ago by glyph

I think a Deferred is the wrong type for this new API to return.

One of the deficiencies of getaddrinfo is that it blocks. Obviously a Deferred resolves that. But another deficiency of getaddrinfo is that it doesn't complete (blocking or not) until all of its results are ready; certain types of requests (for example: mdns) may take significantly longer than others, and it would be great to get a successful result quickly if that is possible. Platform-specific interfaces such as getaddrinfo_a and CFHostStartInfoResolution can invoke callbacks for each name-resolution attempt, whether it succeeded or failed.

comment:7 follow-up: Changed 2 years ago by jknight

I disagree. I don't think it makes sense or is useful to return partial results for one name look up. Getaddrinfo_a doesn't. It takes a *list* of hostnames to resolve. This API doesn't and shouldn't do that, so a Deferred is sufficient.

comment:8 in reply to: ↑ 7 Changed 2 years ago by glyph

Replying to jknight:

I disagree. I don't think it makes sense or is useful to return partial results for one name look up. Getaddrinfo_a doesn't. It takes a *list* of hostnames to resolve. This API doesn't and shouldn't do that, so a Deferred is sufficient.

OK. Thanks for pointing out that detail; looking at the definition of struct gaicb now it is obvious, of course. I guess a Deferred that fires a list is the right structure after all.

comment:9 Changed 2 years ago by exarkun

Asynchronous stuff is hard in C. It's easier in Twisted, I hope. "glibc can't do incremental asynchronous resolution" sounds like a bad justification for "Twisted shouldn't do incremental asynchronous resolution".

Maybe there's another reason, though?

Changed 20 months ago by rwall

Merged forward and resolved conflicts with trunk - contains broken tests

comment:10 Changed 20 months ago by rwall

  • Owner set to rwall
  • Status changed from new to assigned

comment:11 Changed 20 months ago by rwall

  • Author changed from exarkun to exarkun, rwall
  • Branch changed from branches/getaddrinfo-4362 to branches/getaddrinfo-4362-2

(In [37217]) Branching to 'getaddrinfo-4362-2'

comment:12 Changed 20 months ago by rwall

(In [37218]) merge forward by applying getaddrinfo-4362-3-1.patch. refs #4362

comment:14 Changed 19 months ago by rwall

(In [37251]) Made test_failure pass. Instead of passing an explicit failure to
FakeResolver, just request an unknown name.

Also some smaller changes...

  • Various small docstring improvements
  • Fix a twistedchecker comment init caps warning.
  • Add doc strings for FakeResolver
  • Fix long line
  • Add docstrings for MemoryNameResolver
  • More complete docstrings for _ResolverComplexifier
  • Add docstrings for test_base.py
  • Add docstrings for AddressInformation
  • Added a placeholder ThreadedNameResolver to make the tests happier.

refs #4362

comment:15 follow-ups: Changed 19 months ago by rwall

  • Owner rwall deleted
  • Status changed from assigned to new

Hi exarkun,

I've done a bit more work on this ticket but now I need some
guidance.

First here's what I've done on top of you original branch (in
log:branches/getaddrinfo-4362-2)

  1. Merged forward resolving some conflicts
  2. Added ThreadedNameResolver + tests
  3. Added a timeout parameter to INameResolver.getAddressInformation
  4. Registered an IResolverSimple to INameResolver adapter.

I think that covers points 1 and 2 in your comment:2 (above)

I have some questions about your other points.

Make reactor.resolver be an IResolverSimple again

Is this for backwards compatibility?
So instead of adapting the resolver in reactor.installResolver...
we would do the adaption inside reactor.resolve everytime it's
called.
or not at all.

Add an adapter from INameResolver to IResolverSimple to preserve

backwards compatibility of reactor.resolver

So maybe you meant

def installResolver(self, resolver):
    self._resolver = INameResolver(resolver)

reactor.resolver then becomes a descriptor which issues a deprecation
warning and returns IResolverSimple(reactor._resolver)

Think about a new reactor name resolution interface (and think about

not providing one).

So reactor.resolve must continue to return an IP address (as per IReactorCore)

In which case why bother changing any of the reactor code in this
branch.

Why can't this branch simply be about adding INameResolver,
ThreadedNameReactor, and optionally the deprecations below...

Then decide on the reactor interface based on the usage of getaddrinfo
in #4859.

  • Deprecate IReactorCore.resolve
  • Deprecate reactor.resolver

At which point I think I understand why you paused work on this branch
in the first place :)

Anyway. It's been interesting to learn a little more about the gai
etc.

If you have any time to comment I'd be interested hear what you think.

-RichardW.

comment:16 Changed 19 months ago by twm

  • Cc tom.most@… added

comment:17 in reply to: ↑ 15 Changed 13 months ago by thijs

  • Cc thijs added
  • Keywords review added
  • Owner set to exarkun

Replying to rwall:

Hi exarkun,

I've done a bit more work on this ticket but now I need some
guidance.

I have some questions about your other points.
If you have any time to comment I'd be interested hear what you think.

-RichardW.

Assigning it to exarkun for rwall's questions.

comment:18 in reply to: ↑ 15 Changed 8 months ago by glyph

  • Keywords review removed
  • Owner changed from exarkun to rwall

Replying to rwall:

Hi exarkun,

I'll be filling in for exarkun today.

I've done a bit more work on this ticket but now I need some
guidance.

Thanks!

First here's what I've done on top of you original branch (in
log:branches/getaddrinfo-4362-2)

  1. Merged forward resolving some conflicts
  2. Added ThreadedNameResolver + tests
  3. Added a timeout parameter to INameResolver.getAddressInformation
  4. Registered an IResolverSimple to INameResolver adapter.

Great.

I think that covers points 1 and 2 in your comment:2 (above)

I have some questions about your other points.

Make reactor.resolver be an IResolverSimple again

Is this for backwards compatibility?

Yes. The backwards compatibility contract here is that the resolver attribute is an IResolverSimple.

So instead of adapting the resolver in reactor.installResolver...
we would do the adaption inside reactor.resolve everytime it's
called.
or not at all.

Well... you don't need to do that. Just because resolver is a public attribute does not mean you need to consult it each time you do a resolution. You can do the adaptation in installResolver as long as the new attribute you're storing is under a different name. You may even want to make the new attribute public, if there are reasons that callers would want to access it directly.

Add an adapter from INameResolver to IResolverSimple to preserve

backwards compatibility of reactor.resolver

So maybe you meant
...
reactor.resolver then becomes a descriptor which issues a deprecation
warning and returns IResolverSimple(reactor._resolver)

Right, exactly.

Think about a new reactor name resolution interface (and think about

not providing one).

So reactor.resolve must continue to return an IP address (as per IReactorCore)

Yup, them's the breaks. It has a contract, it can't change the contract.

In which case why bother changing any of the reactor code in this branch.

Why can't this branch simply be about adding INameResolver,
ThreadedNameReactor, and optionally the deprecations below...

I'm definitely in favor of smaller branches.

Then decide on the reactor interface based on the usage of getaddrinfo
in #4859.

  • Deprecate IReactorCore.resolve
  • Deprecate reactor.resolver

At which point I think I understand why you paused work on this branch
in the first place :)

Anyway. It's been interesting to learn a little more about the gai
etc.

If you have any time to comment I'd be interested hear what you think.

Did that answer most of your questions about this?

Note: See TracTickets for help on using tickets.