[Twisted-web] web2: reverse dns lookup in twisted reactor thread

glyph at divmod.com glyph at divmod.com
Tue Jul 11 17:53:01 CDT 2006


On Tue, 11 Jul 2006 17:14:58 -0400, glyph at divmod.com wrote:

D'oh.  Hit 'send' too hastily.

>- There's no docstring explaining why the cache is necessary, e.g. why your 
>local caching nameserver isn't good enough.

I want to make it clear that what I'm complaining about here is not that it shouldn't exist - I understand that is trying to not block on repetitive calls - but that this should be explained in a docstring.  Private methods should have maintainer-friendly docstrings.

>- It sets the "host" attribute on the transport, which is not documented by 
>ITransport.

It does this on the address, which _is_ documented.  However, it's documented to be "a dotted-quad IP address".  Something expecting the host to be a name requiring no lookup might pass it to a connection function where it would otherwise have an explicit name-lookup step and that would be bad.

>- As you've noticed, it uses gethostbyaddr rather than the reactor's 
>resolver.  If it really needs to use a hostname, and not a numeric address, 
>it's going to need to return a Deferred.

Also, to clarify: a blocking call like this *might* be fine somewhere that it would definitely be a one-time operation; however, it's exposed to user code, which means that someone expecting a synchronous operation might call it, and then in the case of e.g. a nameserver failure it would block (potentially for a long time).

However, getting the right hostname is a confusing and error-prone activity.  As a data point, this call takes about 30 seconds on my current nameserver, and ends up getting the wrong information anyway (it should be using my NAT-external address, but instead it ends up using my internal address).  It's also made in the server code unconditionally: the host address information is used only in the path where there is no Host: header but we block anyway.  If there is a transient DNS failure, we cache the numeric host name for a process lifetime (e.g. forever) anyway.

Given that I think it would be best to at least be consistent, and always use the numeric in the case where we don't know.



More information about the Twisted-web mailing list