Ticket #2022 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

ConnectionRefusedError's errno and strerror surprising on BSD/OSX

Reported by: dalke
Assigned to: exarkun
Type: defect
Priority: highest
Milestone:
Component: core
Keywords:
Cc: spiv, exarkun, jerub
Branch:
Author:
Launchpad Bug:

Description (last modified by exarkun)

When a non-blocking connect fails on BSD, the attempt sets errno to EINVAL. Twisted currently handles this, turning that errno into a ConnectionRefusedError instance and passing it on to application code. However, it preserves the exact errno reported by the operating system. For people accustomed to the Linux behavior (ECONNREFUSED), BSD's behavior seems surprising and less useful.

At the very least, this platform inconsistency should be documented so that users know to rely on the exception type rather than the errno itself.

I can't think of any code changes which make sense here. In particular, the platform errno must be preserved exactly, not mangled into something less surprising.

Change History

  2006-08-29 00:34:43+00:00 changed by spiv

  • cc set to spiv

  2006-08-29 04:28:43+00:00 changed by jerub

  • owner changed from glyph to jerub

I'll take this ticket; I've got an OS X machine to hack with.

This is the smallest test case I could come up with.

import socket, errno, os
addr = ('127.0.0.1', 8081)
s = socket.socket()
s.setblocking(0)
while 1:
    ret = s.connect_ex(addr)
    if ret in (errno.EINPROGRESS, errno.EALREADY):
        continue
    assert ret == errno.ECONNREFUSED, "Ret was %d:%s" % (ret, os.strerror(ret))
    break
print 'Success'
OSX$ python foo.py
Traceback (most recent call last):
  File "foo.py", line 12, in ?
    assert ret == errno.ECONNREFUSED, "Ret was %d:%s" % (ret, os.strerror(ret))
AssertionError: Ret was 22:Invalid argument
Linux$ python foo.py
Success

  2006-08-29 06:35:56+00:00 changed by jerub

Apparently smart people have seen this before: http://cr.yp.to/docs/connect.html

  2006-08-29 13:33:02+00:00 changed by exarkun

  • cc changed from spiv to spiv, exarkun, jerub
  • keywords set to documentation
  • description deleted
  • summary changed from "connection refused error sets errno to EINVAL instead of ECONNREFUSED" to "ConnectionRefusedError's errno and strerror surprising on BSD/OSX"

Previously, the description of this ticket was (with minor edits):

<dalke> Hi all.  I was trying to track down an unexpected error value in Twisted
          last night.
<dalke> When I use the http_client code to connect to a site which isn't there I
          expected to get an error code of ECONNREFUSED
<_moshez> why did you expect such a thing
<_moshez> or rather, what do you mean by "isn't there"
<dalke> But the Twisted client code return an error object of
          ConnectionRefusedError but with error code of EINVAL
<_moshez> no DNS? no such IP? IP exists, fw stops? IP exists, nothing listening?
<dalke> No server was running.  Nothing was on the port.
<_moshez> strange then :(
<dalke> localhost, port 8081
<dalke> I looked around and others report the same error code, eg
<dalke> http://twistedmatrix.com/pipermail/twisted-python/2003-August/005396.html
<dalke> http://archives.free.net.ph/message/20031215.101444.41b90ae4.en.html
<dalke> both have "ConnectionRefusedError" and "22: Invalid argument"
<dalke> The error handling is in internet.errors where there's a lookup table
          mapping
<dalke>         # for FreeBSD - might make other unices in certain cases
<dalke>         # return wrong exception, alas
<dalke>         errno.EINVAL: ConnectionRefusedError,
<dalke> I'm in OS X.
<dalke> When I tried the connect_ex call by hand (the source of the error is from
          internet/tcp.py) I get the expected ECONNREFUSED
<dalke> I was trying out both Twisted and Allegra and Allegra also reports the
          expected ECONNREFUSED
<dalke> I tried following the code through but couldn't figure out the problem.
<idnar> dalke: what platform are you on?
<dalke> OS X
<idnar> do you have a minimal reproduction script?
<dalke> The URLs I give above show reproducible on HP-UX.
<dalke> from twisted.internet import reactor
<dalke> from twisted.web import client
<dalke> def handleCallback(response):
<dalke>     print response
<dalke>     reactor.stop()
<dalke> def handleErrback(err):
<dalke>     print "Error:", err
<dalke>     reactor.stop()
<dalke> get_page = client.getPage("http://localhost:8081/")
<dalke> get_page.addCallbacks(handleCallback, handleErrback)
<dalke> reactor.run()
<dalke> That's the reproducible.  For minimal you don't need the success callback.
<idnar> pastebin would have been better; but let me try that locally
<idnar> anyhow, I get ECONNREFUSED running that script on my Linux box
<dalke> Or you could go
          http://dalkescientific.com/writings/diary/archive/2006/08/28/levels_of_abstraction.html
<idnar> twisted.internet.error.ConnectionRefusedError: Connection was refused by
          other side: 111: Connection refused.
<dalke> Oh, and this is CVS version of Twisted from two days ago.
<dalke> Err, "svn"
<dalke> Might be a BSD vs. SysV thing.
<idnar> *nod* I'm also running recent svn trunk
<idnar> so I guess there's some platform-specific weirdness at work here
<dalke> There could also be a timing issue.
<exarkun> dalke: Does the errno actually matter, given that you are being given a
          ConnectionRefusedError exception instance?
<dalke> I tracked two calls to connect_ex, first returning EINPROGRESS then
          returning EINVAL
<exarkun> isn't that enough to determine that there was no server listening?
<dalke> What error message do I show to a user and how do I localize the message?
<dalke> There are two text strings now, one from the docstring of the class and the
          other from the errno.
<exarkun> dalke: Ah, I see
<dalke> The errno of "invalid parameter" suggests to the user that there was a
          programming error or a typo.
<dalke> The one from the docstring is right, but it's not the normal posix one.
<dalke> I also see EINVAL returns that error class when there are potentially other
          reasons to return EINVAL which don't come from "connection refused"
<exarkun> I guess there should be more platform special-casing
<dalke> I don't know network programming anywhere near well enough to offer a
          useful response.
<exarkun> Could you file a bug report at <http://twistedmatrix.com/>?  I might even
          have an OS X machine to do some testing on soon.

This log doesn't quite make clear that Twisted already handles EINVAL on BSD correctly but is missing another feature.

  2006-08-30 16:46:46+00:00 changed by jknight

Just to repeat things discussed in IRC: The error is calling connect() again and expecting that to return the correct error. EINVAL isn't the right errno to be returning -- that's not the original error. That's the error for calling connect() a second time on a socket whose connection attempt has already failed.

We should not be repeatedly calling connect(). The link above shows better things to do, of which two seem promising: getsockopt(.., SO_ERROR), or getpeername(). They should be tested on the supported platforms to see if they function as expected.
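
For illustration, the getsockopt(.., SO_ERROR) option could look roughly like the sketch below. This is not the code from the branch; it is a minimal standalone example in Python 3, and the helper name connect_error, the timeout parameter, and the use of select() to wait for the socket are all assumptions made for the example. The idea is to call connect once, wait for the socket to become writable (or flagged as exceptional), and then read the pending error from the socket instead of calling connect a second time.

```python
import errno
import os
import select
import socket


def connect_error(addr, timeout=5.0):
    """Attempt one non-blocking connect to addr and report its outcome.

    Hypothetical helper: returns 0 on success, the errno of the failed
    attempt otherwise, or None if nothing happened within `timeout`.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        # Call connect exactly once; never retry it to learn the error.
        ret = s.connect_ex(addr)
        if ret not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            return ret
        # Wait for the attempt to finish. Some platforms report failure
        # through the exceptional set, so watch both sets.
        _, writable, exceptional = select.select([], [s], [s], timeout)
        if not writable and not exceptional:
            return None
        # SO_ERROR holds the pending error of the original attempt.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        s.close()


if __name__ == "__main__":
    err = connect_error(("127.0.0.1", 8081))
    if err is None:
        print("timed out")
    elif err == 0:
        print("connected")
    else:
        print("connect failed: %d: %s" % (err, os.strerror(err)))
```

Whether SO_ERROR reports ECONNREFUSED on every supported platform is exactly what would still need testing, as noted above.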

  2006-09-05 02:02:08+00:00 changed by exarkun

  • keywords changed from documentation to review
  • owner deleted
  • priority changed from normal to highest

Ready for review in einval-osx-2022. As James mentioned above, the approach taken is to check SO_ERROR instead of relying on a second connect attempt to report a sensible error.

  2006-09-06 11:02:09+00:00 changed by jerub

  • keywords deleted
  • owner set to exarkun

The test is really quite complicated. Why involve a random number generator in something that should be completely deterministic? Have you experienced test failures where this approach (retrying up to 10 times to get the correct behaviour) has been required?

The changes apart from the tests look good.

Provided there's a good reason that the test looks the way it does (I seem to recall exarkun mentioning non-deterministic tests on IRC), please merge; otherwise, please simplify it so that it doesn't deal a deck of 10 random modules. Please note in the docstring why test_connectionRefusedErrorNumber is the way it is.

  2006-09-06 18:04:00+00:00 changed by exarkun

  • status changed from new to closed
  • resolution set to fixed

(In [18064]) Merge einval-osx-2022

Author: exarkun
Reviewer: jerub
Fixes #2022

This changes the TCP connection error checking to use getsockopt(SOL_SOCKET, SO_ERROR) to discover connection failure errnos, rather than relying on the errno set by a second call to connect().
