Opened 7 years ago

Last modified 3 years ago

#2790 defect assigned

UDP Transport write() raises socket.error EWOULDBLOCK

Reported by: Peaker Owned by: glyph
Priority: normal Milestone:
Component: core Keywords:
Cc: Branch: branches/udp-wouldblock-2790
(diff, github, buildbot, log)
Author: exarkun Launchpad Bug:

Description

I am running Twisted 2.5 on Windows XP.

This is my traceback:

  File "twisted\internet\udp.py", line 169, in write
  File "twisted\internet\udp.py", line 156, in write
socket.error: (10035, 'The socket operation could not complete without blocking')

And:

>>> errno.errorcode[10035]
'WSAEWOULDBLOCK'

I believe it occurs because the socket buffer is full. It began happening when I started sending many datagrams on the UDP socket at the same time (I believed it would queue them).

Twisted transports queue their writes in general, so it would make sense to either queue the data, lose it (this would be "right" as UDP does not guarantee arrival, but quite nasty), or at least document the raised error in that transport.

Attachments (2)

udp-wouldblock-2790-1.patch (5.4 KB) - added by ghazel 4 years ago.
patch with fix and unit tests
udp-wouldblock-2790-2.patch (5.8 KB) - added by ghazel 4 years ago.
patch with fix and unit tests

Download all attachments as: .zip

Change History (25)

comment:1 Changed 7 years ago by Peaker

After a bit more research, I have verified that on both Linux and Windows (The Linux behavior is a reliable IIRC :-) you get EAGAIN/EWOULDBLOCK in sendto calls, when the buffers are full.

This situation is not easily reachable with localhost, because there the send occurs immediately or almost immediately, so its hard to utilize the send buffers.

It is possible to simulate this situation by simply using sendto that will use a network interface that is slower than the rate of data being sent.

udp transport's write can just swallow the exception, however I believe that in many (most?) networks, as long as you send data below the network capacity, packet loss is quite rare, so when a packet is *known* to be lost, it should notify the application, so it can take the appropriate measures rather than losing this information.

Also, knowing about the loss of these packets is a simple mechanism to know the rate at which the network interface (first hop) can send data. In some applications, this first hop is either the only hop, or the narrowest pipe in the route. In these cases, using the sendto EAGAIN/EWOULDBLOCK and the reactor writable event on the socket, an application can implement a very simple and effective flow control that fully utilizes the route without receiver feedback.

Note that if notification about sendto failures exist, that notifications of doWrite (writability) to the application should also be implemented somehow.

comment:2 Changed 7 years ago by glyph

There are two separate issues being discussed here. One is a feature request, for a structured way of notifying the application of detectable packet loss. The other is a bug, for dealing properly with the case where the send buffer is full.

write() is not documented to raise any exceptions, so it shouldn't. Let's fix that first, and make that the scope of this ticket. Later, we might also want to add an API for notifying application code about packet loss. Perhaps we could extend or revise the somewhat dodgy "connectionRefused" method. But I'll talk about that further on another ticket that you open, and not here :).

comment:3 follow-up: Changed 6 years ago by exarkun

#3364 was basically a duplicate of this. Note, however, that it was EAGAIN instead of EWOULDBLOCK. We should handle each of these in this code.

comment:4 Changed 6 years ago by exarkun

  • author set to exarkun
  • Branch set to branches/udp-wouldblock-2790

(In [24333]) Branching to 'udp-wouldblock-2790'

comment:5 Changed 6 years ago by glyph

  • Owner changed from glyph to exarkun

comment:6 in reply to: ↑ 3 ; follow-up: Changed 6 years ago by glyph

Replying to exarkun:

#3364 was basically a duplicate of this. Note, however, that it was EAGAIN instead of EWOULDBLOCK. We should handle each of these in this code.

Actually it said "socket error 11" which is both EAGAIN and EWOULDBLOCK on linux :). On FreeBSD and OS X they're also the same (although they're 35 instead of 11).

Still, we need to handle EAGAIN and EWOULDBLOCK separately because they're different on Windows.

comment:7 in reply to: ↑ 6 Changed 4 years ago by ghazel

Replying to glyph:

Replying to exarkun:

#3364 was basically a duplicate of this. Note, however, that it was EAGAIN instead of EWOULDBLOCK. We should handle each of these in this code.

Actually it said "socket error 11" which is both EAGAIN and EWOULDBLOCK on linux :). On FreeBSD and OS X they're also the same (although they're 35 instead of 11).

Still, we need to handle EAGAIN and EWOULDBLOCK separately because they're different on Windows.

This code from udp.py disagrees with you:

if platformType == 'win32':
    from errno import WSAEWOULDBLOCK as EWOULDBLOCK
    ....
    EAGAIN=EWOULDBLOCK

So, they're both 10035 on Windows.

Changed 4 years ago by ghazel

patch with fix and unit tests

comment:8 Changed 4 years ago by ghazel

  • Keywords review added
  • Owner exarkun deleted

This patch fixes this the easy way, by dropping packets when the buffer is full. It also includes tests to test the behavior of sendto and the effectiveness of the fix.

I can't merge this easily because Combinator is horribly broken on Windows, so if someone could help me out with that I would appreciate it.

comment:9 Changed 4 years ago by exarkun

  • Branch changed from branches/udp-wouldblock-2790 to branches/udp-ewouldblock-2790

(In [29318]) Branching to 'udp-ewouldblock-2790'

comment:10 Changed 4 years ago by exarkun

(In [29319]) Apply udp-wouldblock-2790-1.patch

refs #2790

comment:11 Changed 4 years ago by exarkun

  • Keywords review removed
  • Owner set to ghazel

Thanks for working on this. It'd be nice if, when submitting a brand new approach to a ticket which already has a proposed solution (ie, the previous branch linked from this ticket), you could describe why the old approach should be abandoned. This way the reviewer doesn't have to re-discover this reason on their own. In this case, I think the answer is that a different testing strategy is being preferred? The old test used whatever reactor was in use, but the new one always tests the posix-y udp.Port. Presumably this is good because it makes trial --reactor iocp twisted.test.test_udp work. What about the IOCP behavior of this case?

  1. The errno module is unused in test_udp_internals.py.
  2. test_sendToWouldBlock talks about EMFILE and ENOBUFS.
  3. There's a space missing in the skip message for that test method.
  4. That test method sorta leaks-ish a couple sockets. Not really, I suppose, they'll get closed when the garbage collector eventually finds them. But that might not be fast enough to have trial -u ... work indefinitely.
  5. I'm not sure the skips set at the end of test_udp_internals.py are correct. The tests apply to twisted.internet.udp.Port, regardless of the features of the global reactor. The same probably applies to test_tcp_internals.py, which I guess is where you got the idea.
  6. I forced builds on the branch. One Windows builder failed the new test. :/

Changed 4 years ago by ghazel

patch with fix and unit tests

comment:12 Changed 4 years ago by ghazel

Oh, I couldn't find a way to see if the previous branch mentioned above had any changes at all related to this ticket. To me it looked like a branch which was started with no patches or changes applied, but I see now that is due to Trac not being good. The test_udp_internals.py file was just a copy of test_tcp_internals.py, which is the reason for the odd reactor behavior and other EMFILE mistakes.

  1. Fixed.
  2. Fixed.
  3. Fixed.
  4. Fixed.
  5. Yup. I'm not sure why it's written that way in test_tcp_internals.
  6. Ok, so that basically invalidates testing the behavior of the native sento.

So, looking at your original changes in http://twistedmatrix.com/trac/changeset/24334/branches/udp-wouldblock-2790

I'm satisfied that handles this ticket well enough. The change to the code is effectively the same, and the tests don't try to test the behavior of the native sento, so they don't fail either.

Can we just use that?

comment:13 Changed 4 years ago by bj0

I think I'm getting this bug or something related to it. I tried playing with the example udp client/server from this thread: http://bytes.com/topic/python/answers/855870-problem-writing-fast-udp-server

And it works fine if both client and server are running on the same host, but if you edit it so that they are on different hosts, I get the same error as the last post on that thread.

the stacktrace is:

Unhandled error in Deferred:
Traceback (most recent call last):

File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1170, in run

self.mainLoop()

File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1179, in mainLoop

self.runUntilCurrent()

File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 778, in runUntilCurrent

call.func(*call.args, call.kw)

File "/usr/lib/python2.6/dist-packages/twisted/internet/task.py", line 194, in call

d = defer.maybeDeferred(self.f, *self.a, self.kw)

--- <exception caught here> ---

File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 117, in maybeDeferred

result = f(*args, kw)

File "./udpclient.py", line 16, in sendDatagrams

self.transport.write(msg,('10.10.10.11',8012))

File "/usr/lib/python2.6/dist-packages/twisted/internet/udp.py", line 150, in write

return self.socket.sendto(datagram, addr)

socket.error: [Errno 11] Resource temporarily unavailable

comment:14 Changed 4 years ago by ghazel

  • Branch changed from branches/udp-ewouldblock-2790 to branches/udp-wouldblock-2790
  • Keywords review added

Disregard the attached files, let's use branches/udp-wouldblock-2790

comment:15 Changed 4 years ago by ghazel

  • Owner changed from ghazel to exarkun

I'm not sure how review on this one should work. I think branches/udp-wouldblock-2790 looks good, but I didn't write it, exarkun did. So are we done?

comment:16 Changed 4 years ago by exarkun

  • Owner exarkun deleted

Since you're not the author, you can be a review. So if you think it looks good, then that might mean we should merge it. However, I'm still not completely convinced by the unit tests.

One concern is that we should be testing both the iocp and posixbase reactor implementations of UDP. Both proposed branches only exercise the posixbase implementation.

Another concern is that ultimately I think we want all of our direct reactor-behavior tests to be written in the ReactorBuilder style so they don't rely on lots of builders with different --reactor parameters being passed to trial.

Maybe the current unit tests are good enough to demonstrate the fix is correct and we can improve them later, though.

comment:17 Changed 4 years ago by ghazel

  • Keywords review removed

Wow, nevermind. Bugs.

1) I tried to run your tests on Windows, and got "[Errno 10049] The requested address is not valid in its context", because you're trying to send to ('0.0.0.0', 61207). Setting the interface specifically to 127.0.0.1 in the listenUDP (similar to my tests) fixed it.

2) In twisted/internet/udp.py the value of EAGAIN is explicitly set to be the same as EWOULDBLOCK on Windows. However the actual value of EAGAIN is 11, while the value of EWOULDBLOCK is 10035. That means your tests for EAGAIN fail, since the value of 11 is not caught. I'm not sure if the bug is in udp.py's lies about values, or where you get your errno constants from in the tests.

3) The unittests do run with iocpreactor if you use --reactor iocp, but they fail because there is no handling of EAGAIN and EWOULDBLOCK. It looks like iocpreactor juts copied the sendto code, so copying the fix to iocp should be trivial.

I also got some "Reactor was unclean" errors, but I believe that is due to _testCaseInstance in unittest.py which is long gone in trunk.

comment:18 Changed 4 years ago by ghazel

  • Owner set to exarkun

comment:19 Changed 4 years ago by <automation>

  • Owner exarkun deleted

comment:20 Changed 3 years ago by glyph

I was recently embarrassed to re-discover this bug while answering a question on Stack Overflow.

comment:21 Changed 3 years ago by glyph

#2513 is also related to deficiencies in udp.Port.write's implementation.

comment:22 Changed 3 years ago by glyph

  • Owner set to glyph
  • Status changed from new to assigned

Somebody remind me about this at the next sprint.

comment:23 Changed 3 years ago by itamar

MSDN documentation lists a whole bunch more errors we should be handling in addition to WSAEWOULDBLOCK: http://msdn.microsoft.com/en-us/library/windows/desktop/ms740148%28v=vs.85%29.aspx

We should probably do something for writing similar to what #3396 is doing for reading.

Note: See TracTickets for help on using tickets.