Opened 11 years ago
Last modified 21 months ago
#2790 defect assigned
UDP Transport write() raises socket.error EWOULDBLOCK
Reported by: | Peaker | Owned by: | Glyph |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Keywords: | |
Cc: | i@… | Branch: |
branches/udp-wouldblock-2790
branch-diff, diff-cov, branch-cov, buildbot |
Author: | exarkun |
Description
I am running Twisted 2.5 on Windows XP.
This is my traceback:
File "twisted\internet\udp.py", line 169, in write File "twisted\internet\udp.py", line 156, in write socket.error: (10035, 'The socket operation could not complete without blocking')
And:
>>> errno.errorcode[10035] 'WSAEWOULDBLOCK'
I believe it occurs because the socket buffer is full. It began happening when I started sending many datagrams on the UDP socket at the same time (I believed it would queue them).
Twisted transports queue their writes in general, so it would make sense to either queue the data, lose it (this would be "right" as UDP does not guarantee arrival, but quite nasty), or at least document the raised error in that transport.
Attachments (2)
Change History (26)
comment:1 Changed 11 years ago by
comment:2 Changed 11 years ago by
There are two separate issues being discussed here. One is a feature request, for a structured way of notifying the application of detectable packet loss. The other is a bug, for dealing properly with the case where the send buffer is full.
write()
is not documented to raise any exceptions, so it shouldn't. Let's fix that first, and make that the scope of this ticket. Later, we might also want to add an API for notifying application code about packet loss. Perhaps we could extend or revise the somewhat dodgy "connectionRefused" method. But I'll talk about that further on another ticket that you open, and not here :).
comment:3 follow-up: 6 Changed 10 years ago by
#3364 was basically a duplicate of this. Note, however, that it was EAGAIN
instead of EWOULDBLOCK
. We should handle each of these in this code.
comment:4 Changed 10 years ago by
author: | → exarkun |
---|---|
Branch: | → branches/udp-wouldblock-2790 |
(In [24333]) Branching to 'udp-wouldblock-2790'
comment:5 Changed 9 years ago by
Owner: | changed from Glyph to Jean-Paul Calderone |
---|
comment:6 follow-up: 7 Changed 9 years ago by
Replying to exarkun:
#3364 was basically a duplicate of this. Note, however, that it was
EAGAIN
instead ofEWOULDBLOCK
. We should handle each of these in this code.
Actually it said "socket error 11" which is both EAGAIN and EWOULDBLOCK on linux :). On FreeBSD and OS X they're also the same (although they're 35 instead of 11).
Still, we need to handle EAGAIN
and EWOULDBLOCK
separately because they're different on Windows.
comment:7 Changed 8 years ago by
Replying to glyph:
Replying to exarkun:
#3364 was basically a duplicate of this. Note, however, that it was
EAGAIN
instead ofEWOULDBLOCK
. We should handle each of these in this code.Actually it said "socket error 11" which is both EAGAIN and EWOULDBLOCK on linux :). On FreeBSD and OS X they're also the same (although they're 35 instead of 11).
Still, we need to handle
EAGAIN
andEWOULDBLOCK
separately because they're different on Windows.
This code from udp.py disagrees with you:
if platformType == 'win32': from errno import WSAEWOULDBLOCK as EWOULDBLOCK .... EAGAIN=EWOULDBLOCK
So, they're both 10035 on Windows.
comment:8 Changed 8 years ago by
Keywords: | review added |
---|---|
Owner: | Jean-Paul Calderone deleted |
This patch fixes this the easy way, by dropping packets when the buffer is full. It also includes tests to test the behavior of sendto and the effectiveness of the fix.
I can't merge this easily because Combinator is horribly broken on Windows, so if someone could help me out with that I would appreciate it.
comment:9 Changed 8 years ago by
Branch: | branches/udp-wouldblock-2790 → branches/udp-ewouldblock-2790 |
---|
(In [29318]) Branching to 'udp-ewouldblock-2790'
comment:11 Changed 8 years ago by
Keywords: | review removed |
---|---|
Owner: | set to ghazel |
Thanks for working on this. It'd be nice if, when submitting a brand new approach to a ticket which already has a proposed solution (ie, the previous branch linked from this ticket), you could describe why the old approach should be abandoned. This way the reviewer doesn't have to re-discover this reason on their own. In this case, I think the answer is that a different testing strategy is being preferred? The old test used whatever reactor was in use, but the new one always tests the posix-y udp.Port
. Presumably this is good because it makes trial --reactor iocp twisted.test.test_udp
work. What about the IOCP behavior of this case?
- The
errno
module is unused intest_udp_internals.py
. test_sendToWouldBlock
talks aboutEMFILE
andENOBUFS
.- There's a space missing in the skip message for that test method.
- That test method sorta leaks-ish a couple sockets. Not really, I suppose, they'll get closed when the garbage collector eventually finds them. But that might not be fast enough to have
trial -u ...
work indefinitely. - I'm not sure the skips set at the end of
test_udp_internals.py
are correct. The tests apply totwisted.internet.udp.Port
, regardless of the features of the global reactor. The same probably applies totest_tcp_internals.py
, which I guess is where you got the idea. - I forced builds on the branch. One Windows builder failed the new test. :/
comment:12 Changed 8 years ago by
Oh, I couldn't find a way to see if the previous branch mentioned above had any changes at all related to this ticket. To me it looked like a branch which was started with no patches or changes applied, but I see now that is due to Trac not being good. The test_udp_internals.py file was just a copy of test_tcp_internals.py, which is the reason for the odd reactor behavior and other EMFILE mistakes.
- Fixed.
- Fixed.
- Fixed.
- Fixed.
- Yup. I'm not sure why it's written that way in test_tcp_internals.
- Ok, so that basically invalidates testing the behavior of the native sento.
So, looking at your original changes in http://twistedmatrix.com/trac/changeset/24334/branches/udp-wouldblock-2790
I'm satisfied that handles this ticket well enough. The change to the code is effectively the same, and the tests don't try to test the behavior of the native sento, so they don't fail either.
Can we just use that?
comment:13 Changed 8 years ago by
I think I'm getting this bug or something related to it. I tried playing with the example udp client/server from this thread: http://bytes.com/topic/python/answers/855870-problem-writing-fast-udp-server
And it works fine if both client and server are running on the same host, but if you edit it so that they are on different hosts, I get the same error as the last post on that thread.
the stacktrace is:
Unhandled error in Deferred: Traceback (most recent call last):
File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1170, in run
self.mainLoop()
File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1179, in mainLoop
self.runUntilCurrent()
File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 778, in runUntilCurrent
call.func(*call.args, call.kw)
File "/usr/lib/python2.6/dist-packages/twisted/internet/task.py", line 194, in call
d = defer.maybeDeferred(self.f, *self.a, self.kw)
--- <exception caught here> ---
File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 117, in maybeDeferred
result = f(*args, kw)
File "./udpclient.py", line 16, in sendDatagrams
self.transport.write(msg,('10.10.10.11',8012))
File "/usr/lib/python2.6/dist-packages/twisted/internet/udp.py", line 150, in write
return self.socket.sendto(datagram, addr)
socket.error: [Errno 11] Resource temporarily unavailable
comment:14 Changed 8 years ago by
Branch: | branches/udp-ewouldblock-2790 → branches/udp-wouldblock-2790 |
---|---|
Keywords: | review added |
Disregard the attached files, let's use branches/udp-wouldblock-2790
comment:15 Changed 8 years ago by
Owner: | changed from ghazel to Jean-Paul Calderone |
---|
I'm not sure how review on this one should work. I think branches/udp-wouldblock-2790 looks good, but I didn't write it, exarkun did. So are we done?
comment:16 Changed 8 years ago by
Owner: | Jean-Paul Calderone deleted |
---|
Since you're not the author, you can be a review. So if you think it looks good, then that might mean we should merge it. However, I'm still not completely convinced by the unit tests.
One concern is that we should be testing both the iocp and posixbase reactor implementations of UDP. Both proposed branches only exercise the posixbase implementation.
Another concern is that ultimately I think we want all of our direct reactor-behavior tests to be written in the ReactorBuilder
style so they don't rely on lots of builders with different --reactor
parameters being passed to trial.
Maybe the current unit tests are good enough to demonstrate the fix is correct and we can improve them later, though.
comment:17 Changed 8 years ago by
Keywords: | review removed |
---|
Wow, nevermind. Bugs.
1) I tried to run your tests on Windows, and got "[Errno 10049] The requested address is not valid in its context", because you're trying to send to ('0.0.0.0', 61207). Setting the interface specifically to 127.0.0.1 in the listenUDP (similar to my tests) fixed it.
2) In twisted/internet/udp.py the value of EAGAIN is explicitly set to be the same as EWOULDBLOCK on Windows. However the actual value of EAGAIN is 11, while the value of EWOULDBLOCK is 10035. That means your tests for EAGAIN fail, since the value of 11 is not caught. I'm not sure if the bug is in udp.py's lies about values, or where you get your errno constants from in the tests.
3) The unittests do run with iocpreactor if you use --reactor iocp, but they fail because there is no handling of EAGAIN and EWOULDBLOCK. It looks like iocpreactor juts copied the sendto code, so copying the fix to iocp should be trivial.
I also got some "Reactor was unclean" errors, but I believe that is due to _testCaseInstance in unittest.py which is long gone in trunk.
comment:18 Changed 8 years ago by
Owner: | set to Jean-Paul Calderone |
---|
comment:19 Changed 7 years ago by
Owner: | Jean-Paul Calderone deleted |
---|
comment:20 Changed 6 years ago by
I was recently embarrassed to re-discover this bug while answering a question on Stack Overflow.
comment:21 Changed 6 years ago by
#2513 is also related to deficiencies in udp.Port.write
's implementation.
comment:22 Changed 6 years ago by
Owner: | set to Glyph |
---|---|
Status: | new → assigned |
Somebody remind me about this at the next sprint.
comment:23 Changed 6 years ago by
MSDN documentation lists a whole bunch more errors we should be handling in addition to WSAEWOULDBLOCK: http://msdn.microsoft.com/en-us/library/windows/desktop/ms740148%28v=vs.85%29.aspx
We should probably do something for writing similar to what #3396 is doing for reading.
comment:24 Changed 21 months ago by
Cc: | i@… added |
---|
After a bit more research, I have verified that on both Linux and Windows (The Linux behavior is a reliable IIRC :-) you get EAGAIN/EWOULDBLOCK in sendto calls, when the buffers are full.
This situation is not easily reachable with localhost, because there the send occurs immediately or almost immediately, so its hard to utilize the send buffers.
It is possible to simulate this situation by simply using sendto that will use a network interface that is slower than the rate of data being sent.
udp transport's write can just swallow the exception, however I believe that in many (most?) networks, as long as you send data below the network capacity, packet loss is quite rare, so when a packet is *known* to be lost, it should notify the application, so it can take the appropriate measures rather than losing this information.
Also, knowing about the loss of these packets is a simple mechanism to know the rate at which the network interface (first hop) can send data. In some applications, this first hop is either the only hop, or the narrowest pipe in the route. In these cases, using the sendto EAGAIN/EWOULDBLOCK and the reactor writable event on the socket, an application can implement a very simple and effective flow control that fully utilizes the route without receiver feedback.
Note that if notification about sendto failures exist, that notifications of doWrite (writability) to the application should also be implemented somehow.