Ticket #5365 defect new

Opened 2 years ago

Last modified 11 months ago

_pollingfile makes wrong assumption about WriteFile

Reported by: antoine Owned by: johnnypops
Priority: normal Milestone:
Component: core Keywords:
Cc: solipsis@… Branch: branches/win32-WriteFile-buffer-5365
(diff, github, buildbot, log)
Author: tomprince Launchpad Bug:

Description

_pollingfile assumes that (win32 API) WriteFile() will allow a partial write, but WriteFile() writes either all or nothing. Which means if you try to write too much at once in a pipe's buffer, the _PollableWritePipe sits there doing nothing (in a polling loop :-)).

test_stdio.StandardInputOutputTestCase.test_consumer conveniently tries to consume less than 4096 bytes, which is apparently the default buffer size of Windows anonymous pipes. With the attached patch growing the total transfer size in that test case, test_stdio.StandardInputOutputTestCase.test_consumer hangs here with Python 2.7 under 64-bit Windows 7.

Attachments

pollingfilebug.patch Download (454 bytes) - added by antoine 2 years ago.
pollingfilesolution.patch Download (1.3 KB) - added by antoine 2 years ago.
pollingfilesolution2.patch Download (2.3 KB) - added by antoine 2 years ago.
test_pollingfile-5365.patch Download (7.3 KB) - added by johnnypops 12 months ago.
pollingfile-pipe-buffer-size.patch Download (5.8 KB) - added by johnnypops 12 months ago.

Change History

Changed 2 years ago by antoine

Changed 2 years ago by antoine

1

  Changed 2 years ago by antoine

  • cc solipsis@… added
  • keywords review added

And here is a possible solution.

2

  Changed 2 years ago by itamar

Is the block size *always* 4096? Is there any way to introspect and figure it out? If we hardcode one number and it turns out to be another we're back to where we started.

3

  Changed 2 years ago by antoine

I honestly don't know. I got it through trial and error. However, it's not a problem if it's above 4096 (except that things might be slightly less efficient than ideal, but the whole _pollingfile thing is kind of a kludge anyway). The only problem is if it's smaller than 4096, but that would seem a really petty buffer size from Microsoft :)

4

  Changed 2 years ago by antoine

Oh, CreatePipe() has an optional buffer size parameter but:

The size is only a suggestion; the system uses the value to calculate an appropriate buffering mechanism. If this parameter is zero, the system uses the default buffer size.

 http://msdn.microsoft.com/en-us/library/windows/desktop/aa365152%28v=vs.85%29.aspx

5

  Changed 2 years ago by antoine

  • cc solipsis@… removed

Here is a new patch which enforces, as far as possible, the expected buffer size.

Changed 2 years ago by antoine

6

  Changed 2 years ago by exarkun

 http://msdn.microsoft.com/en-us/library/windows/desktop/aa365605(v=vs.85).aspx has this to say about the "wait mode":

A WriteFile operation is affected by the wait mode of a pipe handle when there is insufficient space in the pipe's buffer. With a blocking-wait handle, the write operation cannot succeed until sufficient space is created in the buffer by a thread reading from the other end of the pipe. With a nonblocking-wait handle, the write operation returns a nonzero value immediately, without writing any bytes (for a message-type pipe) or after writing as many bytes as the buffer holds (for a byte-type pipe).

Isn't this saying that partial writes do happen? Does the implementation differ from the documentation?

7

  Changed 2 years ago by antoine

Isn't this saying that partial writes do happen?

It says they do happen on non-blocking pipes, but Twisted uses blocking pipes here. Also, it only applies to named pipes, not anonymous pipes (which Twisted uses). Moreover, the following note seems to discourage against non-blocking named pipes: "The nonblocking-wait mode is supported for compatibility with Microsoft LAN Manager version 2.0. This mode should not be used to achieve overlapped input and output (I/O) with named pipes".

(overlapped I/O is quite more of a PITA than non-blockingness, BTW)

8

  Changed 2 years ago by exarkun

It says they do happen on non-blocking pipes, but Twisted uses blocking pipes here.

Ah. That sounds pretty insane. Previously, I had only looked at the pipes set up _dumbwin32proc.py, which are non-blocking. I forgot to look at the pipe set up code in _win32stdio.py which does indeed make blocking pipes. That seems like a bug to me, but perhaps it is that way due to Windows limitations? I don't really know.

Also, it only applies to named pipes, not anonymous pipes (which Twisted uses).

True. However,  http://msdn.microsoft.com/en-us/library/windows/desktop/aa365141(v=vs.85).aspx states:

Anonymous pipes are implemented using a named pipe with a unique name. Therefore, you can often pass a handle to an anonymous pipe to a function that requires a handle to a named pipe.

There are some explicitly documented restrictions, such as anonymous pipes not supporting overlapped operations.

Moreover, the following note seems to discourage against non-blocking named pipes: "The nonblocking-wait mode is supported for compatibility with Microsoft LAN Manager version 2.0. This mode should not be used to achieve overlapped input and output (I/O) with named pipes".

Possibly so, but I am not inclined to give regard to such recommendations. As long as the platform supports the functionality and the functionality is useful to us, we should use it. If we can achieve the desired objectives without it - great; otherwise I don't see what's gained by not using it.

Crummy MSDN documentation aside, it should be possible to figure out what behavior is actually supplied with some testing. The implications of your testing seem to be that WriteFile does not do partial writes on NOWAIT anonymous pipes. If that's the case then we'll just have to go forward based on that. It would be nice to see an isolated demonstration of that, independent of the extra complexity of _pollingfile (so if I ever actually try to review this ticket, rather than just commenting on it about my lack of understanding, I will probably try to construct one, if no one else does it first). It seems that it would also be useful to stop setting the NOWAIT flag in _dumbwin32proc.py if that observation withstands scrutiny, since that setting would then appear to be a misleading no-op.

My apologies for the volume of non-review comments on this ticket. Your efforts here are much appreciated.

9

  Changed 2 years ago by antoine

  • cc solipsis@… added

Previously, I had only looked at the pipes set up _dumbwin32proc.py, which are non-blocking

Ah, no, you are right. _pollingfile._PollableWritePipe.init does also set its pipe in non-blocking mode (PIPE_NOWAIT) mode, but that doesn't seem to allow partial writes.

That code was apparently written by Glyph in ticket #654.

10

  Changed 2 years ago by antoine

By the way, I just tried to use CreateNamedPipe() to set all flags manually, but it didn't make things better.

11

  Changed 2 years ago by zseil

I see the same behavior on 32bit Vista and XP:

>>> import win32pipe
>>> import win32file
>>> reader, writer = win32pipe.CreatePipe(None, 0)
>>> win32pipe.SetNamedPipeHandleState(writer, win32pipe.PIPE_NOWAIT, None, None)
>>> errcode, written = win32file.WriteFile(writer, 4097 * "a")
>>> errcode
0
>>> written
0

That is, the WriteFile succeeds without error, but no data gets written.

12

  Changed 2 years ago by zseil

  • status changed from new to assigned
  • owner set to zseil

13

  Changed 2 years ago by zseil

  • status changed from assigned to new
  • owner changed from zseil to antoine
  • keywords review removed

I played a bit with this and it seems that partial writes are only possible when the reader has an overlapped read request pending:

>>> import pywintypes
>>> import win32file
>>> import win32pipe
>>> import winerror
>>> reader = win32pipe.CreateNamedPipe(
...     "\\\\.\\pipe\\pollingfile-test",
...     win32pipe.PIPE_ACCESS_INBOUND | win32file.FILE_FLAG_OVERLAPPED,
...     win32pipe.PIPE_TYPE_BYTE | win32pipe.PIPE_WAIT,
...     1, 4096, 4096, 0, None)
>>> writer = win32file.CreateFile(
...     "\\\\.\\pipe\\pollingfile-test",
...     win32file.GENERIC_WRITE,
...     0, None,
...     win32file.OPEN_EXISTING,
...     0, None)
>>> win32pipe.SetNamedPipeHandleState(writer, win32pipe.PIPE_NOWAIT, None, None)
>>> win32file.WriteFile(writer, "a" * 4097)
(0, 0)
>>> buffer = win32file.AllocateReadBuffer(4096)
>>> overlapped = pywintypes.OVERLAPPED()
>>> win32file.ReadFile(reader, buffer, overlapped)[0] == winerror.ERROR_IO_PENDING
True
>>> win32file.WriteFile(writer, "a" * 8193)
(0, 4096)

Pipe buffers can also be very small:

>>> reader, writer = win32pipe.CreatePipe(None, 1)
>>> win32pipe.SetNamedPipeHandleState(writer, win32pipe.PIPE_NOWAIT, None, None)
>>> win32file.WriteFile(writer, "x" * 2)
(0, 0)
>>> >>> win32file.WriteFile(writer, "x" * 1)
(0, 1)

Based on this and because StandardIO should be usable even when the parent process is not using reactor.spawnProcess, I don't think that hard coding _PIPE_BUFFER_SIZE to 4096 bytes is suitable. Instead, you could use win32pipe.GetNamedPipeInfo()[1] to find the pipe's write buffer size at pollingfile._PollableWritePipe creation time.

Other review comments:

  1. Please include unit tests testing just the write and writeSequence changes in twisted.internet.test_pollingfile.
  2. Your change to writeSequence also fixed a couple of unrelated bugs. I filled #5413 for _PollableWritePipe refactoring, the tests for those unrelated bugs should be added at least in that ticket.
  3. Please add doscrings to the changed methods. A link to twisted.internet.interfaces.ITransport methods would probably suffice.
  4. Style nit: please use twistedNamingConvention for block_size.

I didn't commit your patch to a branch because I assume that you prefer basing your patches on trunk; let me know if you would prefer working against a branch (e.g. if you wish a buildbot run). Thanks for working on this!

14

  Changed 2 years ago by exarkun

Instead, you could use win32pipe.GetNamedPipeInfo()[1] to find the pipe's write buffer size at pollingfile._PollableWritePipe creation time.

Factoring the code so that it can be directly unit tested and then creating a pipe or two with various buffer sizes to test it with would probably be nice.

15

  Changed 12 months ago by johnnypops

  • keywords review added

Attached is a patch to test_pollingfile.py which demonstrates the bug. Warning: this test times-out.

I have an abstract.FileDescriptor based fix to _pollingfile.py which resolves this and 5 other related bugs.

Changed 12 months ago by johnnypops

16

  Changed 12 months ago by johnnypops

I did a series of tests on the above issue (Windows 7 and XP) and the behavior of WriteFile() when writing to a non-blocking pipe is that as long as you don't try to write a block of data larger than the pipe write buffer size, you're good.

I've attached a patch to _pollingfile, with a test, that uses win32pipe.GetNamedPipeInfo() to get the current pipe write buffer size and then ensures that a block of data larger than this limit is broken-up into multiple blocks and retried later.

Changed 12 months ago by johnnypops

17

follow-up: ↓ 18   Changed 12 months ago by itamar

Empirical testing is good; still, what does MSDN say?

18

in reply to: ↑ 17   Changed 12 months ago by johnnypops

Replying to itamar:

Empirical testing is good; still, what does MSDN say?

Well, unfortunately, as discussed above, there is very little information on using WriteFile() with pipes using the non-blocking wait mode. The closest thing I've found so far is this (also quoted above):

"A WriteFile operation is affected by the wait mode of a pipe handle when there is insufficient space in the pipe's buffer. With a blocking-wait handle, the write operation cannot succeed until sufficient space is created in the buffer by a thread reading from the other end of the pipe. With a nonblocking-wait handle, the write operation returns a nonzero value immediately, without writing any bytes (for a message-type pipe) or after writing as many bytes as the buffer holds (for a byte-type pipe)."  Named Pipe Type, Read, and Wait Modes

With the best will in the world it's 'difficult' to reconcile the actual behavior and the above documentation - the pipe acts like a message-type pipe, but is reported as a byte-type pipe by GetNamedPipeInfo().

In an ideal world we could inspect the Windows source-code and determine that the documentation is in fact wrong.

Other options are confirming the observed behavior on, say Win2K, XP, Vista, 7 and 8 (I don't have easy access to 8) and proceeding until the behavior changes, using blocking pipes with threads to do the waiting, trying to use WaitForMultipleObjects() or the overlapped IO system.

WaitForMultipleObjects() would be a possibility except that pipes are mysteriously missing from the list of handles supported.

For what it's worth, I would think that a change that at least improves the behavior of _pollingfile as far as can be determined with unit tests is worth a shot.

19

  Changed 12 months ago by johnnypops

I also found this documentation for WriteFile() related to writing to non-blocking pipes:

"When writing to a non-blocking, byte-mode pipe handle with insufficient buffer space, WriteFile returns TRUE with *lpNumberOfBytesWritten < nNumberOfBytesToWrite."  WriteFile function

Which is at least possible to interpret as consistent with the observed behavior: the function succeeds and the number of bytes written is less than the number of bytes to write - although the number of bytes written is zero in our case.

20

  Changed 12 months ago by johnnypops

Hi, I did some further reading and more experiments using CreateNamedPipe() and CreateFile() to make an equivalent to CreatePipe() but allowing me to play with various flags etc. That all turned out to be unimportant except that it made me read the documentation for CreateNamedPipe() - apparently the anonymous pipes returned by CreatePipe() can be regarded as named-pipes with machine generated names.

The main clue about pipe behavior is here:

"Whenever a pipe write operation occurs, the system first tries to charge the memory against the pipe write quota. If the remaining pipe write quota is enough to fulfill the request, the write operation completes immediately. If the remaining pipe write quota is too small to fulfill the request, the system will try to expand the buffers to accommodate the data using nonpaged pool reserved for the process. The write operation will block until the data is read from the pipe so that the additional buffer quota can be released. Therefore, if your specified buffer size is too small, the system will grow the buffer as needed, but the downside is that the operation will block. If the operation is overlapped, a system thread is blocked; otherwise, the application thread is blocked."  CreateNamedPipe function, Remarks section

So, for a pipe *without* the PIPE_NOWAIT flag you can write blocks of data less-than-or-equal to the buffer size without blocking, if there is room in the buffer (this depends on whether data is being read from the pipe at the same time). Eventually, if you have a slow reader or try and write a data block bigger than the buffer size, WriteFile() blocks for the reasons described above.

I suspect that when a pipe *with* the PIPE_NOWAIT flag either fills up or a write bigger than the buffer size is made, instead of blocking, it does the only thing it can do, which is fail to write anything and return zero bytes written.

I did various experiments which confirm this behavior, so I'm hoping we can get on and fix this issue?

21

  Changed 12 months ago by itamar

Thanks for the in-depth research! That all sounds reasonable, yes.

22

  Changed 12 months ago by johnnypops

  • owner antoine deleted

23

  Changed 11 months ago by tomprince

  • branch set to branches/win32-WriteFile-buffer-5365
  • branch_author set to tomprince

(In [38707]) Branching to win32-WriteFile-buffer-5365.

24

  Changed 11 months ago by tomprince

(In [38708]) Apply tests from pollingfile-pipe-buffer-size.patch from johnnypops.

Refs: #5365

25

  Changed 11 months ago by tomprince

(In [38709]) Apply code changes from pollingfile-pipe-buffer-size.patch from johnnypops.

Refs: #5365

26

  Changed 11 months ago by tom.prince

  • owner set to johnnypops
  • keywords review removed

Thanks for your contribution.

  1. I notice you are now ignoring the error from WriteFile. Is there a reason for ignoring the error?
    • Looking closer, it appears that original code was also ignoring it.
  2. (optional) twisted doesn't go out of its way to avoid builtin names (re: `buffer_)
  3. TestPollablePipes.test_pipeBufferSize
    1. The docstring for doesn't really describe what it is testing. (See #6301)
      • In particular, once the patch is applied, the existing implementation won't timeout.
    2. If you want to decrease the timeout on the test, you should set the timeout attribute of the method, rather than the object. Alternatively, pass the timeout to runReactor.
      • Is there actually a reason to have a timeout? Once the patch is applied, it shouldn't be triggering anyway.
    3. I'd be inclined to do something like "x" * 2048 to get a long message to send.
    4. The names r and w are unclear. In any case, it would probably be better to have named functions, rather than lambdas.
    5. Rather than using LineReceiver (which might, incidentally, run afoul of line length limits), consider using result.append (where result is a list) as the data callback. (I guess this would mean calling .close immediately after .write.
  4. `PipeRunner
    1. Is there a reason that PipeRunner is a subclass of _PollingTimer, rather than just a function that creates an instance of _PollingTimer?
    2. The docsting should have epytext markup for its arguments and return value. It should also explain what it does (in such a way that explains why it is useful to have it factored out).
    3. '*CB` doesn't follow the naming standards.
    4. Is there a reasno that PipeRunner creates inheritable handles?
    5. It appears that the pipes created by PipeRunner are never closed or otherwise disposed.

Please resubmit for review after addressing the above points. (And please base any future patches against the branch)

The changes do appear to make the tests pass: - before:  win7-64  winxp-32 - after:  win7-64  winxp-32

Note: See TracTickets for help on using tickets.