Opened 16 years ago

Closed 11 years ago

#1713 defect closed wontfix (wontfix)

Web2 test failure on win32 (SSLServerTest.testLingeringClose)

Reported by: teratorn Owned by:
Priority: normal Milestone:
Component: web2 Keywords: win32
Cc: teratorn, Trent.Nelson, Thijs Triemstra Branch: branches/testLingeringClose-win32-1713
Author:

Description (last modified by Jean-Paul Calderone)

The test twisted.web2.test.test_http.SSLServerTest.testLingeringClose fails reliably on win32 with:

===============================================================================
[TODO]: twisted.web2.test.test_http.SSLServerTest.testLingeringClose

Reason: 'buffering kills the connection too early; test this some other way'
Failure: twisted.trial.unittest.FailTest: Error output:
>> Making ssl request to port 3021
>> Sending lots of data
Traceback (most recent call last):
  File "c:\buildslave\win32-select\W32-full2.4-select\Twisted\twisted\web2\test\simple_client.py", line 27, in ?
    send("X"*1000000)
socket.error: (10054, 'Connection reset by peer')
===============================================================================

The test case runs simple_client.py as a separate process to write data to the server. It currently writes 1,000,000 bytes. Reducing that to 10,000 bytes, for example, makes the test pass. It seems likely that the OS is closing the connection too soon because buffers are filling up.

First question: what exactly is this test case supposed to test? It seems very similar to testBasicWorkingness in many ways.

Foom, I'm assigning this to you because it seems you wrote the original version of this test. Any ideas? Should we CC anyone else?

Change History (16)

comment:2 Changed 16 years ago by Jean-Paul Calderone

Description: modified (diff)

comment:3 Changed 16 years ago by jknight

Owner: changed from foom to jknight

This test is supposed to verify that the exact failure seen here doesn't happen. A "stupid" client connects to a server, which sends a response and closes the connection before reading all the input. Thus, the client is still trying to write data to the server.

The server ought not cause the client to fail with an error, as doing so will prevent the client from ever getting the response that the server sent. The server does that by doing a half close of its send side, and then reading and simply discarding data from the client. As a safety measure, the server has a timeout of 20 seconds on that state so if the client sends a crapload of data, it gets disconnected anyhow.
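For illustration only, the half close and discard sequence described above can be sketched with plain TCP sockets. This is not web2's actual code and it does not involve SSL; the names `lingering_close`, `demo`, `RESPONSE` and `LINGER_TIMEOUT` are made up for the sketch, and the 20 second value is simply the timeout mentioned above.

```python
import socket
import threading
import time

RESPONSE = b"HTTP/1.1 200 OK\r\n\r\n"   # hypothetical response for the sketch
LINGER_TIMEOUT = 20.0                    # seconds, as described above


def lingering_close(conn, response, timeout=LINGER_TIMEOUT):
    """Send the response, half-close the send side, then discard input.

    Returns the number of bytes that were read and thrown away.
    """
    conn.sendall(response)
    conn.shutdown(socket.SHUT_WR)        # client sees EOF after the response
    deadline = time.monotonic() + timeout
    discarded = 0
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                    # client kept sending for too long
            conn.settimeout(remaining)
            try:
                chunk = conn.recv(65536)
            except (socket.timeout, ConnectionError):
                break
            if not chunk:
                break                    # client finished writing
            discarded += len(chunk)
    finally:
        conn.close()
    return discarded


def demo(payload_size=1000000):
    """Run a "stupid" client against the sketch over loopback."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))      # any free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}

    def serve():
        conn, _ = listener.accept()
        result["discarded"] = lingering_close(conn, RESPONSE)

    server = threading.Thread(target=serve)
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"X" * payload_size)  # keeps writing after the response
    client.shutdown(socket.SHUT_WR)
    received = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        received += chunk
    client.close()
    server.join()
    listener.close()
    return received, result["discarded"]


if __name__ == "__main__":
    received, discarded = demo()
    print(len(received), discarded)
```

If the server closed the socket outright instead of draining it, the client's `sendall` could fail with a reset before it ever read the response, which is the symptom reported in this ticket.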

That the client gets a send error on windows means that either windows is really slow or that something has gone wrong with this process such that the server isn't actually reading more data from the client. I can help out, but someone else is going to have to track down why this process doesn't work on windows.

comment:4 Changed 16 years ago by teratorn

Thanks, I'm looking into it. Btw, does anyone know of a program that will capture loopback network traffic on Windows?

comment:5 Changed 16 years ago by Glyph

Owner: changed from jknight to teratorn

comment:6 Changed 16 years ago by Glyph

You cannot half-close a connection using SSL. I am guessing that half-close over SSL only happens to work by accident on UNIX, and that OpenSSL is actually behaving properly (per the spec) on Windows.

From the most recent TLS RFC, section 7.2.1:

Unless some other fatal alert has been transmitted, each party is required to send a close_notify alert before closing the write side of the connection. The other party MUST respond with a close_notify alert of its own and close down the connection immediately, discarding any pending writes. It is not required for the initiator of the close to wait for the responding close_notify alert before closing the read side of the connection.

I believe what we are seeing is OpenSSL's mechanism for reporting that it had to (as per the standard) close the connection before its write completed.

This is arguably a defect in the TLS spec, but it is there, and we aren't going to implement our own TLS layer anyway, so we're sort of constrained by OpenSSL's implementation decisions. I think that we should drop this feature, over TLS transports anyway: half-close is a broken corner-case in HTTP, and should never be used in new protocols.

comment:7 Changed 16 years ago by teratorn

Good info Glyph, thanks. I've gotten busier lately, but I'll try to do something with this sometime soonish.

comment:8 Changed 16 years ago by jknight

Is the Windows build perhaps using a different version of OpenSSL? I have a hard time believing this behavior would differ between OpenSSL on Windows and on Unix.

Also, I disagree that half close is a broken corner case. It's necessary to support in some form because of synchronization issues between hosts. The reason it's supported at an API level for TCP is because it is necessary to support it at a protocol level, so might as well expose it. It's also necessary to support at a protocol level for SSL, and in fact it is supported. That the RFC says it shouldn't be supported at the API level is pretty insane if you ask me.

comment:9 Changed 16 years ago by Jean-Paul Calderone

Both the Python 2.4 buildslave and the win32 buildslave have OpenSSL 0.9.8 and PyOpenSSL 0.6.

comment:10 Changed 16 years ago by teratorn

Status: changed from new to assigned

comment:11 Changed 15 years ago by teratorn

Owner: teratorn deleted
Status: changed from assigned to new

comment:12 Changed 14 years ago by Trent.Nelson

Cc: Trent.Nelson added

comment:13 Changed 12 years ago by Thijs Triemstra

Author: teratorn
Branch: branches/testLingeringClose-win32-1713
Cc: Thijs Triemstra added

comment:14 Changed 12 years ago by teratorn

Author: teratorn

Since I just noticed the activity on this ticket it's worth updating from my perspective.... I gave up a long time ago on figuring this behavior out... I must have studied the code, pored over my systems with as many Sysinternals tools as I could, for several days on end... I just hate to give up without making the slightest bit of progress... but I just never got _anywhere_ with it. Post any problems you encounter, as it's possible I might be struck by some revelation, but otherwise good luck to whoever attempts to resolve this. (thijs?)

One problem I had was I could never find a good enough tool that would let me capture traffic on the loopback interface on Windows... I remember feeling that this would have been a great help...

comment:15 Changed 11 years ago by <automation>

comment:16 Changed 11 years ago by washort

Resolution: wontfix
Status: changed from new to closed