Opened 2 years ago

Closed 23 months ago

#6110 enhancement closed fixed (fixed)

Port twisted.web.http to Python 3

Reported by: exarkun Owned by: exarkun
Priority: normal Milestone: Python 3.3 Minimal
Component: web Keywords:
Cc: jknight Branch: branches/http-py3-6110-3
(diff, github, buildbot, log)
Author: exarkun Launchpad Bug:

Description

Unfortunately, twisted.web.client relies on the Twisted Web server for a great deal of its unit tests, therefore the server needs to be ported before the client can be ported.

Change History (17)

comment:1 Changed 2 years ago by DefaultCC Plugin

  • Cc jknight added

comment:2 Changed 2 years ago by exarkun

  • Author set to exarkun
  • Branch set to branches/http-py3-6110

(In [36161]) Branching to 'http-py3-6110'

comment:3 Changed 2 years ago by exarkun

  • Branch changed from branches/http-py3-6110 to branches/http-py3-6110-2

(In [36207]) Branching to 'http-py3-6110-2'

comment:4 Changed 2 years ago by exarkun

  • Keywords review added
  • Owner changed from exarkun to itamar

Unfortunately twisted/web/http.py has rather poor test coverage. It's also a really big module. If we can get away with it, I'd like to say that it's "ported enough" for the minimal milestone now - twisted/web/test/test_http.py is fully ported and passes in the branch, so at least the parts of http.py that it covers are ported and working. I don't actually know if this is sufficient or not, but I hope that it is, considering that what we really need for the minimal milestone is twisted.web.client.getPage.

I filed a bunch of tickets for the missing test coverage and I added twisted.web.http to the partially ported modules list rather than the completely ported modules list.

Tests pass on Python 2, but I've run into a problem on Python 3 that I don't understand. Both trunk and this branch fail to run any tests on Python 3 because:

Traceback (most recent call last):
  File "admin/run-python3-tests", line 25, in <module>
    unittest.main(module=None, argv=["run-python3-tests", "-v"] + testModules)
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/main.py", line 124, in __init__
    self.parseArgs(argv)
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/main.py", line 168, in parseArgs
    self.createTests()
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/main.py", line 175, in createTests
    self.module)
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/loader.py", line 137, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/loader.py", line 137, in <listcomp>
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/home/exarkun/Projects/cpython/3.3/Lib/unittest/loader.py", line 96, in loadTestsFromName
    module = __import__('.'.join(parts_copy))
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/test/test_default.py", line 13, in <module>
    from twisted.internet import default
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/default.py", line 56, in <module>
    install = _getInstallFunction(platform)
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/default.py", line 44, in _getInstallFunction
    from twisted.internet.epollreactor import install
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/epollreactor.py", line 23, in <module>
    from twisted.internet import posixbase
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/posixbase.py", line 24, in <module>
    from twisted.internet import error, udp, tcp
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/tcp.py", line 29, in <module>
    from twisted.internet._newtls import (
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/internet/_newtls.py", line 20, in <module>
    from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/protocols/tls.py", line 53, in <module>
    from twisted.python.reflect import safe_str
  File "/home/exarkun/Projects/Twisted/branches/http-py3-6110-2/twisted/python/reflect.py", line 517
    print 'unknown type', type(start), start
                       ^
SyntaxError: invalid syntax

This seems to be due to the reactor merge? Or tcp? Maybe it's fixed by the tls branch? I don't know, I'm really confused (also because buildbot is green).

Anyhow, build results.

comment:5 Changed 2 years ago by itamar

I bet you installed pyopenssl. SSL modules are imported if pyopenssl is available, which leads to problems like this (probably you wouldn't have this issue if my tls branch was merged.) Note that pyopenssl install on buildslave is broken enough (built against py3.2) that it's effectively unimportable, which is why buildbot is green (Barry will fix this as soon as 13.04 branch or whatever is available, unless we ask him to try to do it sooner - not sure what calendar date that implies).

Back at work tomorrow, when I will review.

comment:6 Changed 2 years ago by exarkun

I bet you installed pyopenssl.

Ugh, yes. Thanks, that's a big help.

comment:7 Changed 2 years ago by itamar

  • Status changed from new to assigned

comment:8 Changed 2 years ago by itamar

  • Keywords review removed
  • Owner changed from itamar to exarkun
  • Status changed from assigned to new
  1. loopback has been ported in the interim, so no need to skip the loopback-using test.
  2. News file.
  3. What's with the commented out persistent HTTP test (line 312 in the tests)? If you uncomment that line and fix it to use _prequest does it work? If so maybe we should enable it, it looks like we ought to support that just fine.
  4. Add a comment to _twistedpython3.py explaining why it's not fully ported (lack of test coverage).
  5. testMissingContentDisposition skip message is missing the twisted ticket number.
  6. The docstring for QueryArgumentsTestCase.test_urlparse is inaccurate for Python 3. I'm also unhappy with the way it's testing on Python 2; on Python 2 urlparse could return unicode and tests would still pass, which seems wrong.
  7. RequestsTests still has a number of tests where headers are native string rather than bytes, starting at test_getHeader.
  8. test_setHost* are setting a native string host, rather than bytes, and checking native string headers. This reflects some missing string to byte conversions in http.py.
  9. Should we enforce byteness of header keys and values? It seems like we should, this'd catch a whole host of problems.
  10. ClientStatusParsing tests are still unicode.
  11. MultilineHeadersTestCase still has lots of native string usage (rather than bytes) in various places.
  12. In http.py, using native string still, rather than bytes (I'm assuming headers are always bytes). Possibly some of these were intentional due to lack of tests and/or lack of usage by the target application, but it seems like we may as well try to be complete. I also expect porting other test modules would catch at least some of these.
    1. parseContentRange
    2. StringTransport
    3. HTTPClient
    4. Request._sendError
    5. Request.finish
    6. Request.write (the etag)
    7. Request.addCookie
    8. Request.setLastModified
    9. Request.redirect
    10. Request.setETag
    11. Request.getRequestHostname
    12. Request.setHost
    13. Request._authorize (the default values it sets for user and password)
    14. HTTPChannel.__header default empty value in a few places (class definition, also line 1566)
    15. \n in line 1573
  13. I suspect the http factory logging is mixing strings and bytes badly in the log method and its functional dependencies.
  14. http.py docstrings need updating to say bytes:
    1. Request.getHeader
    2. Request.setHeader
    3. Request.getUser
    4. Request.getPassword
    5. Request.checkPersistence

comment:9 follow-up: Changed 23 months ago by exarkun

  • Keywords review added
  • Owner changed from exarkun to itamar

loopback has been ported in the interim, so no need to skip the loopback-using test.

Unskipped in r36250. Getting them to pass required a lot of changes, so lots of other fixes in that revision as well.

News file.

Done in r36295

What's with the commented out persistent HTTP test (line 312 in the tests)? If you uncomment that line and fix it to use _prequest does it work? If so maybe we should enable it, it looks like we ought to support that just fine.

Dunno. It fails. I unintentionally deleted it, but put it back (commented out) in r36296. I filed #6167.

Add a comment to _twistedpython3.py explaining why it's not fully ported (lack of test coverage).

Done in r36297.

testMissingContentDisposition skip message is missing the twisted ticket number.

Done in r36298.

The docstring for QueryArgumentsTestCase.test_urlparse is inaccurate for Python 3. I'm also unhappy with the way it's testing on Python 2; on Python 2 urlparse could return unicode and tests would still pass, which seems wrong.

Fixed the docstring in r36298. Can you clarify the rest of this point? It looks like the test handles that case well enough, by checking for unicode coming back from the stdlib implementation and doing the encoding and also asserting that all the values returned by our implementation are instances of bytes.

RequestsTests still has a number of tests where headers are native string rather than bytes, starting at test_getHeader.

Some of that fixed in the first revision I linked to above, the rest in r36300.

test_setHost* are setting a native string host, rather than bytes, and checking native string headers. This reflects some missing string to byte conversions in http.py.

Indeed. Fixed in r36301.

Should we enforce byteness of header keys and values? It seems like we should, this'd catch a whole host of problems.

I guess probably? Maybe as a separate ticket though?

ClientStatusParsing tests are still unicode.

Fixed in r36302.

MultilineHeadersTestCase still has lots of native string usage (rather than bytes) in various places.

Fixed earlier, in the first revision linked.

In http.py, using native string still, rather than bytes (I'm assuming headers are always bytes). Possibly some of these were intentional due to lack of tests and/or lack of usage by the target application, but it seems like we may as well try to be complete. I also expect porting other test modules would catch at least some of these.
I suspect the http factory logging is mixing strings and bytes badly in the log method and its functional dependencies.

I'll file some tickets for porting some more test modules, then? But I filed a lot of tickets for improving test coverage of http.py already. Writing those tests and making them pass would be the best solution (it's just not all that clear that we want to do that work as part of this milestone).

http.py docstrings need updating to say bytes:

Done in r36303.

One further change to fix a couple miscellaneous issues, r36304.

Latest build results.

comment:10 in reply to: ↑ 9 Changed 23 months ago by itamar

  • Keywords review removed
  • Owner changed from itamar to exarkun

Replying to exarkun:

The docstring for QueryArgumentsTestCase.test_urlparse is inaccurate for Python 3. I'm also unhappy with the way it's testing on Python 2; on Python 2 urlparse could return unicode and tests would still pass, which seems wrong.

Fixed the docstring in r36298. Can you clarify the rest of this point?

Sorry, I think I misread something. The test looks fine as is.

Should we enforce byteness of header keys and values? It seems like we should, this'd catch a whole host of problems.

I guess probably? Maybe as a separate ticket though?

OK, please file a ticket then.

I'll file some tickets for porting some more test modules, then? But I filed a lot of tickets for improving test coverage of http.py already. Writing those tests and making them pass would be the best solution (it's just not all that clear that we want to do that work as part of this milestone).

Well, you'll be porting twisted.web.server and twisted.web.client anyway, right? Let's see how far that gets you. I would suggest when you do those that you add b prefix to all strings for server and client code respectively in twisted.web.http, at least in the code paths that all requests are likely to go through. E.g. there's a bunch of places in HTTPClient which are essential to it working right. I assume the additional test modules you'll be doing anyway (e.g. test_client) will catch most of these, but just in case they don't...

File the ticket mentioned above, then merge.

comment:11 Changed 23 months ago by exarkun

Filed #6168 for header keys/values.

comment:12 Changed 23 months ago by exarkun

  • Branch changed from branches/http-py3-6110-2 to branches/http-py3-6110-3

(In [36310]) Branching to 'http-py3-6110-3'

comment:13 Changed 23 months ago by exarkun

(In [36312]) Fix test method name to comply with naming convention

refs #6110

comment:14 Changed 23 months ago by exarkun

(In [36313]) Make the test data more realistic - chunked Transfer-Encoding was introduced in HTTP 1.1

refs #6110

comment:15 Changed 23 months ago by exarkun

(In [36314]) The implementation also closes the connection: add an assertion related to that.

refs #6110

comment:16 Changed 23 months ago by exarkun

(In [36315]) Port the test and functionality to Python 3

refs #6110

comment:17 Changed 23 months ago by exarkun

  • Resolution set to fixed
  • Status changed from new to closed

(In [36316]) Merge Twisted http-py3-6110-3

Author: exarkun
Reviewer: itamar
Fixes: #6110

Port the tested parts of twisted.web.http to Python 3.

Note: See TracTickets for help on using tickets.