[Twisted-Python] HTML should not be baked into twisted.web HTTP implementation

exarkun at twistedmatrix.com exarkun at twistedmatrix.com
Wed Dec 9 12:34:34 EST 2009

On 04:52 pm, jared.gisin at isilon.com wrote:
>I'm writing an HTTP server that exposes various resources as an API.
>Unless I'm missing something, twisted's HTTP protocol implementation is
>in twisted.web.http.
>The problem with this package is that it's inextricably wrapped up in
>HTML. HTML has nothing to do with HTTP as a whole. Sure, HTML is often
>what HTTP requests return, but there's no reason why it should, nor is
>there any RFC that says it should. An HTTP request can return anything.

It's not inextricable.  A few relatively simple patches would probably 
be sufficient to extricate the HTML from the HTTP. :)

The reason for these things being mixed up is that it made sense at the 
time and provided reasonable behavior for some actual use-cases.  That 
doesn't mean the behavior is right, but hopefully it should be clear why 
it was implemented.  The reason it hasn't been changed is only that no 
one who has been bothered enough by it has come along to change it.

>The software I'm writing is a programmatic interface. One never uses a
>web browser, so things such as displaying tracebacks
>(twisted.web.util.formatFailure) in HTML format are completely wrong.
>When implementing HTTP, why assume the client always wants HTML? It's
>completely wrong for these modules and libraries to be so full of HTML
>output. HTML output should be provided as a separate config or option
>for twisted.web. In this case, why not just dump the traceback directly
>to the HTTP entity-body? As a consumer of twisted.web, I should not
>have to battle with the hard-coded HTML output of this library. If I
>want the library to dump things in HTML output, I should have an option
>to tell it to do that (and I should be able to better customize the
>HTML), but I should not get HTML by default.

I agree.  The first step to take is probably to identify the precise 
places in the code where HTML is being generated and emitted (this may 
just be Request.processingFailed, but I haven't looked around for others 
lately) where one might not want to deal with HTML.  The next step would 
be to file a ticket enumerating these.  After that, a patch which allows 
these behaviors to be overridden can be submitted, reviewed(, revised, 
submitted, ...), and applied.

This will be valuable even for people who like HTML, since it will let 
them customize the HTML to suit their preferences.
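
As a rough illustration of the kind of override discussed above, here is a 
minimal sketch of a mixin that replaces the HTML error page with a plain-text 
traceback.  The class name PlainTextFailureMixin is made up for this example, 
and it only assumes the request object offers setResponseCode, setHeader, 
write and finish, and that the failure offers getTraceback; it is not a patch 
against twisted.web itself, and unlike the stock behavior it does not log the 
failure.

```python
class PlainTextFailureMixin:
    """Hypothetical mixin: report processing failures as text/plain.

    Assumes the class it is mixed into provides setResponseCode(),
    setHeader(), write() and finish(), as a twisted.web request does.
    """

    def processingFailed(self, reason):
        # `reason` is assumed to be Failure-like: getTraceback() returns
        # the formatted traceback as a string.
        body = reason.getTraceback().encode("utf-8")

        # 500 Internal Server Error, with an entity-body that is plain
        # text rather than an HTML page.
        self.setResponseCode(500)
        self.setHeader(b"content-type", b"text/plain; charset=utf-8")
        self.setHeader(b"content-length", str(len(body)).encode("ascii"))
        self.write(body)
        self.finish()
        return reason
```

To try this against a real server, one would presumably combine the mixin with 
twisted.web.server.Request in a subclass and assign that subclass to the 
Site's requestFactory attribute; that wiring is an assumption here and has not 
been tested.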
