Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#4814 defect closed duplicate (duplicate)

HTTPClient doesn't handle servers that use \n separators instead of \r\n

Reported by: Jason J. W. Williams Owned by:
Priority: normal Milestone:
Component: web Keywords:
Cc: Branch:
Author:

Description

t.w.h.HTTPClient just drops content if the sending server uses LF to terminate headers and lines instead of CRLF. It's broken behavior to issue LF only, but notable sites like http://news.ycombinator.com do it. The problem is exacerbated when you're using ProxyClient and some sites come up blank. Most proxies handle this busted behavior, and all browsers appear to be tolerant of it.

The actual issue is in the underlying LineReceiver, but since it's mostly an HTTP issue it seems like HTTPClient is the place to deal with it.

The attached patch gives HTTPClient it's own dataReceived method that splits on LF if it can't find CRLFs in the response. Patch includes tests for this behavior.

Attachments (1)

lf_only_http.patch (8.9 KB) - added by Jason J. W. Williams 7 years ago.
Patch to make HTTPClient tolerant of busted HTTP servers that use LF instead of CRLF for line endings.

Download all attachments as: .zip

Change History (7)

comment:1 Changed 7 years ago by Jason J. W. Williams

Owner: jknight deleted

Changed 7 years ago by Jason J. W. Williams

Attachment: lf_only_http.patch added

Patch to make HTTPClient tolerant of busted HTTP servers that use LF instead of CRLF for line endings.

comment:2 Changed 7 years ago by jknight

Resolution: duplicate
Status: newclosed

Duplicate of #2842. Also, how the hell is ycombinator still not speaking HTTP? I mean come on, seriously?

comment:3 Changed 7 years ago by Jason J. W. Williams

Resolution: duplicate
Status: closedreopened

I dont think this is a duplicate of #2842. That deals with : separators in the headers when you're already parsing lines. Without this patch you don't even get to lineReceived.

comment:4 Changed 7 years ago by Jean-Paul Calderone

Keywords: review removed
Resolution: duplicate
Status: reopenedclosed

We couldn't find any web servers that give out bad : separators, so #2842 is just about newlines now.

comment:5 Changed 7 years ago by Jason J. W. Williams

OK. So my patch will be considered for fixing that problem under #2842?

comment:6 in reply to:  2 Changed 7 years ago by Jason J. W. Williams

Replying to jknight:

Duplicate of #2842. Also, how the hell is ycombinator still not speaking HTTP? I mean come on, seriously?

Completely agree. Separate issue with YC/HN, but when you're proxying to them for login, they don't recognize content-length only Content-Length. Yes Virginia they're case-sensitive on their header recognition. Trying to figure out a workable solution for this.

Note: See TracTickets for help on using tickets.