Opened 8 years ago

Last modified 8 years ago

#6694 defect new

twisted.web.http.Request.parseCookies accepts malformed cookies

Reported by: Richard Wall Owned by:
Priority: normal Milestone:
Component: web Keywords:
Cc: jknight Branch:
Author:

Description

t.w.Request.parseCookies seems inadequate in various ways

  1. Allows empty keys " = "
  2. Allows spaces in values "foo = bar baz; "
  3. Misinterprets quoted keys '"foo=bar";baz=qux'

RFC6265 section-4.1.1 spells out exactly what should and should not be allowed in cookie headers and in the individual cookies.

4.1.1. Syntax


   Informally, the Set-Cookie response header contains the header name
   "Set-Cookie" followed by a ":" and a cookie.  Each cookie begins with
   a name-value-pair, followed by zero or more attribute-value pairs.
   Servers SHOULD NOT send Set-Cookie headers that fail to conform to
   the following grammar:

...

 cookie-pair       = cookie-name "=" cookie-value
 cookie-name       = token
 cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
 cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash
 token             = <token, defined in [RFC2616], Section 2.2>

...

The Cookie and cookielib modules seem to have much better cookie parsing.

We should just re-use those.

Definition: cookielib.split_header_words(header_values)
Docstring:
Parse header values into a list of lists containing key,value pairs.

The function knows how to deal with ",", ";" and "=" as well as quoted
values after "=".  A list of space separated tokens are parsed as if they
were separated by ";".

If the header_values passed as argument contains multiple values, then they
are treated as if they were a single value separated by comma ",".

This means that this function is useful for parsing header fields that
follow this syntax (BNF as from the HTTP/1.1 specification, but we relax
the requirement for tokens).
In [36]: cookielib.split_header_words([b'foo=bar; baz=qux; = ;'])
Out[36]: [[('foo', 'bar'), ('baz', 'qux')]]

Although not perfect...

In [37]: cookielib.split_header_words([b'"abc "="def"'])
Out[37]: [[('"abc', None), ('"', 'def')]]

Alterntively there's the Cookie module.

In [38]: Cookie.SmartCookie(input=b'foo=bar; baz=qux; = ;')
Out[38]: <SmartCookie: baz='qux' foo='bar'>

This ticket was raised while working on #6119

Change History (1)

comment:1 Changed 8 years ago by DefaultCC Plugin

Cc: jknight added
Note: See TracTickets for help on using tickets.