Opened 14 months ago

Last modified 13 months ago

#9739 defect new

Multipart returns 400 Bad Request on Python 3.7+

Reported by: Areeb Jamal Owned by:
Priority: high Milestone:
Component: web Keywords: multipart, python3.7, bad request
Cc: Branch:


Related: Pull Request:

Downstream Issue:


Content-Disposition: form-data; name="content"
Content-Type: multipart/form-data; charset=utf-8



Content-Type: multipart/form-data; boundary=d66f495a-c4d1-487c-9277-9ab1a20001cc

Twisted returns 400 Bad Request on Python 3.7+, but works fine on Python 3.6 and earlier.

In my brief debugging, Twisted tried to parse an empty line b'' as a valid boundary and raised an exception, thus returning 400 Bad Request. That doesn't happen on Python 3.6 and earlier.

This is where the exception is raised:

Hence, the problem possibly originates from this commit -

As a result, we are capped at Python 3.6 and can't upgrade to 3.7+.

Anyone using a library that depends on Twisted (such as Django Channels) is affected.

Change History (4)

comment:1 Changed 13 months ago by Tom Most

comment:2 Changed 13 months ago by Tom Most

In Python 3.7 cgi.parse_multipart() was changed to use FieldStorage internally:

This is what prompted the changes in

I can reproduce this without any Twisted code:

Python 3.7.5 (default, Nov  7 2019, 10:50:52) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cgi
>>> payload = """--d66f495a-c4d1-487c-9277-9ab1a20001cc
... Content-Disposition: form-data; name=\"content\"
... Content-Type: multipart/form-data; charset=utf-8
... Hello World
... --d66f495a-c4d1-487c-9277-9ab1a20001cc--"""
>>> from io import BytesIO
>>> content = BytesIO(payload.encode())
>>> pdict = {'boundary': b'd66f495a-c4d1-487c-9277-9ab1a20001cc', 'CONTENT-LENGTH': len(payload)}
>>> cgi.parse_multipart(content, pdict, encoding='utf8', errors='surrogateescape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/", line 222, in parse_multipart
    environ={'REQUEST_METHOD': 'POST'})
  File "/usr/lib/python3.7/", line 489, in __init__
    self.read_multi(environ, keep_blank_values, strict_parsing)
  File "/usr/lib/python3.7/", line 666, in read_multi
    self.encoding, self.errors, max_num_fields)
  File "/usr/lib/python3.7/", line 489, in __init__
    self.read_multi(environ, keep_blank_values, strict_parsing)
  File "/usr/lib/python3.7/", line 616, in read_multi
    raise ValueError('Invalid boundary in multipart form: %r' % (ib,))
ValueError: Invalid boundary in multipart form: b''
Last edited 13 months ago by Tom Most

comment:3 Changed 13 months ago by Tom Most

This header from your example doesn't look valid to me:

Content-Type: multipart/form-data; charset=utf-8

It is missing the boundary parameter, which is required. This explains the b''. Also note that charset isn't defined for multipart/form-data (see that RFC; charsets in form data are weird).
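
To illustrate the missing parameter, here is a minimal sketch that uses the standard library's email.message.Message to read the boundary from each of the two Content-Type values quoted in this ticket. The helper name boundary_of is hypothetical and is not part of Twisted or cgi; it only shows that the part header carries no boundary while the request header does.

```python
from email.message import Message


def boundary_of(content_type):
    """Return the boundary parameter of a Content-Type value, or None.

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("boundary")


# Content-Type of the part from the report: no boundary parameter.
part_boundary = boundary_of("multipart/form-data; charset=utf-8")

# Content-Type of the request itself: the boundary is present.
request_boundary = boundary_of(
    "multipart/form-data; boundary=d66f495a-c4d1-487c-9277-9ab1a20001cc"
)

print(part_boundary)
print(request_boundary)
```

A parser that treats the part as nested multipart has nothing to split on in the first case, which would be consistent with the empty boundary in the traceback.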

I'd guess that the behavioral change is due to bpo-29979. One aspect of the "consistency" it achieves is support for nested multipart, whereas the old implementation ignored the Content-Type of chunks as far as I can see.

What is generating this malformed input?

comment:4 Changed 13 months ago by Areeb Jamal


However, the request header does have a boundary; only the parts inside the multipart body don't. I'm not sure whether it is valid for part headers to omit the boundary.
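
For comparison, a sketch of a payload in which only the request header declares the boundary and the part is labelled with a non-multipart type. This assumes the part is meant to carry plain text, so text/plain is a guess at what the client intended. The request is parsed with the standard library's email package rather than with cgi or Twisted, purely to show the structure.

```python
from email import policy
from email.parser import BytesParser

BOUNDARY = "d66f495a-c4d1-487c-9277-9ab1a20001cc"

# Body with a single part. The part's Content-Type is text/plain
# (an assumption), so it needs no boundary parameter of its own.
body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="content"\r\n'
    "Content-Type: text/plain; charset=utf-8\r\n"
    "\r\n"
    "Hello World\r\n"
    f"--{BOUNDARY}--\r\n"
).encode()

# Only the outer (request) header carries the boundary.
request_header = (
    f"Content-Type: multipart/form-data; boundary={BOUNDARY}\r\n\r\n"
).encode()

msg = BytesParser(policy=policy.default).parsebytes(request_header + body)

# Map each part's form field name to its decoded text content.
fields = {
    part.get_param("name", header="content-disposition"): part.get_content()
    for part in msg.iter_parts()
}

print(fields)
```

Here the part omits the boundary without any problem because it is not itself a multipart entity; the parameter is only needed on a header whose type is multipart/*.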
