[Twisted-Python] treq POST aborting with: err: ('Could not adapt', '{"....", "..."} <InterfaceClass twisted.web.iweb.IBodyProducer>)

Glyph Lefkowitz glyph at twistedmatrix.com
Sat Jan 7 16:57:15 MST 2017


> On Jan 7, 2017, at 4:00 AM, Cory Benfield <cory at lukasa.co.uk> wrote:
> 
> 
>> On 7 Jan 2017, at 02:18, Tristan Seligmann <mithrandi at mithrandi.net> wrote:
>> 
>> On Sat, 7 Jan 2017 at 03:23 Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
>> 
>> Maybe we should support unicode for the body as well.  We can set the charset in the mime-type and everything so that it will be properly intelligible by the server, which doesn't happen if the user manually encodes like this.
>> 
>> Oh, forgot to comment on this point; in the specific case of JSON, it isn't necessary to specify UTF-8 in Content-Type[1], but for HTML or XML it's a pretty good idea. However, I'm not sure if it's possible to modify Content-Type in a generic fashion to make this sort of thing work; for example, "Content-Type: application/octet-stream; charset=UTF-8" is nonsense. I'll defer to some HTTP experts here ;)
> 
> This is really not simple, for the reason that many MIME types do not define a charset extension. In the case of JSON, it’s not just not necessary to specify UTF-8 in Content-Type, but the standard explicitly does not define charset for the JSON content type[0]:

I see your standards pedantry, and raise you!

MIME defines content-type to always have a charset:

https://tools.ietf.org/html/rfc1521#section-4

>>> Among the defined parameters is a "charset" parameter by which the character set used in the body may be declared.

and from the spec you're citing,

>>> Adding one really has no effect on compliant recipients


Given that we'd always choose utf-8 anyway, it would be fine.

>> Note:  No "charset" parameter is defined for this registration. Adding one really has no effect on compliant recipients.
> 
> Strictly, a completely compliant implementation would not emit charset details for content types that have no charset registration. Such a thing is pretty tricky to do. Knowing that, it’s probably best to YOLO your way through, or forbid unicode in bodies.

A bigger part of the problem here is that treq has no way of knowing that the characters you're sending are JSON at all, let alone what encoding you want them in.  So if we actually want to do MIME stuff, we'd probably want to have treq be the layer to call json.dumps on the dict anyway, so that it knows what it's dealing with.  Automatic treatment of unicode would likely set the type to text/plain; charset=utf-8.
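
To make that concrete, here is a rough sketch (Python 3) of the kind of dispatch I mean. encode_body is a hypothetical helper, not anything treq actually provides, and the content types it picks are just the defaults discussed in this thread: text gets an explicit charset, JSON gets none because the registration doesn't define one, and bytes are passed through untouched since we can't know what they are.

```python
import json


def encode_body(body):
    """Hypothetical helper: return (payload_bytes, content_type) for a body.

    Not part of treq; it only illustrates choosing an encoding and a
    Content-Type based on what the caller handed us.
    """
    if isinstance(body, bytes):
        # Already encoded by the caller; we cannot guess a charset or type.
        return body, "application/octet-stream"
    if isinstance(body, str):
        # Plain text: always encode as UTF-8 and say so in the header.
        return body.encode("utf-8"), "text/plain; charset=utf-8"
    if isinstance(body, (dict, list)):
        # We serialize the structure ourselves, so we know it is JSON.
        # No charset parameter: none is defined for application/json.
        return json.dumps(body).encode("utf-8"), "application/json"
    raise TypeError("unsupported body type: %r" % (type(body),))
```

With something like this, a caller would pass the returned bytes as data= and the content type in headers= to treq.post, instead of handing treq an unencoded string and getting the 'Could not adapt' error from the subject line.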

-glyph
