[Twisted-Python] Sending other things than strings in UDP packets
exarkun at divmod.com
Fri Oct 8 18:49:53 EDT 2004
On Fri, 8 Oct 2004 14:42:12 -0700, Paul Campbell <paul at ref.nmedia.net> wrote:
>> Message: 5
> > Date: Thu, 7 Oct 2004 08:53:08 +1100
> > From: Christopher Armstrong <radeex at gmail.com>
> > Subject: Re: [Twisted-Python] Sending other things than strings in UDP
> > packets
> > To: Twisted general discussion <twisted-python at twistedmatrix.com>
> > Message-ID: <60ed19d404100614531935658a at mail.gmail.com>
> > Content-Type: text/plain; charset=US-ASCII
> Christopher Armstrong <radeex at gmail.com> wrote:
> > Ergh. Please don't do anything _close_ to suggesting this. This is not
> > the "python way", it is the "stupid, insecure,
> > let-people-rm-rf-your-home-directory way".
> You apparently read part of the message and then you failed to read the
> next paragraph, right? At the risk of being redundant, let me reiterate:
> "Read the documentation on the pickle module for more information. And be
> forewarned: pickle will dump/load ANYTHING. For safety reasons, there's also
> a 'safe_pickle' variant floating around."
> I guess writing instructive and marginal (and probably nonworking) code for
> helping out newbies is not what this mailing list is all about. I thought I
> said "nonworking", "no error checking", and "lots of issues" enough times
> to get the point across. Sorry, I'll "mail.compose.elitist_mode=true" next
> time and give some flippant answer like "UDP sucks dude. Just use PB under
I'm sure a great many people would appreciate it if you didn't. (I hope you take no insult if I point out that, judging solely on the content of this email, I would say you have already enabled this mode. I'm sure it is just a fluke, though.)
> At the risk of leaving you hanging with regards to whether such variants
> actually exist, below are pointers to two that can be used off the shelf, and
> possibly a third already contained within the twisted code base.
> Here's one variant that includes a "safe pickle" call:
I think there is a simple misunderstanding here. A great many people assume that "pickle" refers to one of the two stdlib modules, "pickle" or "cPickle". You seem to be using it to refer to any arbitrary or semi-arbitrary serialization module.
Both usages certainly have their place, but when people confuse them, problems can often ensue!
> Incidentally, the protocol contains a lot of the extras that I mentioned
> my stripped-down code was lacking (as well as a few small bugs). It also
> handles long messages and retries as well within UDP. It has an interesting
> "microprotocol" sort of structure (where each layer of the protocol builds
> on the previous one). Read it in addition to the first couple functions
> that handle pickling/unpickling.
I'll take your word on this, as it does not seem central to the particular issue now at hand.
> Another "safe pickle" module is buried in the code for "thecircle"
> (www.thecircle.org.au). Just download it and rip out the "safe_pickle.py"
> module from circlelib. It is stand-alone, and designed for UDP transportation
> (although not currently using Twisted).
Again, a likely source of confusion. thecircle's "safe_pickle.py" module is not actually related to pickle at all, beyond the fact that both it and pickle are used for serializing objects. I won't hold this against you, though :) thecircle's authors should be ashamed for their lack of creativity in module naming!
> I haven't dug really deep, but banana (part of PB) appears to be essentially
> yet another incarnation of exactly the same idea. The code pattern looked
> identical to the two pieces of code I just mentioned. In fact, I haven't
> looked at it but I suspect that even pickle itself has the same pattern,
> other than being more generalized (it will handle executables and instances,
> while the safe variants will reject that).
Banana is closer to the marshal module than to the pickle module. Take Jelly and Banana together, though, and you have something that is quite similar. Jelly + Banana is still _not_ pickle; it just does something similar. Why does this matter? Well...
> The code for all of these modules has an identical structure. It takes
> a structure and walks down it. It reads each piece and codes it in a
> "Type+data" format. It rejects anything that it can't inherently decode
> without aid (such as class instances). In those cases, at least the banana
> variant does allow the possibility to kick it up to a higher level (via
> Jelly) to handle user-level structures.
No, it does not have an identical structure. It is similar, no doubt, but there are many differences worth noting. Jelly and Banana, for example, have _no_ mechanism which will allow an arbitrary function to be specified for execution by the _serializer_ in the _deserializer's_ environment. Pickle does. This is just one of many important differences.
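To make that difference concrete, here is a minimal sketch of the pickle mechanism in question, using the stdlib `__reduce__` hook. The `Payload` class name and the harmless `eval("6 * 7")` call are illustrative assumptions; a hostile sender could name any importable callable and arguments instead.

```python
import pickle


class Payload(object):
    """Hypothetical object whose pickled form asks the loader to call eval()."""

    def __reduce__(self):
        # The (callable, args) pair is written into the pickle stream.
        # On load, pickle calls callable(*args) to "rebuild" the object.
        return (eval, ("6 * 7",))


# Sender side: serialize the object.
data = pickle.dumps(Payload())

# Receiver side: loading runs eval("6 * 7") in the receiver's process.
result = pickle.loads(data)

# What comes back is the return value of that call, not a Payload instance.
print(result)
```

The receiver needs no definition of `Payload` for this to happen: the pickle stream itself names the callable to invoke.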
> The "unpickle" code does exactly the same thing except in reverse, converting
> the coded data back into a structure. There are standards (S-expressions or
> XML) for the format itself but I haven't seen any truly compelling reasons to
> follow those. They seem to add lots of overhead without any additional benefit.
Without going into the advantages and disadvantages of XML and S-expressions, let me just point out this:
Whether you use the pickle format, or any other format (like XML or S-expressions), is mostly irrelevant to the main concern raised in this thread: security. Pickle was not written to be secure. It is not maintained with security in mind. It has some features which take it some distance towards the goal of being secure, but how close they get it is hard to say. Personally, I would not use any serialization tool as complex as pickle in a security-sensitive environment without first having it audited by some very smart people. The CPython developers have flatly stated that they are not focused on making it secure. That's fine. I'm not going to demand they make it completely bulletproof. I'll happily continue using it _far away from untrusted data sources_ and use an alternative, such as Jelly + Banana, for the cases where I need to communicate with untrusted parties.
I recommend everyone else do the same.
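As a rough illustration of what a data-only serializer looks like, here is a minimal sketch of the "type + data" pattern described in the quoted text. This is not Banana's wire format or thecircle's; the tag bytes, the 4-byte length prefix, and the `encode`/`decode` names are all assumptions made for the example. The point is only that the decoder never calls anything named by the incoming data.

```python
import struct


def encode(obj):
    """Encode int, bytes, str, and lists of those as tag + length + data."""
    kind = type(obj)  # exact type match: subclasses and instances are refused
    if kind is int:
        body = str(obj).encode("ascii")
        return b"i" + struct.pack("!I", len(body)) + body
    if kind is bytes:
        return b"b" + struct.pack("!I", len(obj)) + obj
    if kind is str:
        body = obj.encode("utf-8")
        return b"s" + struct.pack("!I", len(body)) + body
    if kind is list:
        # For lists the 4-byte field holds the item count, not a byte length.
        return b"l" + struct.pack("!I", len(obj)) + b"".join(
            encode(item) for item in obj)
    raise TypeError("refusing to encode %s" % kind.__name__)


def decode(data, offset=0):
    """Return (value, next_offset). Unknown tags are rejected, never executed."""
    tag = data[offset:offset + 1]
    (n,) = struct.unpack_from("!I", data, offset + 1)
    offset += 5
    if tag == b"l":
        items = []
        for _ in range(n):
            item, offset = decode(data, offset)
            items.append(item)
        return items, offset
    body = data[offset:offset + n]
    if len(body) != n:
        raise ValueError("truncated message")
    offset += n
    if tag == b"i":
        return int(body.decode("ascii")), offset
    if tag == b"b":
        return body, offset
    if tag == b"s":
        return body.decode("utf-8"), offset
    raise ValueError("unknown type tag %r" % tag)


message = [1, "two", b"\x03", [4, 5]]
wire = encode(message)            # bytes suitable for a single UDP datagram
value, end = decode(wire)
```

A real implementation would also need limits on message size and nesting depth, which this sketch omits.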