[Twisted-Python] Safe Pickling using banana and jelly

Heiko Wundram heikowu at ceosg.de
Wed May 28 09:09:46 EDT 2003


On Wed, 2003-05-28 at 06:45, Andrew Bennetts wrote:
> Twisted Spread is intended to be safe and secure, because data from the net
> could be from anywhere, so it has to be robust in the face of dangerous
> data.

I know, I was just being a little smartass... ;)

> That's what __getstate__ and __setstate__ are for with pickle; IIRC Jelly
> supports that and also its own 'getStateToCopyFor' or something along those
> lines.

Sure, but both of them are looked up only on the most-derived class in
the inheritance tree, which, when I use private variables, has little
chance of seeing the variables defined in its base classes in a suitable
way. And returning __dict__ is a little strange, as reassigning to
__dict__ in the __setstate__ method won't fix what I need...
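
A minimal sketch of the problem, written in present-day Python 3 syntax.
The attribute names and the peek_address() helper are made up for
illustration; Host and LocalHost are the classes discussed further down.
Double-underscore names are mangled per class, so the subclass cannot
reach the base class's private members under their plain names, and a
single __getstate__ on the subclass has to deal with the mangled keys in
__dict__:

```python
class Host:
    def __init__(self, address):
        self.__address = address            # stored as _Host__address


class LocalHost(Host):
    def __init__(self, address, private_key):
        super().__init__(address)
        self.__private_key = private_key    # stored as _LocalHost__private_key

    def peek_address(self):
        # Hypothetical helper: inside LocalHost this name is mangled to
        # _LocalHost__address, which does not exist on the instance.
        return self.__address

    def __getstate__(self):
        # Only the most-derived __getstate__ is consulted; Host gets no say.
        state = self.__dict__.copy()
        del state["_LocalHost__private_key"]
        return state


h = LocalHost("10.0.0.1", b"secret")
attribute_names = sorted(vars(h))
state = h.__getstate__()

try:
    h.peek_address()
    base_private_visible = True
except AttributeError:
    base_private_visible = False
```

The subclass can of course spell out _Host__address by hand, but then it
depends on the internals of every class below it.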

> I'm sure PB can handle this.  Have you read "PB Copyable: Passing Complex
> Types", http://twistedmatrix.com/documents/howto/pb-copyable? 

This is not about SenderPonds and ReceiverPonds, but about the simple
fact that LocalHost adds a few data members which should never get sent
over the wire (e.g. the private key). Of course, I could use
getStateToCopyFor() to return only the data members that I want to be
available in the remote object, but I think that's kind of a hack...
(remember, all the data elements in all my classes are private) I think
a solution which calls only the base classes' __serialize methods (in
this example starting with Host) is cleaner. (That's what aliases are
all about.)
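
A rough sketch of the alias idea as described above, not the actual
serializer: the serialize() function, the aliases argument, and the field
names are all assumptions made for illustration. Each class contributes
its own private state through its own (name-mangled) __serialize method,
and an alias makes the walk start at Host instead of LocalHost:

```python
class Host:
    def __init__(self, address, timeout):
        self.__address = address
        self.__timeout = timeout

    def __serialize(self):                  # mangled to _Host__serialize
        return {"address": self.__address, "timeout": self.__timeout}


class LocalHost(Host):
    def __init__(self, address, timeout, private_key):
        super().__init__(address, timeout)
        self.__private_key = private_key

    def __serialize(self):                  # mangled to _LocalHost__serialize
        return {"private_key": self.__private_key}


def serialize(obj, aliases=None):
    """Collect per-class state, starting at the alias target if one is set."""
    aliases = aliases or {}
    start = aliases.get(type(obj), type(obj))
    state = {}
    for cls in start.__mro__:
        # Look up the hook that this class itself defines, if any.
        hook = cls.__dict__.get("_%s__serialize" % cls.__name__)
        if hook is not None:
            state[cls.__name__] = hook(obj)
    return start.__name__, state


local = LocalHost("10.0.0.1", 30, b"secret")
full = serialize(local)
sent = serialize(local, aliases={LocalHost: Host})
```

With the alias in place the LocalHost hook is never called, so the
private key cannot end up in the result by accident.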

> > 3. Support for different serialization protocols. I guess I've not made
> I'm not quite sure what you're saying here.

The serializer supports different protocols right from the start. The
data stream has a fixed format, but objects can choose to interpret the
unpickled data differently according to the serialization protocol
that's chosen. What this is good for: I have serialization to the
network and serialization to a database. Serializing to the database
shouldn't change timeouts; serializing to the network should (as I
stated last time). The database should store private keys; the network
shouldn't. Easy solution: Protocol 1 is the network protocol, which has
an alias from LocalHost to Host as specified under 2; Protocol 2 is the
database protocol, which doesn't have this alias.
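
The two protocols could be sketched roughly like this. It is only an
illustration under assumptions: the NETWORK/DATABASE constants, the
ALIASES table, the per-class "fields" tuples, and dump() are invented
names, and public attributes are used to keep the example short:

```python
NETWORK, DATABASE = 1, 2


class Host:
    fields = ("address", "timeout")

    def __init__(self, address, timeout):
        self.address = address
        self.timeout = timeout


class LocalHost(Host):
    fields = ("private_key",)

    def __init__(self, address, timeout, private_key):
        super().__init__(address, timeout)
        self.private_key = private_key


# Per-protocol alias tables: only the network protocol maps LocalHost to Host.
ALIASES = {
    NETWORK: {LocalHost: Host},
    DATABASE: {},
}


def dump(obj, protocol):
    """Return (class name, state) as seen by the given protocol."""
    start = ALIASES[protocol].get(type(obj), type(obj))
    state = {}
    for cls in start.__mro__:
        # Each class lists only the fields it defines itself.
        for name in cls.__dict__.get("fields", ()):
            state[name] = getattr(obj, name)
    return start.__name__, state


local = LocalHost("10.0.0.1", 30, b"secret")
to_network = dump(local, NETWORK)
to_database = dump(local, DATABASE)
```

The same object therefore travels as a plain Host over the network and
is stored as a full LocalHost, private key included, in the database.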

> Again, read http://twistedmatrix.com/documents/howto/pb-copyable.  With PB
> you can control exactly what data is sent over the wire.  You also get
> Referenceables, Copyables and Cacheables, which is considerably richer than
> a simple object protocol that just passes the occasional object-graph.

I'm not using PB, and I don't want to in this project. PB is fine when
you have to deal with complex objects (just as Pyro is in that case),
but normally you only have to send simple objects over the wire, and I
guess you're much more worried that objects might be instantiated
wrongly, need some parameter checking, or get corrupted in transit,
accidentally or willfully. Control over this "dark side of networking"
is exactly what this pickler tries to achieve, by giving you full
control over the pickling and unpickling process.

> TCP guarantees the data isn't corrupted on the way through.  Use SSL to 
> guarantee that the traffic isn't being sniffed or tampered with in transit.
> PB has an authentication step so that you can know who the clients are.
> Between all this, I'm not sure what problem your signatures are solving that
> isn't already solved?

The transport I use isn't TCP, but UDP. And rewriting PB to use UDP
would be a little strange... :) Anyway, what I actually meant by
signatures: You can use public key encryption/signing to encrypt members
of your data stream. Classes can request to have certain members
encrypted/signed in the serialization stream, just as you can for the
whole pickle, and when unserializing, you can specify what to do when
non-decryptable packets or invalid signature packets are found, without
discarding the whole pickle (which is useful if you need to pass around
structures whose "text" content is private, but whose timeout and the
like should nevertheless stay readable; the class in question is called
"Data").
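
A small sketch of the "don't discard the whole thing" behaviour. Note the
swap: it uses a shared-key HMAC from the standard library as a stand-in
for the public key signatures described above, and pack_data(),
unpack_data(), and the on_bad_signature parameter are hypothetical names.
Only the "text" member is signed; the timeout stays readable (and, in
this sketch, unauthenticated) whatever happens to the text:

```python
import hashlib
import hmac


def sign(key, payload):
    # Stand-in for a real public key signature.
    return hmac.new(key, payload, hashlib.sha256).digest()


def pack_data(key, timeout, text):
    body = text.encode("utf-8")
    return {"timeout": timeout, "text": body, "text_sig": sign(key, body)}


def unpack_data(key, packet, on_bad_signature=None):
    """Rebuild the record; a bad signature only affects the signed member."""
    body = packet["text"]
    if hmac.compare_digest(sign(key, body), packet["text_sig"]):
        text = body.decode("utf-8")
    else:
        # The caller decides what replaces a member that fails verification.
        text = on_bad_signature
    return {"timeout": packet["timeout"], "text": text}


key = b"shared secret"
packet = pack_data(key, 60, "hello")
good = unpack_data(key, packet)

tampered = dict(packet, text=b"hellp")
bad = unpack_data(key, tampered)
```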

> As far as I can see (although I'm no PB expert), PB satisfies 1 and 4.  What
> makes you think it doesn't?

PB could certainly be crafted to satisfy 1+4, but it certainly doesn't
do UDP transport, which I need... And I don't think any PB code which
handles most of the points above is as clean as it is with my own
serializer.

Which isn't bad, of course! PB was just designed with completely
different ideas in mind...

Again just my 5 cents... :)

Heiko.




