[Twisted-Python] Re: Disabling PB (de)serialization

Nicola Larosa nico at tekNico.net
Thu Sep 8 00:35:04 MDT 2005


>> Mostly, *very* long lists of small objects, each containing a few numbers
>> and short strings.

> What order of magnitude are we talking about? Would something like
> [(1,12,"foo","bar") for i in range(10000)] be close? If so, I'll use this as
> one of the benchmark cases.

Make that

[(0,1,2,3,4,5,6,7,8.0,9.0,'f','fo','foo','foob','fooba','foobar')
    for i in range(10**6)]

:-)
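
A rough sketch of how a payload of that shape could be timed. The helper
names (make_payload, time_roundtrip) are made up for illustration, and json
is only a stand-in for the PB serializer (an assumption, not what PB does),
so the absolute numbers mean little; the structure of the test case is the
point.

```python
import json
import time


def make_payload(n):
    # The 16-field record from the message above, repeated n times.
    return [(0, 1, 2, 3, 4, 5, 6, 7, 8.0, 9.0,
             'f', 'fo', 'foo', 'foob', 'fooba', 'foobar')
            for i in range(n)]


def time_roundtrip(payload):
    # Stand-in for PB (de)serialization (assumption): JSON encode + decode.
    start = time.perf_counter()
    data = json.dumps(payload).encode("utf-8")
    restored = json.loads(data.decode("utf-8"))
    elapsed = time.perf_counter() - start
    # Return the serialized size in bytes, the elapsed seconds and the
    # decoded result (JSON turns the tuples into lists).
    return len(data), elapsed, restored
```

Calling time_roundtrip(make_payload(10**6)) gives the full-size case; a
smaller n is handy for a quick check first.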


> FYI, newpb is scheduled to have an opportunistic string-caching scheme in
> which any string that gets sent over the wire more than a couple times gets
> replaced by a VOCAB token with a number. The idea is to compress all the
> standard internal PB sequences (like "list", "tuple", "my-reference", "call")
> into short two-byte tokens, and for the sender to decide which strings get
> tokenized these ways (there will be a special sequence that adds/removes
> things from the receiver's mapping).

That would be great: in my case the total number of different strings is
rather small. But instead of spending effort on that, you may want to
consider ways of transparently compressing and decompressing the serialized
stream with a standard algorithm.
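
To make the idea concrete, here is a minimal sketch using zlib from the
standard library as the "standard algorithm" (my choice for illustration).
The function names are hypothetical, json again stands in for the PB wire
format, and nothing here is tied to how PB or newpb actually write to the
transport. The sender feeds serialized chunks through one compressor
object, the receiver through one decompressor object.

```python
import json
import zlib


def make_payload(n):
    # Same record shape as the benchmark case, repeated n times.
    return [(0, 1, 2, 3, 4, 5, 6, 7, 8.0, 9.0,
             'f', 'fo', 'foo', 'foob', 'fooba', 'foobar')
            for i in range(n)]


def serialize(payload):
    # Stand-in serializer (assumption): JSON text, not PB's wire format.
    return json.dumps(payload).encode("utf-8")


def split(data, size=4096):
    # Cut the serialized bytes into chunks, as a transport would see them.
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_chunks(chunks, level=6):
    # One compressor for the whole stream, so repeated strings in later
    # chunks can refer back to earlier ones.
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    # Emit whatever is still buffered at end of stream.
    yield compressor.flush()


def decompress_chunks(chunks):
    # Mirror image on the receiving side.
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail
```

With data as repetitive as mine, the compressed chunks should come out much
smaller than the serialized input, and joining the output of
decompress_chunks gives back the original bytes.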

-- 
Nicola Larosa - nico at tekNico.net

I love Apache, but in the same way I love my wife: with some trepidation.
Fast and stable, flexible and reliable, but make one little syntax error
and you can lose your ass. -- legLess on Slashdot, July 2005




