[Twisted-Python] Re: Memory size of Deferreds

Wed May 21 03:56:30 EDT 2008

Michał Pasternak <michal.dtz at gmail.com> writes:

>>   def add(x, y):
>>       sum = gatherResults([x, y])
>>       sum.addCallback(lambda results: results[0] + results[1])
>>       return sum
>> 
>> Multiplication is similar, but requires network communication --
>> that is why it returns a Deferred.
>
> I don't think that there's anything wrong with this example in terms
> of being written using Twisted Python.

Okay, thanks for looking at it!

>> The important point is that as much as possible is done in
>> parallel, that all operations run as soon as their operands are
>> available. Twisted made this extremely easy thanks to the
>> DeferredList and Deferred abstraction.
>
> This is the point where I think, that you answered your question. If
> you do stuff in parallel, you use more Deferreds and more memory.

Yes, I guess you are right... :-) The design tries to do everything at
once with as much parallism as possible. But since bandwidth is
limited, I guess I could achieve the same throughput by scheduling
things in a slower rate. Maybe using something like this:

  class map_parallel(Deferred):
      def __init__(self, iterable, func, max):
          Deferred.__init__(self)

          self.iter = iterable
          self.func = func
          self.running = 0

          for _ in range(max):
              self.running += 1
              self.run_next(None)

      def run_next(self, arg):
          try:
              e = self.iter.next()
              e.addCallback(self.func)
              e.addCallback(self.run_next)
          except StopIteration:
              # The iterator is exhausted.
              self.running -= 1
              if self.running == 0 and not self.called:
                  # All callbacks have finished.
                  self.callback(None)
          # Pass the argument through unchanged.
          return arg

where map_parallel will map a function over a list of Deferreds, but
it will only run up to max function applications in parallel. I will
experiment with it, and if it works well, then it would be cool to
have a class like this in Twisted.

> I will repeat myself now: I would rather try to find other places to
> optimize VIFF, than reducing memory footprint of Deferred. I don't
> have a CS degree, I'm not a mathematician, but from what you wrote
> it seems that maybe it would be worth to try implementing something
> stack-based (like RPN), than the tree approach you seem to be doing
> now - of course if having a stack is something you can have using
> secure multiparty calculations.

Thanks for the ideas! You could certainly do what you suggest,
changing the schedule like that is okay from a security point of view.
And if I can delay allocating memory until needed by doing so, then it
will definitely be interesting to try.

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multi-Party Computation) to Python. See: http://viff.dk/.