[Twisted-Python] running 1,000,000 tasks, 40 at-a-time

Jason Rennie jrennie at gmail.com
Wed Oct 26 10:02:01 EDT 2011


The background:

I've been using DeferredSemaphore and DeferredList to manage the running of
tasks under a resource constraint (only so many tasks can run at the same
time).  This worked great until I tried to use it to manage millions of
tasks.  Simply setting them up to run (the DeferredSemaphore.run() calls)
took approximately 2 hours and used ~5 GB of RAM.  This was less efficient
than I expected.  Note that these numbers don't include the time/memory for
actually running the tasks, only the time/memory to set them up.  I've since
written a custom task runner that uses comparatively little setup
time/memory by adding a "manager" callback to each task, which starts
additional tasks as appropriate.

My questions:

   - Is the behavior I'm seeing expected?  i.e. are DS/DL only recommended
   for task management if the number of tasks is not too large?  Is there a
   better way to use DS/DL that I might not be thinking of?
   - Is there a Twisted pattern for managing tasks efficiently that I might
   be missing?

Thanks,

Jason