[Twisted-Python] clientfactory cleanup slow-down (after many http requests)

Randomcoder randomcoder1 at gmail.com
Sat Aug 6 04:48:23 MDT 2016


Hello,

I've been working on a small Twisted program.
The program makes HTTP requests to a large number of feeds.
Twisted is used so that the requests run concurrently instead of one at a time.
After the feeds are fetched, they're parsed. Finally they should be
written to a database (to simplify the code, that part is left out).

Feeds are fetched in parallel. The Deferreds for the feeds in one batch
are combined with gatherResults, and the resulting per-batch Deferreds
are then collected into a DeferredList. Two semaphores limit concurrency:
one for the Deferreds inside a batch, and one for the list of batches
as a whole.

Currently, the program works fine with 100-150 feeds and a BATCH_SIZE
between 5 and 20.

However, once the number of feeds goes over 150-200, the program appears
to hang for a long time.

To be more precise, near the end of a run, messages like the following
are printed, but the program otherwise shows very little activity:

    Stopping factory <twisted.web.client._HTTP11ClientFactory instance at 0x7f0b7d5f3908>

It seems like this is the cleanup phase.

I've read what I could find on the topic but wasn't able to make progress,
so I'm posting to the mailing list to ask whether anyone has encountered
this before. Maybe it's a common pitfall that other people have also
run into.

Thanks
-------------- next part --------------
http://mauveweb.co.uk/rss.xml "python"
http://blog.hownowstephen.com/rss "python"
http://blog.codepainters.com/feed/ "python"
http://chase-seibert.github.io/blog/atom.xml "python"
http://www.lshift.net/blog/feed/ "python"
http://django-planet.com/feeds/main/rss/ "python"
http://eflorenzano.com/atom.xml "python"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: collect.py
Type: text/x-python
Size: 6818 bytes
Desc: not available
URL: <http://twistedmatrix.com/pipermail/twisted-python/attachments/20160806/fbf5912d/attachment.py>

