[Twisted-web] Limit the simultaneous twisted.web.client.downloadPage requests

Igor Katson descentspb at gmail.com
Sat Oct 24 09:20:49 EDT 2009


I am new to Twisted, so sorry if my question sounds awkward.

I have written a fairly simple recursive page downloader, which parses 
an HTML page, extracts all the needed links from it, and starts 
downloading them. The links point to video files, so the downloads are 
pretty large. The problem is that the downloader works TOO FAST :) I 
want to set something like a global bandwidth limit or a maximum number 
of concurrently downloading files.

I am using twisted.web.client.downloadPage to download the files, and I 
use the Deferred that it returns.
I can't work out how to make it still return a Deferred corresponding 
to that file without starting the download right away, and instead 
start the download on some kind of event (that is, make a manager-like 
wrapper for that function).

So I want the code to still look simple like this:

for link in links:
    d = downloadPage_limited(link, filename)

And the wrapper (the function downloadPage_limited) will manage the 
number of concurrent downloads, and will still return the Deferred that 
twisted.web.client.downloadPage would return.

Is my idea of a "wrapper" practical, and what is the general way to 
write it?
On which event is it best to decrement the counter of currently 
downloading files?

Hope it is clear enough.

Thanks in advance,
Igor Katson.
