[Twisted-Python] Non-blocking http client?

Andreas Kostyrka andreas at kostyrka.org
Sat Dec 13 19:21:51 EST 2008


On Sat, Dec 13, 2008 at 10:08:43AM -0800, Erik Wickstrom wrote:
> Hi all,
> 
> I have an application that is doing some web spidering.  Right now I'm
> using urllib to retrieve the URLs, but it is painfully slow.  I was
> wondering if it's feasible to swap out urllib with a twisted client
> that uses Deferreds so I can process URLs in a more "parallel" fashion?
> 
> I've done a bunch of Googling, but I haven't come across anything
> that I can use as a drop-in replacement.  If you can point me in the
> right direction I'd really appreciate it!

Well, twisted.web.client is your path to DoS fame: it will happily open as
many connections at once as you ask for. If you intend to use it for
spidering third-party websites, I'd recommend a small dispatcher object that
limits the number of concurrent connections per target server.

Andreas

> 
> Thanks for your help!
> Erik



