[Twisted-web] How to use the HTTPClientFactory connect one time and
get more than 1000 page?
Cheney Lee
ironpythonster at gmail.com
Wed Jan 23 03:46:50 EST 2008
Hi,
This is my first time using Twisted.
I want a function that takes a URL and returns the web page.
The code is:
# some code that calls getPage
while id <= 10000000:
    getPage("http://www.mywebsite.com/News.aspx?ID=" + str(id))
    id += 1
# ------------------------------------------
getPage is defined in twisted.web.client as:
def getPage(url, contextFactory=None, *args, **kwargs):
    """Download a web page as a string.

    Download a page. Return a deferred, which will callback with a
    page (as a string) or errback with a description of the error.

    See HTTPClientFactory to see what extra args can be passed.
    """
    scheme, host, port, path = _parse(url)
    factory = HTTPClientFactory(url, *args, **kwargs)
    if scheme == 'https':
        from twisted.internet import ssl
        if contextFactory is None:
            contextFactory = ssl.ClientContextFactory()
        reactor.connectSSL(host, port, factory, contextFactory)
    else:
        reactor.connectTCP(host, port, factory)
    return factory.deferred
--------------------------------------------------------------------------------
Question:
Each call to getPage opens a new connection, so fetching 10000 pages means
opening and closing a connection 10000 times, which is very expensive.
Can anybody give me some advice? Should I create a class that inherits from
HTTPPageGetter (the protocol class) or from HTTPClientFactory?
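
To make the goal concrete, here is a minimal sketch of the behaviour being asked for: one TCP connection, many sequential GET requests over HTTP/1.1 keep-alive. It deliberately uses the standard library's http.client instead of Twisted so that it runs without extra dependencies; it is not a Twisted solution. The names EchoHandler, fetch_many and demo, and the throwaway local server, are assumptions made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: echoes the requested path back."""

    # HTTP/1.1 is needed for the server to keep the connection open.
    protocol_version = "HTTP/1.1"
    # Records the (host, port) of every client connection that sent a request.
    seen_clients = set()

    def do_GET(self):
        EchoHandler.seen_clients.add(self.client_address)
        body = ("page for " + self.path).encode("ascii")
        self.send_response(200)
        # An explicit Content-Length lets the client know where the body
        # ends, so the connection can be reused for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_many(host, port, paths):
    """Fetch every path in order over a single TCP connection."""
    conn = http.client.HTTPConnection(host, port)
    pages = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read in full before the next request is sent.
            pages.append(response.read())
    finally:
        conn.close()
    return pages


def demo():
    """Run fetch_many against a local server; return (pages, connection count)."""
    EchoHandler.seen_clients.clear()
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        paths = ["/News.aspx?ID=%d" % i for i in range(1, 4)]
        pages = fetch_many("127.0.0.1", server.server_port, paths)
    finally:
        server.shutdown()
        server.server_close()
    return pages, len(EchoHandler.seen_clients)
```

Calling demo() fetches three pages and reports how many distinct client connections the server saw. Reuse only works if the remote server honours keep-alive; a server that closes the connection after each response would force a reconnect for every page.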