[Twisted-Python] Scalability of an rss-aggregator

Andrew Bennetts andrew-twisted at puzzling.org
Wed Mar 31 05:39:27 EST 2004


On Wed, Mar 31, 2004 at 09:33:58AM +0200, Valentino Volonghi aka Dialtone wrote:
> Hi all,
> attached you will find my rss-aggregator made with twisted.
> 
> It's really fast although when I tried with 745 feeds I got some problems.
> When the download reached 300 parsed feeds (more or less) it locked till 
> I pressed Ctrl+C and then it
> processed the remaining 340 feeds in less than 30 seconds... I think 
> that my design has at least an issue
> but  I cannot find it so easily and I hope someone on this list can help 
> me to improve it.

By default, Twisted uses the platform name resolver, which is blocking.
Perhaps a non-existent domain is causing gethostbyname to block?

You should be able to test this theory by installing Twisted's resolver:

    from twisted.internet import reactor
    from twisted.names import client
    reactor.installResolver(client.createResolver())

client.createResolver makes a reasonable effort to use your system's DNS
configuration (by looking at /etc/resolv.conf on POSIX systems, for
example), so it should work without any special arguments.
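
If you want to check the theory independently of Twisted, one rough
option is to time the platform resolver directly on the hostnames from
your feed list. The sketch below is only an illustration: time_lookup
and slow_hosts are hypothetical helper names, not part of Twisted or of
your script, and the one-second threshold is an arbitrary assumption.

```python
import socket
import time


def time_lookup(host, resolve=socket.gethostbyname):
    """Return (seconds_blocked, result_or_error) for one blocking lookup."""
    start = time.monotonic()
    try:
        result = resolve(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result = exc
    return time.monotonic() - start, result


def slow_hosts(hosts, threshold=1.0, resolve=socket.gethostbyname):
    """List (host, seconds) for lookups that blocked at least `threshold` seconds."""
    slow = []
    for host in hosts:
        elapsed, _ = time_lookup(host, resolve)
        if elapsed >= threshold:
            slow.append((host, elapsed))
    return slow
```

Calling slow_hosts() with the hostnames of your feeds should show
whether any single lookup stalls for long enough to explain the hang.
The resolve argument exists only so the lookup function can be swapped
out, for example when trying the helpers without network access.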

> The "script" is heavily commented.
> 
> BTW When it finishes (with all 740 feeds) it reports an awesome 330 
> seconds which is an impressive time, less than half a second
> for each feed, and It downloads more than 50Mb of feeds from the net 
> (with 745 feeds to download).

Nice!

> Thx for your help.

Not a problem.  Let us know if it helps.

-Andrew.




