[Twisted-Python] Scalability of an rss-aggregator

Valentino Volonghi aka Dialtone dialtone at aruba.it
Wed Mar 31 06:27:49 EST 2004


Andrew Bennetts wrote:

>On Wed, Mar 31, 2004 at 09:33:58AM +0200, Valentino Volonghi aka Dialtone wrote:
>  
>
>>Hi all,
>>attached you will find my RSS aggregator, made with Twisted.
>>
>>It's really fast, although when I tried it with 745 feeds I ran into a
>>problem. When the download reached roughly 300 parsed feeds it locked up
>>until I pressed Ctrl+C, and then it processed the remaining 340 feeds in
>>less than 30 seconds. I think my design has at least one issue, but I
>>cannot easily find it, and I hope someone on this list can help me
>>improve it.
>>    
>>
>
>By default, Twisted uses the platform name resolver, which is blocking.
>Perhaps a non-existent domain is causing gethostbyname to block?
>
>  
>
Hmm... I don't know, but I tried removing the feed source where it
locked and nothing changed.

>You should be able to test this theory by installing Twisted's resolver:
>
>    from twisted.internet import reactor
>    from twisted.names import client
>    reactor.installResolver(client.createResolver())
>
>client.createResolver makes a reasonable effort to use your system's DNS
>configuration (by looking at /etc/resolv.conf on POSIX systems, for
>example), so it should work without any special arguments.
>
>  
>
OK, that turns it into a totally non-working script :)

I get a lot of these:
[Failure instance: Traceback: exceptions.TypeError, unsubscriptable object
/usr/lib/python2.3/site-packages/twisted/internet/defer.py:313:_runCallbacks
/usr/lib/python2.3/site-packages/twisted/names/resolve.py:44:__call__
/usr/lib/python2.3/site-packages/twisted/names/common.py:36:query
/usr/lib/python2.3/site-packages/twisted/names/common.py:104:lookupAllRecords
/usr/lib/python2.3/site-packages/twisted/names/client.py:266:_lookup
/usr/lib/python2.3/site-packages/twisted/names/client.py:214:queryUDP
]
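
With the Twisted resolver failing like this, the blocking-lookup theory can still be checked on its own by timing each hostname lookup with the standard library. This is only a minimal sketch: the names time_lookup and slow_hosts, the injectable resolve parameter, and the one-second threshold are illustrative assumptions, not part of the attached script or of Twisted.

```python
import socket
import time


def time_lookup(host, resolve=socket.gethostbyname):
    """Return (elapsed_seconds, address_or_None) for one blocking lookup.

    `resolve` is a hypothetical hook so the helper can be tried
    without touching the network.
    """
    start = time.monotonic()
    try:
        address = resolve(host)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        address = None
    return time.monotonic() - start, address


def slow_hosts(hosts, threshold=1.0, resolve=socket.gethostbyname):
    """Return [(host, elapsed)] for lookups slower than `threshold` seconds."""
    slow = []
    for host in hosts:
        elapsed, _ = time_lookup(host, resolve)
        if elapsed > threshold:
            slow.append((host, elapsed))
    return slow
```

Running slow_hosts over the hostnames of the feed URLs should show whether any single lookup stalls for long enough to explain the hang. If nothing comes out slow, the blocking resolver is probably not the culprit.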

>>BTW, when it finishes (with all 740 feeds) it reports an awesome 330
>>seconds, which is an impressive time: less than half a second for each
>>feed. It downloads more than 50 MB of feeds from the net (with 745
>>feeds to download).
>>    
>>
>
>Nice!
>
>  
>
Yup, I was going to ask the Straw developers to use my script instead
of asyncore. Straw has a lot of problems with 200 feeds, i.e. it resets
the connection and such. This would be an awesome improvement.

-- 
Valentino Volonghi aka Dialtone
Linux User #310274, Gentoo Proud User
X Python Newsreader developer
http://sourceforge.net/projects/xpn/




