[Twisted-Python] Scalability of an rss-aggregator

Valentino Volonghi aka Dialtone dialtone at aruba.it
Wed Mar 31 08:03:23 EST 2004


Andrew Bennetts wrote:

>Hmm, it's unlikely to be DNS lookups causing it, then.
>
>We need some way to narrow down where it's happening.  There are a few
>options I can think of, but they're all a bit heavyweight...
>
>  - Use strace to get some idea what it's doing
>  - Use the --spew option of twistd (or manually install the spewer with
>    "from twisted.python.util import spewer; sys.settrace(spewer)")
>  - Use gdb to attach to the process, then look at the backtrace there.
>
>(You can apparently get the python backtrace in gdb by putting this macro in
>your .gdbinit:
>
>define ppystack
>    while $pc < Py_Main || $pc > Py_GetArgcArgv
>        if $pc > eval_frame && $pc < PyEval_EvalCodeEx
>            set $__fn = PyString_AsString(co->co_filename)
>            set $__n = PyString_AsString(co->co_name)
>            printf "%s (%d): %s\n",  $__fn, f->f_lineno, $__n
>        end
>        up-silently 1
>    end
>    select-frame 0
>end
>
>But I've never tried this...
>)
>
>Is it possible that feedparser is hanging on trying to parse that feed?
>Perhaps try putting print statements before and after the
>feedparser.parse call.
>
>  
>
Maybe the problem is there, but that wouldn't answer the other question:
"Why does it take at most 30 seconds to parse all the remaining 350 feeds?"
There is no network activity after the unlocking "Ctrl+C"...
Gotta investigate then.
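
One way to test the "feedparser is hanging" theory is to print a line
before and after each parse call and record how long it took. The
following is only a minimal sketch in present-day Python 3: timed_call
and fake_parse are hypothetical names, and fake_parse is a stand-in so
the example runs without feedparser installed. In the real script the
function passed in would be feedparser.parse.

```python
import time


def timed_call(label, func, *args, **kwargs):
    """Call func, print a line before and after, return (result, seconds)."""
    print("start %s" % label)
    started = time.monotonic()
    result = func(*args, **kwargs)
    elapsed = time.monotonic() - started
    print("done  %s in %.3fs" % (label, elapsed))
    return result, elapsed


def fake_parse(data):
    # Hypothetical stand-in for feedparser.parse (an assumption, not the
    # real parser): it just splits the input into "entries".
    return {"entries": data.split()}


result, elapsed = timed_call("example feed", fake_parse, "a b c")
```

If a "start" line shows up for a feed and the matching "done" line never
does, that feed is where the parser is stuck.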

>>>You should be able to test this theory by installing Twisted's resolver:
>>>
>>>  from twisted.names import client
>>>  reactor.installResolver(client.createResolver())
>>>
>>>client.createResolver makes a reasonable effort to use your system's DNS
>>>configuration (by looking at /etc/resolv.conf on posix systems, for
>>>example), so it should work without any special arguments.
>>>
>>>      
>>>
>>OK, that turns it into a totally non-working script :)
>>
>>I get a lot of these:
>>[Failure instance: Traceback: exceptions.TypeError, unsubscriptable object
>>/usr/lib/python2.3/site-packages/twisted/internet/defer.py:313:_runCallbacks
>>/usr/lib/python2.3/site-packages/twisted/names/resolve.py:44:__call__
>>/usr/lib/python2.3/site-packages/twisted/names/common.py:36:query
>>/usr/lib/python2.3/site-packages/twisted/names/common.py:104:lookupAllRecords
>>/usr/lib/python2.3/site-packages/twisted/names/client.py:266:_lookup
>>/usr/lib/python2.3/site-packages/twisted/names/client.py:214:queryUDP
>>]
>>    
>>
>
>Ouch.  I wonder how that bug crept in?  The twisted.names code is expecting a
>sequence of timeouts (to re-issue the query with, until failing at last), but
>twisted.internet is only giving it a single integer.  I've filed a bug
>report for this: http://twistedmatrix.com/bugs/issue570, if you care :)
>  
>
Sure :). This is the second bug I've found; the first one was a
documentation bug: the finger tutorial has some errors :).
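
The mismatch described in the quoted text (code that expects a sequence
of timeouts receiving a single integer) is easy to reproduce in plain
Python. This is only an illustrative sketch, not the Twisted code and
not the actual fix for the bug: normalize_timeouts and first_timeout are
hypothetical names invented for the example.

```python
def first_timeout(timeout):
    # Indexing works on a sequence of timeouts but fails on a bare integer.
    return timeout[0]


def normalize_timeouts(timeout):
    """Return a tuple of timeouts, accepting one number or a sequence."""
    if isinstance(timeout, (int, float)):
        return (timeout,)
    return tuple(timeout)


# Passing a single integer where a sequence is expected raises TypeError.
try:
    first_timeout(10)
except TypeError as exc:
    error = str(exc)

# Normalizing first makes both calling styles safe.
retry_schedule = normalize_timeouts(10)
first = first_timeout(retry_schedule)
```

Accepting both forms at the boundary is one defensive option; the other
is to make the caller always pass a sequence.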

>Absolutely.  I've heard similar complaints about straw, and I've been hoping
>some keen person would apply Twisted to fix the problem :)
>  
>
That was my hope too, but since a friend of mine asked for an
RSS aggregator made with Twisted... I realized that someone
wants me to be that keen person. Oooohhhh, what does fate have in
store for me? Ooooooohhhhh :P

-- 
Valentino Volonghi aka Dialtone
Linux User #310274, Gentoo Proud User
X Python Newsreader developer
http://sourceforge.net/projects/xpn/




