[Twisted-Python] Scalability of an rss-aggregator

Valentino Volonghi aka Dialtone dialtone at aruba.it
Wed Mar 31 08:03:23 EST 2004

Andrew Bennetts wrote:

>Hmm, it's unlikely to be DNS lookups causing it, then.
>We need some way to narrow down where it's happening.  There are a few
>options I can think of, but they're all a bit heavyweight...
>  - Use strace to get some idea what it's doing
>  - Use the --spew option of twistd (or manually install the spewer with
>    "from twisted.python.util import spewer; sys.settrace(spewer)")
>  - Use gdb to attach to the process, and then look at the backtrace there.
>(You can apparently get the python backtrace in gdb by putting this macro in
>your .gdbinit:
>define ppystack
>    while $pc < Py_Main || $pc > Py_GetArgcArgv
>        if $pc > eval_frame && $pc < PyEval_EvalCodeEx
>            set $__fn = PyString_AsString(co->co_filename)
>            set $__n = PyString_AsString(co->co_name)
>            printf "%s (%d): %s\n",  $__fn, f->f_lineno, $__n
>        end
>        up-silently 1
>    end
>    select-frame 0
>end
>But I've never tried this...)
>Is it possible that feedparser is hanging on trying to parse that feed?
>Perhaps try putting print statements before and after the
>feedparser.parse call.
Maybe the problem is there, but that wouldn't answer the other question:
"Why does it take at most 30 seconds to parse all the remaining 350 feeds?"
There is no network activity after the Ctrl+C that unblocks it...
Gotta investigate then.
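
For reference, the "print statements before and after" idea could be sketched roughly like this. It is only a sketch: timed_call and fake_parse are hypothetical names made up for illustration, fake_parse stands in for feedparser.parse so the snippet runs without any third-party package, and it is written for a current Python 3 rather than the Python of this thread.

```python
import time

def timed_call(label, func, *args, **kwargs):
    """Print a line before and after calling func, with the elapsed time."""
    # flush=True so the "start" line is visible even if func never returns.
    print("start  %s" % label, flush=True)
    started = time.monotonic()
    try:
        return func(*args, **kwargs)
    finally:
        elapsed = time.monotonic() - started
        print("finish %s after %.3fs" % (label, elapsed), flush=True)

def fake_parse(data):
    # Stand-in for feedparser.parse; hypothetical, only for illustration.
    return {"entries": data.split()}

result = timed_call("feed-1", fake_parse, "a b c")
```

If one feed really hangs the parser, its "start" line shows up with no matching "finish" line, which points at the culprit without strace or gdb.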

>>>You should be able to test this theory by installing Twisted's resolver:
>>>  from twisted.names import client
>>>  reactor.installResolver(client.createResolver())
>>>client.createResolver makes a reasonable effort to use your system's DNS
>>>configuration (by looking at /etc/resolv.conf on posix systems, for
>>>example), so it should work without any special arguments.
>>ok, it changes into a totally non-working script :)
>>I get a lot of these:
>>[Failure instance: Traceback: exceptions.TypeError, unsubscriptable object
>Ouch.  I wonder how that bug crept in?  The twisted.names code is expecting a
>sequence of timeouts (to re-issue the query with, until failing at last), but
>twisted.internet is only giving it a single integer.  I've filed a bug
>report for this: http://twistedmatrix.com/bugs/issue570, if you care :)
Sure :). This is the second bug for me; the first one was a
documentation bug: the finger tutorial has some errors :).
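
To illustrate the kind of mismatch described in the quoted text (a sequence of timeouts expected, a single integer passed), here is a minimal sketch. It is not the actual twisted.names or twisted.internet code: first_timeout and as_timeout_sequence are hypothetical helpers, the timeout values are arbitrary, and the normalization shown is just one possible way to cope, not necessarily how issue570 was resolved.

```python
def first_timeout(timeouts):
    # Hypothetical consumer: expects a sequence such as (1, 3, 11).
    return timeouts[0]

def as_timeout_sequence(timeout):
    # Hypothetical normalizer: accept a single number or a sequence.
    if isinstance(timeout, (int, float)):
        return (timeout,)
    return tuple(timeout)

# Passing a bare integer where a sequence is expected fails on indexing.
try:
    first_timeout(60)
    error = None
except TypeError as exc:
    error = exc

# Wrapping the integer first gives the consumer what it expects.
normalized = as_timeout_sequence(60)
```

Indexing an integer raises TypeError, which matches the "unsubscriptable object" failures quoted in this thread.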

>Absolutely.  I've heard similar complaints about straw, and I've been hoping
>some keen person would apply Twisted to fix the problem :)
That was my hope too, but since a friend of mine asked for an
rss-aggregator made with Twisted... I realized that someone
wants me to be that keen person. Oooohhhh, what does fate have
in store for me? Ooooooohhhhh :P

Valentino Volonghi aka Dialtone
Linux User #310274, Gentoo Proud User
X Python Newsreader developer
