[Twisted-Python] Scalability of an rss-aggregator
Valentino Volonghi aka Dialtone
dialtone at aruba.it
Wed Mar 31 08:03:23 EST 2004
Andrew Bennetts wrote:
>Hmm, it's unlikely to be DNS lookups causing it, then.
>
>We need some way to narrow down where it's happening. There are a few
>options I can think of, but they're all a bit heavyweight...
>
> - Use strace to get some idea what it's doing
> - Use the --spew option of twistd (or manually install the spewer with
> "from twisted.python.util import spewer; sys.settrace(spewer)")
> - Use gdb to attach to the process, then look at the backtrace there.
>
>(You can apparently get the python backtrace in gdb by putting this macro in
>your .gdbinit:
>
>define ppystack
>  while $pc < Py_Main || $pc > Py_GetArgcArgv
>    if $pc > eval_frame && $pc < PyEval_EvalCodeEx
>      set $__fn = PyString_AsString(co->co_filename)
>      set $__n = PyString_AsString(co->co_name)
>      printf "%s (%d): %s\n", $__fn, f->f_lineno, $__n
>    end
>    up-silently 1
>  end
>  select-frame 0
>end
>
>But I've never tried this...
>)
>
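A minimal sketch of the sys.settrace idea mentioned above, using only the standard library. It is not Twisted's spewer, just an illustration of the same hook, written for current Python 3. The names make_line_tracer, traced_call and sample are hypothetical helpers invented for this example.

```python
import sys

def make_line_tracer(log):
    """Return a trace function that appends (function name, line number) to log."""
    def tracer(frame, event, arg):
        if event == "line":
            log.append((frame.f_code.co_name, frame.f_lineno))
        # Returning the tracer keeps line tracing active inside this frame.
        return tracer
    return tracer

def traced_call(func, *args, **kwargs):
    """Run func under a line tracer and return (result, log)."""
    log = []
    previous = sys.gettrace()
    sys.settrace(make_line_tracer(log))
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever trace hook was installed before.
        sys.settrace(previous)
    return result, log

def sample(n):
    # Hypothetical function to trace; stands in for the code that hangs.
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    result, log = traced_call(sample, 3)
    for name, lineno in log:
        print("%s:%d" % (name, lineno))
```

If the process hangs, the last line recorded this way shows roughly where execution stopped, at the cost of a large slowdown.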
>Is it possible that feedparser is hanging on trying to parse that feed?
>Perhaps try putting print statements before and after the
>feedparser.parse call.
>
>
>
Maybe the problem is there, but that wouldn't answer the other question:
"Why does it take at most 30 seconds to parse all the remaining 350 feeds?"
There is no network activity after the unlocking "Ctrl+C"...
Gotta investigate then.
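
One way to do the "print before and after" check suggested above is a small timing wrapper. This is only a sketch for current Python 3: timed_call and fake_parse are hypothetical names, and fake_parse stands in for feedparser.parse so the example runs without the library installed.

```python
import time

def timed_call(label, func, *args, **kwargs):
    """Call func, printing a line before and after; return (result, seconds)."""
    print("start %s" % label, flush=True)
    started = time.monotonic()
    try:
        result = func(*args, **kwargs)
    finally:
        # Printed even if func raises, so a missing "done" line means a hang.
        elapsed = time.monotonic() - started
        print("done  %s after %.3fs" % (label, elapsed), flush=True)
    return result, elapsed

def fake_parse(url):
    # Hypothetical stand-in for feedparser.parse(url).
    return {"url": url, "entries": []}

if __name__ == "__main__":
    timed_call("example feed", fake_parse, "http://example.invalid/rss")
```

A "start" line with no matching "done" line would point at the parse call for that feed; if every pair completes quickly, the hang is somewhere else.
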
>>>You should be able to test this theory by installing Twisted's resolver:
>>>
>>> from twisted.internet import reactor
>>> from twisted.names import client
>>> reactor.installResolver(client.createResolver())
>>>
>>>client.createResolver makes a reasonable effort to use your system's DNS
>>>configuration (by looking at /etc/resolv.conf on posix systems, for
>>>example), so it should work without any special arguments.
>>>
>>>
>>>
>>ok, it changes into a totally non-working script :)
>>
>>I get a lot of these:
>>[Failure instance: Traceback: exceptions.TypeError, unsubscriptable object
>>/usr/lib/python2.3/site-packages/twisted/internet/defer.py:313:_runCallbacks
>>/usr/lib/python2.3/site-packages/twisted/names/resolve.py:44:__call__
>>/usr/lib/python2.3/site-packages/twisted/names/common.py:36:query
>>/usr/lib/python2.3/site-packages/twisted/names/common.py:104:lookupAllRecords
>>/usr/lib/python2.3/site-packages/twisted/names/client.py:266:_lookup
>>/usr/lib/python2.3/site-packages/twisted/names/client.py:214:queryUDP
>>]
>>
>>
>
>Ouch. I wonder how that bug crept in? The twisted.names code is expecting a
>sequence of timeouts (to re-issue the query with, until failing at last), but
>twisted.internet is only giving it a single integer. I've filed a bug
>report for this: http://twistedmatrix.com/bugs/issue570, if you care :)
>
>
Sure :) This is the second bug for me; the first one was a
documentation bug (the finger tutorial has some errors) :).
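
The mismatch described above (a sequence of timeouts expected, a single integer supplied) can be illustrated with a short sketch. This is not the twisted.names code and not the actual fix; normalize_timeouts and query_with_retries are hypothetical names, and the example is written for current Python 3.

```python
def normalize_timeouts(timeout):
    """Return a tuple of timeouts whether given one number or a sequence."""
    if isinstance(timeout, (int, float)):
        return (timeout,)
    return tuple(timeout)

def query_with_retries(send, timeouts):
    """Call send(t) for each timeout in turn; return the first non-None answer."""
    for t in normalize_timeouts(timeouts):
        answer = send(t)
        if answer is not None:
            return answer
    raise TimeoutError("no answer after all timeouts")

if __name__ == "__main__":
    # Indexing a bare integer is what produces the TypeError in the traceback.
    try:
        (10)[0]
    except TypeError as exc:
        print("TypeError:", exc)

    # Hypothetical sender that only answers when given at least 3 seconds.
    def slow_server(t):
        return "answer" if t >= 3 else None

    print(query_with_retries(slow_server, [1, 3, 11]))
    print(query_with_retries(slow_server, 10))
```

Accepting both forms at the boundary is one possible design; the alternative is to require the caller always to pass a sequence.
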
>Absolutely. I've heard similar complaints about straw, and I've been hoping
>some keen person would apply Twisted to fix the problem :)
>
>
That was my hope too, but since a friend of mine asked for an
rss-aggregator made with twisted... I realized that someone
wants me to be that keen person. Oooohhhh, what does fate have
in store for me? Ooooooohhhhh :P
--
Valentino Volonghi aka Dialtone
Linux User #310274, Gentoo Proud User
X Python Newsreader developer
http://sourceforge.net/projects/xpn/