[Twisted-Python] etag and last-modified

Jeff Grimmett grimmtooth at gmail.com
Fri Nov 5 10:19:03 EST 2004

On Fri, 05 Nov 2004 10:19:40 +0100, Jacob Friis <lists at debpro.webcom.dk> wrote:
> I have a script that downloads multiple rss/atom feeds via Feedparser.
> The script uses twisted.internet but the developer tells me there is no
> way to use etag and last-modified with twisted.

I have been working on this as well, and have given up for the time
being on integrating a Twisted-based 'connector' that urllib2 could
use, which would be the best way of doing it. If you do it that way,
the interface is transparent.

Problem is, the urllib2 documentation is very confusing. I know that
probably sounds weird on a twisted mailing list, but there you have
it. :-)

What I have now is that I use the classes in twisted.web.client to
pull the page down, then feed it to feedparser. That means that I have
to handle the headers and etag/last-modified stuff myself. But if you
look at the code for feedparser, it's not that complicated.  I do
regret having to duplicate code, but it can't be helped unless Mark
expands his interface a little.

And ideally, I'd prefer to pass a twisted connection to urllib2 as a
handler anyway.
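
To make the "handle the headers myself" part concrete, here is a minimal
sketch of the etag/last-modified bookkeeping. It is not the attached proof
of concept and it does no networking at all: it assumes you already have the
status code and response headers from whatever fetched the page. The names
FeedState, conditional_headers and handle_response are hypothetical, made up
for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FeedState:
    """Validators remembered from the last successful fetch (hypothetical)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(state: FeedState) -> Dict[str, str]:
    """Build the request headers that make the next fetch conditional."""
    headers = {}
    if state.etag:
        # Echo the server's ETag back so it can answer 304 Not Modified.
        headers["If-None-Match"] = state.etag
    if state.last_modified:
        # Echo the Last-Modified date back unchanged.
        headers["If-Modified-Since"] = state.last_modified
    return headers


def handle_response(state: FeedState, status: int,
                    response_headers: Dict[str, str]) -> bool:
    """Update the stored validators; return True if the body should be parsed."""
    if status == 304:
        # Nothing changed: keep the old validators and skip parsing.
        return False
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    state.etag = lowered.get("etag")
    state.last_modified = lowered.get("last-modified")
    return True
```

The idea is to keep one FeedState per feed URL, send conditional_headers()
with each request, and only hand the body to feedparser when
handle_response() returns True.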

I'm attaching a small proof of concept for the non-urllib2
implementation I've been playing around with. It's very basic.

> Instead I'll let Feedparser do the download and use twisted for threads.
> What is the maximum pool size I can use?

Screw that. Been there, done that, it sucks. I say again, IT SUCKS.
Did I mention it sucks? PC performance seems to degrade exponentially
as you fire off more and more feedparser-threads.  I've done it. Even
with a modest throttle setting of 15 simultaneous connections, my
system was chewing itself to bits. Granted this was Win32, but on the
other hand I've established many times that many connections through
the twisted interface, and seen virtually no indication that anything
was going on at all - system was smooth as glass.

So there's the thing. Do a little extra work, and make it work RIGHT,
or do a little extra work, and make it a bad user experience.

If you hate your users, select option #2.


