[Twisted-Python] twisted thumbnail server

Paul Wiseman poalman at gmail.com
Fri Nov 2 12:50:03 EDT 2012


On 2 November 2012 16:13, Phil Mayers <p.mayers at imperial.ac.uk> wrote:

> On 02/11/12 15:42, Paul Wiseman wrote:
> > I hope this will be an easy question for some of you guys :)
> >
> > I'm trying to set up a simple server which will accept requests over GET
> > to create a thumbnail for an image, and serve it back as the response.
> >
> > The images are stored in two S3 buckets, the originals are in one bucket
> > (store), and the generated thumbnails are stored in another (thumb) as a
> > cache so that the work doesn't need to be repeated.
> >
> > Currently I'm checking if the thumbnail already exists in the thumb
> > bucket. If it does, I redirect the request; if not, I download
> > the image from store, generate the thumb using PIL, upload the
> > thumbnail to the thumb bucket and then redirect the request.
>
> This isn't a criticism, but I trust you are aware of the implications
> and problems of doing work in threads?

I think so. I understand the whole idea of Twisted is to schedule tasks
asynchronously in a single main thread. I usually use quite a lot of threads
in my code and I'm just learning about this new async style of coding. I've
tried to avoid using threads as much as I can here, but didn't think I could
get away from them given that PIL is blocking.


> FWIW we usually use a child process pool for intensive tasks; this has
> the advantage you can sensibly kill a long-lived child (just kill the
> process) and you side-step the lack of concurrency in the python
> interpreter.
>
> [In this case, I'd just start up a bunch of python interpreters using a
> ProcessProtocol and use a simple request/response command protocol on
> stdin/stdout - the child interpreters can be non-Twisted processes able
> to block on PIL operations]

I'd love to get this working using a process pool rather than a thread
pool (I spent quite a lot of time trying to figure out how to do it in
Twisted but haven't yet worked out how). As this server will be CPU bound,
this will hopefully give more throughput and also side-step the memory leak
in PIL that I believe I'm seeing (although on second thought maybe not, as
the overhead of starting a new Python interpreter each time wouldn't be
viable).

Are there any examples of how to use a process pool?
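
For concreteness, here is a minimal sketch of what the child side of such a
pool might look like, going by Phil's description: a plain (non-Twisted)
worker that reads one request per line on stdin and writes one reply per
line on stdout. The line protocol ("OK ..." / "ERR ...") and the names
handle_request, make_thumbnail and serve are made up for illustration;
make_thumbnail is only a stand-in for the real download / PIL / upload work.
The parent would spawn several of these (e.g. with reactor.spawnProcess and
a ProcessProtocol) and hand each request to an idle child.

```python
import sys


def make_thumbnail(key):
    # Hypothetical stand-in for the blocking work
    # (download from store, resize with PIL, upload to thumb).
    return "thumb/" + key


def handle_request(line):
    """Turn one request line into one reply line (assumed protocol)."""
    key = line.strip()
    if not key:
        return "ERR empty request"
    try:
        result = make_thumbnail(key)
    except Exception as exc:
        # Report the failure to the parent instead of dying.
        return "ERR %s" % (exc,)
    return "OK %s" % (result,)


def serve(infile, outfile):
    """Answer requests until the parent closes our input."""
    for line in iter(infile.readline, ""):
        outfile.write(handle_request(line) + "\n")
        # Flush after every reply so the parent is not left waiting
        # on a buffered pipe.
        outfile.flush()


# In the child script itself this would be started with:
#     serve(sys.stdin, sys.stdout)
```

Because each child stays alive and serves many requests, the interpreter
start-up cost would be paid once per child rather than once per thumbnail.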


> If you really do want threads, is there any reason to not use the
> Twisted threadpool stuff?
>
>
I wasn't aware of a Twisted thread pool; I think I've only come across
deferToThread, which I imagined was using a thread pool. (If so, is there a
way to control the size?)


> It's often a personal/style choice, but I don't use StringIO for large
> volumes of data personally (not Twisted-specific).
>
>
I rarely use it either, but I needed a way to get the data into a file-like
object for PIL without writing the data to disk. How would you recommend I
get the data between download / PIL / upload?
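
For reference, this is roughly what the in-memory hand-off looks like,
written here for Python 3 with Pillow (the maintained PIL fork), where
io.BytesIO plays the role that StringIO plays on Python 2. thumbnail_bytes
is a hypothetical helper, and the 128x128 bound and PNG output are arbitrary
choices:

```python
from io import BytesIO

from PIL import Image


def thumbnail_bytes(data, max_size=(128, 128), fmt="PNG"):
    """Return thumbnail bytes for the image bytes in `data`, all in memory."""
    # Wrap the downloaded bytes in a file-like object that PIL can read.
    img = Image.open(BytesIO(data))
    # thumbnail() resizes in place and preserves the aspect ratio.
    img.thumbnail(max_size)
    # Write the result into a second in-memory buffer, ready for upload.
    out = BytesIO()
    img.save(out, format=fmt)
    return out.getvalue()
```

Both the original and the thumbnail are held in memory at once, so this only
suits images that comfortably fit in RAM.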


> I'm sure someone will mention tests ;o)

Tests are something I've often, guiltily, neglected. Maybe I should start
:) I wouldn't really know where to start writing tests for anything more
than trivial functions. I guess I'd just have images and thumbnails that I
know are correct for a given request, send the images to the server and see
if it returns a thumbnail that matches? Something along those lines?

