[Twisted-Python] How to use ampoule?

Fri Feb 20 18:07:58 EST 2009

On Feb 20, 2009, at 2:08 AM, Chris wrote:

> Hi everyone,
> I am using Twisted to build a server, and a computing request may
> cost a lot of CPU resources. I have checked the mailing list, and it
> seems I can use the ampoule plugin to create another process. I have
> checked the website and downloaded the source code, but there seems to
> be no documentation for it. I did find an examples directory, but after
> checking the code I was confused. There should be two files, one for
> the client and one for the server, am I right? The client will send the
> request to the server (could it be on another machine?), and the server
> responds. But I can't find anything that matches what I thought. Can
> anyone explain a little bit to me? Or, if there is some code, that
> would be better.

I don't have much time to write documentation. I spend most of my time
documenting the code itself and testing it, and I attached some examples
that one can use to learn the basics on one's own, but...

You don't need 2 files; you need the code you want to run to be available
on both sides of the connection. Ampoule doesn't support shipping functions
to the child processes. One might implement that on top of what ampoule
offers, but I wouldn't recommend it.

Ampoule uses AMP as the communication protocol between the caller and the
process pool. This means that they talk to each other using
twisted.protocols.amp and that, for the sake of abstraction, they work as
2 separate networked services that make RPC calls to each other. This also
means that in order to develop a process pool you will use AMP abstractions
and classes on top of what ampoule offers by default.

Let's look at the simplest example: examples/pid.py

What you need is a set of commands that a child process should be able
to answer. In pid.py this set is made of just a single command, which is:

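from twisted.protocols import amp

class Pid(amp.Command):
    # Takes no arguments and returns the worker's process id.
    # Names are byte strings, as AMP requires on Python 3.
    response = [(b"pid", amp.Integer())]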

Once you define a command that you want to be able to run in a process
pool, you need to define a child process that is able to answer the Pid
command. Defining this also defines what every worker in the process
pool will be able to answer.

In the example this is done with the following lines:

from ampoule import child

class MyChild(child.AMPChild):
    @Pid.responder
    def pid(self):
        import os
        return {"pid": os.getpid()}

We define a child of the process pool called MyChild and using AMP
machinery we set the method pid as the responder for the command
Pid. Unsurprisingly this command gets the pid and returns it.

Now we need to run this code and use it. So far we have defined
the server (ProcessPool) side of things.

[NOTE: To run everything in a single file started with "python filename.py"
we need to hack around Python's import system. This is the only reason
util.mainpoint exists: it allows the script to import itself. You'll notice
that the name of a script started with "python filename.py" is not
filename but __main__.]
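
To make the __main__ issue concrete, here is a minimal sketch of the naming
problem. It is not how util.mainpoint is implemented; importable_module_name
is a hypothetical helper that only shows which module name another process
would have to import to find classes defined in a script:

```python
import os


def importable_module_name(module_name, script_path):
    """Return a module name that another process could import.

    Hypothetical helper: a script started with "python filename.py" is
    registered as __main__, so its real name is derived from the file name.
    """
    if module_name != "__main__":
        # Already imported under its real name, nothing to fix.
        return module_name
    base = os.path.basename(script_path)
    return os.path.splitext(base)[0]
```

Inside a script one would call it as importable_module_name(__name__,
__file__). For pid.py started from the command line this gives "pid", which
is the name a freshly started child process can actually import.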

Here's the code that starts the pool and uses it:

from ampoule import util

@util.mainpoint
def main(args):
    import sys
    from twisted.internet import reactor, defer
    from twisted.python import log
    log.startLogging(sys.stdout)

    from ampoule import pool

    @defer.inlineCallbacks
    def _run():
        # A pool with exactly one worker; every worker speaks MyChild.
        pp = pool.ProcessPool(MyChild, min=1, max=1)
        yield pp.start()
        # Ask a worker to run the Pid command and wait for its answer.
        result = yield pp.doWork(Pid)
        print("The Child process PID is: %s" % result['pid'])
        yield pp.stop()
        reactor.stop()

    reactor.callLater(1, _run)
    reactor.run()

Apart from the standard Twisted imports and boilerplate code,
the core of the client is inside the _run function.

It creates the process pool, telling it to use MyChild as the specification
for its children, and sets both the minimum and the maximum size of the
pool to 1. The code then starts the pool, and once it's started we can
submit commands. This is done with:

result = yield pp.doWork(Pid)

or

result = yield pp.callRemote(Pid)

The script then prints the result, stops the pool and exits.

Strictly speaking, the client of the process pool only needs to know
the commands that it wants to execute, and which pool (if there
is more than one) can execute them.

This, however, is only true when the server is started separately from
the client using the "twistd ampoule" plugin. If your client starts the
pool by itself, it also needs to know the class that defines the children's
protocol.

The twistd ampoule plugin does nothing more than take a child class and a
parent class (the parent class defines what the master of the process
pool can speak, so that children can make calls against it if needed)
and, using some hacks, expose the exact same interface across the
network using the parameters that you pass on the command line.

> And the Twisted server will receive a lot of binary data from the
> client. If I use ampoule, I have to send the same data to the ampoule
> server again (now the Twisted server acts as the ampoule client). Is the

Correct. I don't see any other way to do this except maybe if you have
a distributed filesystem, in which case you might want to just save
the data on the server and then make the right calls on the remote
pools that have access to this distributed filesystem so that they can
process the shards that they have access to.
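
As a rough illustration of that idea, here is a minimal sketch in which the
server saves the payload to a shared directory and only the file path travels
in the command. The names stash_payload and load_payload are hypothetical,
and the sketch assumes shared_dir is a directory that both the server and the
workers can reach:

```python
import os
import tempfile


def stash_payload(data, shared_dir):
    """Server side: write data to a unique file and return its path."""
    fd, path = tempfile.mkstemp(dir=shared_dir, suffix=".bin")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


def load_payload(path, remove=True):
    """Worker side: read the payload back, deleting the file by default."""
    with open(path, "rb") as handle:
        data = handle.read()
    if remove:
        os.remove(path)
    return data
```

The path returned by stash_payload is what you would pass as the command
argument, and the responder in the child would call load_payload on it
before doing the actual work.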

> ampoule plugin suitable for this kind of task? Or should I just use the
> Twisted process module?

Using the process module is not really different. The only issue you
might find is the limit AMP imposes on the data that can be transmitted
in a single call to a given child: each argument value can be at most
64KB (65,535 bytes). So for example

pool.callRemote(Command, argument="I'm a bit string of more than 64KB")

won't work but:

pool.callRemote(Command, argument="I'm a bit string of LESS than 64KB")

will. There are multiple ways to solve this issue, but essentially
I wouldn't use ampoule to transport large quantities of data. You'd be
better off using a distributed filesystem, or an HTTP server where the
server stores the data and from which each worker grabs it.
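
If you do want to push the data through AMP anyway, one of those ways is to
split it into pieces that each fit under the limit and send one piece per
call. This is only a sketch under my own assumptions, not something ampoule
provides: split_payload and join_payload are hypothetical helpers, and the
60000 byte chunk size is an arbitrary choice below the 65,535 byte maximum:

```python
AMP_MAX_VALUE = 65535  # largest single value AMP will carry


def split_payload(data, chunk_size=60000):
    """Yield (index, total, chunk) tuples, each chunk small enough for AMP."""
    if chunk_size <= 0 or chunk_size > AMP_MAX_VALUE:
        raise ValueError("chunk_size must be between 1 and %d" % AMP_MAX_VALUE)
    # Ceiling division; an empty payload still produces one empty chunk.
    total = max(1, -(-len(data) // chunk_size))
    for index in range(total):
        start = index * chunk_size
        yield index, total, data[start:start + chunk_size]


def join_payload(parts):
    """Rebuild the payload from (index, total, chunk) tuples in any order."""
    ordered = sorted(parts, key=lambda part: part[0])
    return b"".join(chunk for _, _, chunk in ordered)
```

Each tuple would become the arguments of one command call, and the child
would call join_payload once it has received all the pieces. Keep in mind
that every piece still crosses the process boundary, so this fixes the size
limit but not the cost of moving the data.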

> I don't know whether my understanding is correct or not, please correct me.

It seems you got it right.

HTH

--
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com
