[Twisted-Python] Multicast XMLRPC
p.mayers at imperial.ac.uk
Sat Aug 26 07:39:20 EDT 2006
> Now let me address the issue of TCP. It is a pretty heavy protocol to
> use. It takes a lot of resources on the sender and target and can take
> some time to establish a connection. Opening a 1000 or more sockets
> consumes a lot of resources in the underlying OS and in the Twisted client!
People keep trying to help you, and you keep repeating yourself. From
what I can gather:
You *need* a relatively lightweight group communication method. My
advice would be to investigate a message bus - see recent posts on this
mailing list. "Spread" at www.spread.org and ActiveMQ (via the simple
text-over-tcp-based STOMP protocol). Reports are that both can (under
the right conditions) execute many thousands of group messages per second.
Failing that, Glyph has hinted at another approach. You could elect a
small number (~1%) of your nodes as "proxies" so that as well as being
clients, they act as intermediaries for messages. This is a simple form
of overlay network, which you also stated you didn't want to use - lord
knows why. People use these techniques for a reason - they work.
You *want* (have decided you want) a reliable multicast protocol over
which you'll layer a simple RPC protocol. RMT (reliable multicast
transport) is as yet an unsolved problem. It is VERY VERY hard. None
exist for Twisted, to the best of my knowledge. I would be willing to
bet money that, for "thousands" of nodes, the overhead of implementing
such a protocol (in Python, one presumes) would exceed the overhead of
just using TCP. If you had said "hundreds of thousands" of nodes, well,
that would be different.
If you want to knock an RMT up based on the assumption you won't drop
packets, then be my guest, but I would suggest that if you *really*
believe multicast is that reliable, then your experience of IP multicast
networks has been a lot more rosy than mine, and I run a very large one.
"reliable multicast" into google would be a good start - there are some
good RFCs produced the the rmt IETF working group.
> If I use TCP and stick to the serial, synchronized semantics of RPC,
> doing one call at a time, I have only a few ways to solve the problem.
> Do one call at a time, repeat N times, and that could take quite a
> while. I could do M spawnProcesses and have each do N/M RPC calls. Or I
> could use M threads and do it that way. Granted I have M sockets open at
> a time, it is possible for this to take quite a while to execute.
> Performance would be terrible (and yes I want an approach that has good
> to very good performance. After all who would want poor to terrible
Knuth and his comments on early optimisation apply here. Have you tried
it? You might be surprised.
I have some Twisted code that does SNMP to over a thousand devices. This
is, obviously, unicast UDP. The throughput is very high. A simple
ACK-based sequence-numbered UDP unicast will very likely scale to
thousands of nodes.
> So I divided the problem down to two parts. One, can I reduce the amount
> of traffic on the invoking side of the RPC request? Second, is how to
> deal with the response. Obviously I have to deal with the issue of
> failure, since RPC semantics require EXACTLY-ONCE.
How many calls per second are you doing, and approximately what volume
of data will each call exchange?
You seem inflexible about aspects of the design. If if were me, I'd
abandon RPC semantics. Smarter people than anyone here have argued
convincingly against making a remote procedure call look anything like a
local one, and once you abandon *that*, RPCs look like message exchanges.
> That gets me to the multicast or broadcast scheme. In one call I could
> get the N processors to start working. Now I just have to solve the
> other half of the problem: how to get the answers returned without
> swamping the network or how to detect when I didn't get an answer from a
> processor at all.
> That leads me to the observation that on an uncongested ethernet I
> almost always have a successful transmission. This means I have to deal
Successful transmission is really the easy bit for multicast. There is
IGMP snooping, IGMP querier misbehaviour, loss of forwarding on an
upstream IGP flap, flooding issues due to global MSDP issues, and so forth.
> with that issue and a few others. Why do I care? Because I believe I can
> accomplish what I need - get great performance most of the time, and
> only in a few instances have to deal with do the operation over again.
> This is a tough problem to solve. I am not sure of the outcome but I am
> sure that I need to start somewhere. What I know is that it is partly
> transport and partly marshalling. The semantics of the call have to stay
> fixed: EXACTLY-ONCE.
If you MUST have EXACTLY-ONCE group communication semantics, you should
use a message bus.
More information about the Twisted-Python