[Twisted-Python] Multicast XMLRPC

Sat Aug 26 09:08:49 EDT 2006

Phil Mayers wrote:
> Chaz. wrote:
>>
>> Now let me address the issue of TCP. It is a pretty heavy protocol to 
>> use. It takes a lot of resources on the sender and target and can take 
>> some time to establish a connection. Opening 1000 or more sockets 
>> consumes a lot of resources in the underlying OS and in the Twisted 
>> client!
> 
> People keep trying to help you, and you keep repeating yourself. From 
> what I can gather:
> 
> You *need* a relatively lightweight group communication method. My 
> advice would be to investigate a message bus - see recent posts on this 
> mailing list. "Spread" at www.spread.org and ActiveMQ (via the simple 
> text-over-tcp-based STOMP protocol). Reports are that both can (under 
> the right conditions) execute many thousands of group messages per second.
> 

I started out using Spread some time ago (more than 2 years ago). The 
implementation was limited to a hundred or so nodes (that is in the 
notes on the spread implementation). Secondly it isn't quite so 
lightweight as you think (I've measured the performance).

It is a very nice system but when it gets to 1000s of machines very 
little work has been done on solving many of the problems. My research 
on it goes back almost a decade starting out with Horus.

> Failing that, Glyph has hinted at another approach. You could elect a 
> small number (~1%) of your nodes as "proxies" so that as well as being 
> clients, they act as intermediaries for messages. This is a simple form 
> of overlay network, which you also stated you didn't want to use - lord 
> knows why. People use these techniques for a reason - they work.
> 

I know about overlay networks, gossip networks, etc. I have used both 
and would prefer something simpler. That is the reason for my pushing on 
this group - to see what ideas people might have. I appreciate Glyph's 
comments and perspectives - very refreshing - in contrast to the many I 
have gotten.
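
For reference, the proxy idea quoted above can be sketched in a few lines. This is only an illustration under my own assumptions: the function name plan_fanout, the 1% default, and the round-robin assignment are hypothetical and are not taken from Glyph's or Phil's messages.

```python
def plan_fanout(nodes, proxy_fraction=0.01):
    """Pick a small share of nodes as proxies and spread the rest among them.

    Returns a dict mapping each proxy to the list of nodes it relays to.
    The selection rule (first N nodes) is an arbitrary placeholder.
    """
    nodes = list(nodes)
    count = max(1, round(len(nodes) * proxy_fraction))
    proxies, rest = nodes[:count], nodes[count:]
    plan = {proxy: [] for proxy in proxies}
    # Round-robin keeps the relay load per proxy roughly even.
    for i, node in enumerate(rest):
        plan[proxies[i % count]].append(node)
    return plan
```

With 1000 nodes the sender would talk to 10 proxies, and each proxy would relay to 99 others. It does nothing about proxy failure, which is the part that makes overlays less simple than I would like.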

> You *want* (have decided you want) a reliable multicast protocol over 
> which you'll layer a simple RPC protocol. RMT (reliable multicast 
> transport) is as yet an unsolved problem. It is VERY VERY hard. None 
> exist for Twisted, to the best of my knowledge. I would be willing to 
> bet money that, for "thousands" of nodes, the overhead of implementing 
> such a protocol (in Python, one presumes) would exceed the overhead of 
> just using TCP. If you had said "hundreds of thousands" of nodes, well, 
> that would be different.
> 
> If you want to knock an RMT up based on the assumption you won't drop 
> packets, then be my guest, but I would suggest that if you *really* 
> believe multicast is that reliable, then your experience of IP multicast 
> networks has been a lot more rosy than mine, and I run a very large one.
> 
> "reliable multicast" into google would be a good start - there are some 
> good RFCs produced by the rmt IETF working group.
> 

Actually I am part of the IRTF group on P2P, E2E and SAM. I know the 
approaches that are being tossed about. I have tried to implement some 
of them. I just am not of the opinion that smart people can't find 
solutions to tough problems.

Is multicast or broadcast the right way? I don't know, but I do know 
that without trying we will never know. Having been part of the IETF 
community for a lot of years (I was part of the group that worked on 
SNMP v1 and the WinSock standard), I know that when the "pedal meets the 
metal" sometimes you discover interesting things.

>>
>> If I use TCP and stick to the serial, synchronized semantics of RPC, 
>> doing one call at a time, I have only a few ways to solve the problem. 
>> Do one call at a time, repeat N times, and that could take quite a 
>> while. I could do M spawnProcesses and have each do N/M RPC calls. Or 
>> I could use M threads and do it that way. Granted I have M sockets 
>> open at a time, it is possible for this to take quite a while to 
>> execute. Performance would be terrible (and yes I want an approach 
>> that has good to very good performance. After all who would want poor 
>> to terrible performance?)
> 
> Knuth and his comments on early optimisation apply here. Have you tried 
> it? You might be surprised.
>

I am sorry to say I don't know the paper or research you are referring 
to. Can you point me to some references?

> I have some Twisted code that does SNMP to over a thousand devices. This 
> is, obviously, unicast UDP. The throughput is very high. A simple 
> ACK-based sequence-numbered UDP unicast will very likely scale to 
> thousands of nodes.
>

Thanks for the information. This is what makes me think that I want 
something based on UDP and not TCP! And if I can do RMT (or some variant 
of it) I might be able to get better performance. But, as I said, that is 
the nice thing about not having someone telling me I need to get a 
product out the door tomorrow! I have time to experiment and learn.
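
To make the "ACK-based sequence-numbered UDP unicast" idea concrete, here is a minimal sketch of the bookkeeping it would need. It is not Phil's SNMP code. The class name AckTracker, the 4-byte header, and the timeout and retry values are all my own assumptions. The caller is expected to own the socket (or a Twisted DatagramProtocol) and pass in the current time.

```python
import struct

HEADER = struct.Struct("!I")  # assumed wire format: 4-byte sequence number


class AckTracker:
    """Tracks unacknowledged datagrams for many peers sharing one UDP socket."""

    def __init__(self, retransmit_after=0.5, max_tries=3):
        self.retransmit_after = retransmit_after
        self.max_tries = max_tries
        self.next_seq = 0
        self.pending = {}  # (peer, seq) -> [packet, tries, last_sent]
        self.failed = []   # (peer, seq) pairs that exhausted their tries

    def send(self, peer, payload, now):
        """Register a new datagram; the caller writes the returned packet."""
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % 2**32
        packet = HEADER.pack(seq) + payload
        self.pending[(peer, seq)] = [packet, 1, now]
        return seq, packet

    def ack(self, peer, seq):
        """Handle an ACK; returns False for duplicate or unknown ACKs."""
        return self.pending.pop((peer, seq), None) is not None

    def due(self, now):
        """Return (peer, packet) pairs to retransmit; give up after max_tries."""
        resend = []
        for key, entry in list(self.pending.items()):
            packet, tries, last_sent = entry
            if now - last_sent < self.retransmit_after:
                continue
            if tries >= self.max_tries:
                # No answer at all from this peer: record it as failed.
                del self.pending[key]
                self.failed.append(key)
            else:
                entry[1], entry[2] = tries + 1, now
                resend.append((key[0], packet))
        return resend
```

Calling due() from a periodic timer would give both retransmission and a list of peers that never answered. Nothing here depends on whether the first transmission went out as unicast or multicast.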

>>
>> So I divided the problem down to two parts. One, can I reduce the 
>> amount of traffic on the invoking side of the RPC request? Second, is 
>> how to deal with the response. Obviously I have to deal with the issue 
>> of failure, since RPC semantics require EXACTLY-ONCE.
> 
> How many calls per second are you doing, and approximately what volume 
> of data will each call exchange?
> 
This is information I can't provide since the system I am designing 
has no equivalent in the marketplace today (either commercial or open 
source). All I know is that the first version of the system I built - 
using C/C++ and a traditional architecture (a few dozen machines) - 
was able to handle 200 transactions/minute (using SOAP). While there 
were some "short messages" (less than a normal MTU), I had quite a few 
that topped out at 50K bytes and some up to 100Mbytes.

Doing some research I have been told to expect a great many short ones 
and many very long ones; sort of an inverted bell curve. But there are 
very few real statistics. As I said I have to put a stake in the ground 
and build something so I am guessing where the problems might rest and 
trying to find some solutions for them. Hence my query.

> You seem inflexible about aspects of the design. If it were me, I'd 
> abandon RPC semantics. Smarter people than anyone here have argued 
> convincingly against making a remote procedure call look anything like a 
> local one, and once you abandon *that*, RPCs look like message exchanges.
>

I agree. I am not sure where the answer lies. I like Twisted because it 
affords a nice way to experiment with different mechanisms both at the 
transport and the semantic layer. I am looking for ideas! As I said I 
have the time and inclination to experiment. What I need are things that 
aren't obvious (because I haven't heard of them or thought of them).

>>
>> That gets me to the multicast or broadcast scheme. In one call I could 
>> get the N processors to start working. Now I just have to solve the 
>> other half of the problem: how to get the answers returned without 
>> swamping the network or how to detect when I didn't get an answer from 
>> a processor at all.
>>
>> That leads me to the observation that on an uncongested ethernet I 
>> almost always have a successful transmission. This means I have to deal 
> 
> Successful transmission is really the easy bit for multicast. There is 
> IGMP snooping, IGMP querier misbehaviour, loss of forwarding on an 
> upstream IGP flap, flooding issues due to global MSDP issues, and so forth.
> 

I agree about the successful transmission. You've lost me on the IGMP 
part. Can you elaborate as to your thoughts?

>> with that issue and a few others. Why do I care? Because I believe I 
>> can accomplish what I need - get great performance most of the time, 
>> and only in a few instances have to deal with doing the operation over 
>> again.
>>
>> This is a tough problem to solve. I am not sure of the outcome but I 
>> am sure that I need to start somewhere. What I know is that it is 
>> partly transport and partly marshalling. The semantics of the call 
>> have to stay fixed: EXACTLY-ONCE.
> 
> If you MUST have EXACTLY-ONCE group communication semantics, you should 
> use a message bus.
> 

I do know I need EXACTLY-ONCE semantics but how and where I implement 
them is the unknown. When you use TCP you assume the network provides 
the bulk of the solution. I have been thinking that if I use a less 
reliable network - one with low overhead - that I can provide the server 
part to do the EXACTLY-ONCE piece.

As to why I need EXACTLY-ONCE, well if I have to store something I know 
I absolutely need to store it. I can't be in the position that I don't 
know it has been stored - it must be there.
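
One common way to approximate EXACTLY-ONCE on the server side is to let the client retransmit freely and have the server suppress duplicates by request ID. The sketch below assumes that approach. The name DedupServer and the in-memory dict are hypothetical; a real store operation would need the reply cache to survive a restart and to be pruned eventually.

```python
class DedupServer:
    """Runs the handler at most once per request ID and replays cached replies."""

    def __init__(self, handler):
        self.handler = handler
        self.replies = {}  # request_id -> reply already sent for that request

    def handle(self, request_id, payload):
        # A retransmitted request finds its ID in the cache, so the handler
        # (for example, the store operation) is not executed a second time.
        if request_id not in self.replies:
            self.replies[request_id] = self.handler(payload)
        return self.replies[request_id]


def make_store():
    """Tiny stand-in for a storage backend, used only for illustration."""
    stored = []

    def store(item):
        stored.append(item)
        return len(stored)

    return stored, store
```

If the client keeps retrying until it sees a reply, the item ends up stored once no matter how many copies of the request or the reply the network drops.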

Thanks for the great remarks....I look forward to reading more.

Chaz