[Twisted-Python] pushing out same message on 100k TCPs

Phil Mayers p.mayers at imperial.ac.uk
Fri Feb 10 12:49:26 EST 2012


On 10/02/12 16:56, Tobias Oberstein wrote:
> Hi there,
>
> what is the most efficient/performant way of doing the following?
>
> I have a short message prepared .. say a string of 100 octets.
> I want to push out that _same_ string on 100k connected TCPs (on a server).
>
> ==
>
> My thinking was: ideally, the 100 bytes would be transferred to kernel/NIC space
> just _once_, and then the kernel is only told to resend that buffer on the 100k
> connected TCPs.
>
> Does that make sense, is that even possible, with Twisted, or in general?

Not really, no.

The problem is that TCP requires the sender to buffer data until it is 
acknowledged, so that it can retransmit. The only way to keep a single 
copy of the data whilst doing this would be to store each socket buffer 
as a (fairly complex) linked list of reference-counted blocks, and use 
scatter-gather IO to the network card.

So the kernel would have to copy the data 100k times anyway, to store it 
in the per-socket buffer until it was ACKed, or maintain a large and 
complex data structure so that it could use one copy.

Therefore, by moving the work to the kernel, all you've done is consume 
valuable kernel memory, in return for saving the syscall overhead. 
Classic space/time tradeoff.
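
To make the per-socket copy concrete, here is a minimal sketch of the
ordinary user-space approach, using only the Python standard library.
The helper name broadcast() is hypothetical, and socket.socketpair() is
used purely as a stand-in for connected TCP sockets. In Twisted the
equivalent loop would call transport.write(message) on each connected
protocol's transport.

```python
import socket


def broadcast(message, connections):
    """Send the same bytes object on every connected socket.

    Only one copy of `message` exists in user space, but each sendall()
    is a separate syscall that copies the bytes into that socket's own
    kernel send buffer.
    """
    failed = []
    for conn in connections:
        try:
            conn.sendall(message)
        except OSError:
            # Remember broken connections so the caller can drop them.
            failed.append(conn)
    return failed


# Stand-ins for connected TCP sockets (assumption: 5 instead of 100k).
pairs = [socket.socketpair() for _ in range(5)]
servers = [a for a, _ in pairs]
clients = [b for _, b in pairs]

message = b"x" * 100
failed = broadcast(message, servers)
received = [c.recv(len(message)) for c in clients]

for a, b in pairs:
    a.close()
    b.close()
```

A real server with 100k connections would use non-blocking sockets
under an event loop rather than blocking sendall() calls, but the cost
is the same: one write and one kernel-side copy per connection.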

If you were using UDP, then in theory this might be possible, but there 
are no APIs that I know of, except for multicast (where you only send 
one copy of the data, and the network duplicates it).
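
For comparison, a rough sketch of a multicast sender with the standard
socket module. The group address, port and function names are
assumptions chosen for illustration, not anything from the original
question.

```python
import socket

# Hypothetical group and port, chosen for illustration only.
GROUP = "239.1.2.3"
PORT = 5007


def make_multicast_sender(ttl=1):
    """Create a UDP socket whose multicast datagrams carry the given TTL.

    A TTL of 1 keeps the datagrams on the local network segment.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_to_group(sock, message, group=GROUP, port=PORT):
    """Send one datagram; the network duplicates it for group members."""
    return sock.sendto(message, (group, port))
```

Calling send_to_group(make_multicast_sender(), b"x" * 100) hands a
single datagram to the kernel. Receivers have to join the group (for
example with IP_ADD_MEMBERSHIP), and being UDP there is no delivery
guarantee or retransmission.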

In short: this kind of thing seems easy and desirable, but it's actually 
really hard and not useful.


