[Twisted-Python] pushing out same message on 100k TCPs

Tobias Oberstein tobias.oberstein at tavendo.de
Sat Feb 11 07:37:38 EST 2012


> Now, as I understand it, sendfile() will perform zero-copy IO; since the contents
> of the file will undoubtedly be in the page cache, it should in theory DMA the
> data straight from the (single copy of the) data in RAM to the NIC buffers.
> 
> It should also handle refcounting for you - you unlink the filename after
> obtaining a descriptor, and close() the FD once you've called sendfile, and the
> kernel *should* in theory free the inode and page containing file data once all
> TCP ACKs have been received.
> 
> You'll still have to make 100k syscalls, and you may find the kernel chooses to
> copy the data anyway.

I see.

So using sendfile .. probably with the message stored as a file on RAMFS .. or using
the Linux syscalls you mention below, it _might_ be possible to avoid the copy
overhead, but not the per-client syscall / context-switching overhead .. ok.
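
For illustration, a minimal sketch of that idea (not Twisted code). It assumes
Python 3.3+ where os.sendfile() exists, a Linux-like sendfile(), and plain
blocking sockets; the function name is made up for this example. Whether the
kernel really avoids the copy is up to the kernel, and it is still one syscall
per client:

```python
import os
import tempfile


def broadcast_with_sendfile(message, socks):
    """Send the same bytes to every socket in `socks` using os.sendfile().

    Hypothetical helper: assumes blocking sockets. A non-blocking version
    would also have to handle BlockingIOError and resume at `offset` later.
    """
    total = 0
    # On POSIX, TemporaryFile has no directory entry once it is created,
    # so the data is released when the last descriptor is closed.
    with tempfile.TemporaryFile() as f:
        f.write(message)
        f.flush()
        in_fd = f.fileno()
        for sock in socks:
            offset = 0
            # One (or more) sendfile() call per client; passing an explicit
            # offset means the shared file position is never touched.
            while offset < len(message):
                sent = os.sendfile(sock.fileno(), in_fd, offset,
                                   len(message) - offset)
                if sent == 0:
                    break
                offset += sent
            total += offset
    return total
```

Pointing tempfile at a RAM-backed directory (the `dir=` argument) would
correspond to the RAMFS variant.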

> 
> However - AFAIK Twisted does not support sendfile(), and it can be tricky to
> make it work with non-blocking IO.

;(

Apart from that, we're on FreeBSD .. guess there are similar syscalls (maybe with
slightly different semantics) there also.

> 
> :o(
> 
> You may also want to look at the splice() vmsplice() and tee() syscalls added to
> recent Linux kernels. tee() in particular can copy data from pipe to pipe without
> consuming, so can be repeated multiple times. It may be possible to assemble
> something that will do this task efficiently from those building blocks, but the
> APIs aren't available in Twisted.

Thanks a lot! This is all very interesting .. from the "tee" man page:

"""
Though we talk of copying, actual copies are generally avoided. The kernel
does this by implementing a pipe buffer as a set of reference-counted pointers
to pages of kernel memory. The kernel creates "copies" of pages in a buffer by
creating new pointers (for the output buffer) referring to the pages, and
increasing the reference counts for the pages: only pointers are copied, not
the pages of the buffer.
"""

Which sounds a lot like what you described in your other reply about refcounting etc ..

For ref., these guys are talking about PACKET_MMAP

http://www.linuxquestions.org/questions/programming-9/vectored-write-to-many-sockets-with-tee-splice-915702/
http://dank.qemfd.net/dankwiki/index.php/Fast_UNIX_Servers

The former (very end of page) claims that it achieves zero-copy (which I get),
and also claims you could reduce context switch overhead for the case of
sending 1 msg to many clients .. though I can't see how that would be done.

> 
> >> and not useful.
> >
> > When using VM pages (_if_ that would be possible) and thus no data
> > duplication, then why not useful?
> 
> Sorry, I should have been more precise - it's probably not often useful.
> There are not very many applications where sending the same TCP stream to
> that many clients at the same time is helpful - realtime video/audio over TCP
> spring to mind, and typically those need to adapt to slow clients by dropping
> them to a lower rate i.e. not the same stream any more.
> 
> As Glyph has mentioned, encryption is also a factor in today's internet.
> 
> I'm kind of curious about what your application is!

The application is PubSub over WebSockets with massive numbers of clients ..

Application message payloads are short (<1k) and JSON/UTF-8. Those are then
framed into WebSocket messages (which basically means prepending a WS
frame header).
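
To make the framing step concrete, a rough sketch, assuming RFC 6455 framing
(server-to-client frames are unmasked, so the framed bytes are identical for
every subscriber and only need to be built once per message). The function
name is hypothetical:

```python
import struct


def build_text_frame(payload):
    """Prepend an unmasked, single-frame WebSocket text header to `payload`.

    Assumes RFC 6455 framing; `payload` must already be UTF-8 encoded bytes.
    """
    header = bytearray([0x81])  # FIN bit set, opcode 0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(n)                 # 7-bit length
    elif n < (1 << 16):
        header.append(126)               # 16-bit extended length follows
        header += struct.pack("!H", n)
    else:
        header.append(127)               # 64-bit extended length follows
        header += struct.pack("!Q", n)
    return bytes(header) + payload
```

For payloads under 1k the header is only 2 or 4 bytes, so e.g.
build_text_frame(b'{"topic": "foo"}') yields one byte string that can be
written as-is to every connected client.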
