[Twisted-Python] twisted TCP frame size

Thys Meintjes thys at quaint.co.za
Mon Feb 27 05:20:41 EST 2006


Hi,

I've recently written a threaded client-server application using a
custom message protocol. Each message is 49 bytes long and needs to
arrive at the client as soon as possible.

In the current app I send() each message as soon as all 49 bytes are
available, and I recv(49) bytes at the client side (buffering as
necessary). The threaded application thus always reads 49 bytes from the
network buffers as soon as it is possible to do so. This application
consumes about 9% CPU time at full throttle.
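
For illustration, the blocking read-with-buffering approach described above
could look roughly like the sketch below. It is not the code from the actual
application; the helper name recv_exact and the constant FRAME_SIZE are
made up for this example.

```python
import socket

FRAME_SIZE = 49  # message length stated above


def recv_exact(sock, size=FRAME_SIZE):
    """Read exactly `size` bytes from a blocking socket.

    Returns the bytes read, or None if the peer closed the connection
    before a full frame arrived. (Hypothetical helper, for illustration.)
    """
    chunks = []
    remaining = size
    while remaining > 0:
        # recv() may return fewer bytes than requested, so keep looping.
        chunk = sock.recv(remaining)
        if not chunk:
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Each call then yields one complete 49 byte message regardless of how the
bytes were split up on the wire.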

Because I already use the twisted.enterprise adbapi on the client side, I
decided to get rid of all the client threads and use Twisted for the TCP
stuff as well. I've written a small test script to determine basic loads
and performance:

-----------------
#! /usr/bin/env python

import psyco
psyco.full()

import array
import sys

from twisted.internet.protocol import Protocol,ReconnectingClientFactory
from twisted.internet import reactor

f = open("wakka", "wb")  # binary mode: the received frames are raw bytes

class TRAUReceiver(Protocol):

    def dataReceived(self, data):
        # Called with whatever bytes were read; not necessarily one frame.
        print len(data)
        f.write(data)
        # f.flush()

    def connectionMade(self):
        """Connection was made; send the sign-on message."""

        print "Signing on...",
        signon = array.array('B', [ord('T'), 1, 255, 1, 255])
        self.transport.write(signon.tostring())
        print "Done"


class TRAUClientFactory(ReconnectingClientFactory):

    def startedConnecting(self, connector):
        print 'Started to connect.'

    def buildProtocol(self, addr):
        print 'Connected.'
        print 'Resetting reconnection delay'
        self.resetDelay()
        return TRAUReceiver()

    def clientConnectionLost(self, connector, reason):
        print 'Lost connection.  Reason:', reason
        
        ReconnectingClientFactory.clientConnectionLost(self,
                                                       connector, reason)

    def clientConnectionFailed(self, connector, reason):
        print 'Connection failed. Reason:', reason
        ReconnectingClientFactory.clientConnectionFailed(self,
                                                         connector, reason)

reactor.connectTCP('localhost', 55555, TRAUClientFactory())
reactor.run()
------------------

Connecting the above to the server yielded somewhat surprising results:

The length of the data passed to dataReceived() varies between runs, from
49 bytes to multiples of 49 bytes. I understand this (I think), as the
Ethernet packet length sweet spot is about 1.3 kB and Protocol is probably
optimized around that. This does imply that I need a more elaborate
frame caching scheme on the client side than the one I currently have,
as another socket connection signals whether messages from the current
connection must be stored or discarded.
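
A minimal sketch of such a frame caching scheme, assuming only that every
message is exactly 49 bytes: incoming chunks of any length are accumulated
and cut into complete frames, with any partial frame kept for the next
chunk. The class name FrameBuffer and its feed() method are hypothetical
and are not part of Twisted.

```python
FRAME_SIZE = 49  # message length stated above


class FrameBuffer(object):
    """Accumulate stream data and hand back complete fixed-size frames.

    Hypothetical helper for illustration; not a Twisted class.
    """

    def __init__(self, frame_size=FRAME_SIZE):
        self.frame_size = frame_size
        self._pending = b""

    def feed(self, data):
        """Add a chunk of received bytes; return a list of whole frames."""
        self._pending += data
        size = self.frame_size
        count = len(self._pending) // size
        frames = [self._pending[i * size:(i + 1) * size]
                  for i in range(count)]
        # Keep the incomplete tail (if any) until more data arrives.
        self._pending = self._pending[count * size:]
        return frames
```

A dataReceived() implementation could pass its data argument to feed() and
then store or discard each returned frame individually, whatever the size
of the chunk it was called with.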

Surprisingly, the data length is reported as a continuous 49 bytes in at
least 4 out of 10 runs of the script; on other runs the output of the
print len(data) line looks something like this:

49
12691
49
7938
49

Every second dataReceived() callback is 49 bytes; the rest are multiples
of 49.

When the script reports all frames as 49 bytes, CPU consumption is ~33%;
when staggered it's ~4%. The dataReceived() call seems overly expensive
when compared with the 'raw' synchronous recv().

Paradoxically, padding the 49 byte message with 1000 pad bytes improves
the script's performance 8 times, probably due to the decrease in
dataReceived() calls.

So,
1) Is there a way to force dataReceived() to be called only when a certain
data length has been received?
2) Why is dataReceived() so expensive (if it is)?
3) Is Protocol the correct tree to bark up, or are there other ways to
handle small time-sensitive messages in Twisted?

Apologies for the longish post.

Regards
Thys





