[Twisted-Python] Memory leak//problem in twisted write procedures

Joshua Moore-Oliva josh at chatgris.com
Thu Sep 16 20:44:28 EDT 2004


>   Whether or not "leak" is the appropriate term for this is debatable.  

I agree there, hence my use of Memory leak//problem :)

> 

> > Now, reading through the source to fix this problem, the fastest solution (requiring the least change to the existing code) would be to splice//reduce the size of the 
> > dataBuffer after offset exceeds a certain number.
> > 
> 
>   It is the easiest change to make, but it leads to detrimental string copying behavior.
>   While this is a minor concern in the case where many small writes are being made while the buffer is large (because huge amounts of copying are already going on),
>   it is a noticeable slowdown in the more common case when the buffer is typically empty or almost empty.

I do not think my idea was properly communicated.

An interim change along these lines could, I think, work quite well.

From the abstract.py file:
def doWrite(self):

    ...
    # If there is nothing left to send,
    if self.offset == len(self.dataBuffer):
        self.dataBuffer = ""
        self.offset = 0
        ...
    elif self.offset > 1000000:  # This would be best pre-defined somewhere
        # Drop the bytes that have already been sent
        self.dataBuffer = self.dataBuffer[self.offset:]
        self.offset = 0

    return result

Note that these changes were written in 'real time' while composing this email; there is no guarantee that they will even let Python start up without syntax errors :)

However, notice the check for offset > a particular value.

This would prevent performance degradation for the more common case where the buffer is typically empty or almost empty, as the string would only be 
reduced in the event that too much memory was being wasted (even 10MB would be a good limit).
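
To make the threshold idea concrete, here is a minimal, self-contained sketch. It is not the Twisted code: the class WriteBuffer, the write() method, the send callable and COMPACT_THRESHOLD are all hypothetical names invented for illustration, and it assumes Python 3 bytes rather than the plain strings used in abstract.py. Only dataBuffer and offset come from the original.

```python
# Hypothetical stand-in for the write buffer; only the names dataBuffer and
# offset are taken from abstract.py, everything else is an assumption.
COMPACT_THRESHOLD = 1000000  # assumed limit, in bytes


class WriteBuffer:
    def __init__(self, threshold=COMPACT_THRESHOLD):
        self.dataBuffer = b""
        self.offset = 0  # number of bytes at the front already sent
        self.threshold = threshold

    def write(self, data):
        self.dataBuffer += data

    def doWrite(self, send):
        """Pass the unsent bytes to send(), which returns how many it took."""
        # memoryview avoids copying the unsent tail just to hand it to send().
        sent = send(memoryview(self.dataBuffer)[self.offset:])
        self.offset += sent
        if self.offset == len(self.dataBuffer):
            # Nothing left to send: drop the whole buffer.
            self.dataBuffer = b""
            self.offset = 0
        elif self.offset > self.threshold:
            # Too many already-sent bytes are being held: copy the tail once.
            self.dataBuffer = self.dataBuffer[self.offset:]
            self.offset = 0
        return sent
```

With a small or quickly drained buffer the elif branch never runs, so the common case pays no extra copy; the slice happens at most once per threshold's worth of sent data.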

> 
>   Jp
> 
>   * A change _should_ be made to Twisted eventually.  A good solution would involve a zero-copy buffering system, such as a list.  There is an implementation of this, but it involves so many nasty hacks that I don't feel it is worth including.  Shortly after 2.0 I plan to find time to clean up many of the low-level TCP implementation details, as they have grown increasingly crufty over the last year.

I agree with you that a list would be the best solution in the long term, but the solution proposed above would remove the current problem without affecting the more
common small-buffer cases.
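
For comparison, the list-style approach could look roughly like the sketch below. This is only an assumption about the general shape of such a buffer, not the existing implementation mentioned above; ChunkBuffer, chunks and send are hypothetical names, and it again assumes Python 3 bytes.

```python
from collections import deque


class ChunkBuffer:
    """Hypothetical list-style write buffer, for illustration only."""

    def __init__(self):
        self.chunks = deque()  # pending byte strings, oldest first
        self.offset = 0        # bytes of chunks[0] already sent

    def write(self, data):
        if data:
            # Appending a reference: no concatenation and no copying.
            self.chunks.append(data)

    def doWrite(self, send):
        """Send from the oldest chunk; send() returns how many bytes it took."""
        if not self.chunks:
            return 0
        head = self.chunks[0]
        sent = send(memoryview(head)[self.offset:])
        self.offset += sent
        if self.offset == len(head):
            # A fully sent chunk is released immediately, so sent data
            # never accumulates at the front of the buffer.
            self.chunks.popleft()
            self.offset = 0
        return sent
```

The trade-off in this sketch is one send() call per chunk, so many small writes turn into many small sends unless chunks are coalesced first.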

Joshua Moore-Oliva



