[Twisted-Python] Performance issues of twisted.

Maarten ter Huurne maarten at treewalker.org
Sun Apr 13 15:34:01 EDT 2008


On Sunday 13 April 2008, Andy Fundinger wrote:

> > 5. Garbage collection might make the server halt for a moment
>
> I think this should be less than the latency of a publicly routed IP
> network, anyone have figures for gc and twisted?

There are two types of garbage collection in Python:
- reference counting
- mark and sweep

The reference counting is always active and does most of the collection. 
However, it cannot collect objects that are unreachable but have cyclic 
references between them, so the mark and sweep runs once in a while to 
reclaim those.
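
The difference between the two mechanisms can be seen in a few lines. This is a minimal sketch that assumes CPython; the class name Node is made up for the illustration, and weak references are used only to observe whether an object has been freed.

```python
import gc
import weakref


class Node:
    """Hypothetical class, used only for this illustration."""

    def __init__(self):
        self.other = None


gc.disable()  # keep the cycle collector from running on its own during the demo

# No cycle: the reference count drops to zero and the object is freed at once.
a = Node()
a_probe = weakref.ref(a)
del a
freed_without_cycle = a_probe() is None

# Cycle: the two objects keep each other's reference count above zero.
b, c = Node(), Node()
b.other, c.other = c, b
b_probe = weakref.ref(b)
del b, c
alive_after_del = b_probe() is not None

# Only an explicit (or scheduled) cycle collection reclaims the pair.
gc.collect()
freed_after_collect = b_probe() is None

gc.enable()
print(freed_without_cycle, alive_after_del, freed_after_collect)
```

On CPython all three values should be True: the lone object is freed immediately, the cyclic pair survives deletion of its names, and the pair is gone after gc.collect().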

The mark and sweep has several levels: it will frequently check whether 
recently created objects are still reachable; less frequently it will check 
reachability of all objects.
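
The levels can be inspected through the "gc" module. The sketch below only calls functions documented for that module; the actual threshold values differ between Python versions, so none are assumed here.

```python
import gc

# Three thresholds, one per level ("generation"), that control how often
# each level is examined.
thresholds = gc.get_threshold()

# Current counters for the three generations.
counts = gc.get_count()

print("thresholds:", thresholds)
print("counts:", counts)

# Collect only the youngest generation (cheap, recently created objects).
found_young = gc.collect(0)

# Collect all generations (the most thorough and most expensive level).
found_full = gc.collect(2)

print("unreachable objects found:", found_young, found_full)
```

gc.collect() returns the number of unreachable objects it found, which is a quick way to tell whether an application is creating reference cycles at all.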

The reference counting overhead is spread very evenly over time, so there is 
no latency problem there.

The mark and sweep operation locks the entire Python interpreter. The time 
it takes depends on the number of objects allocated. I did some 
measurements once, but unfortunately I haven't kept the numbers. However, 
it went up very quickly and by the time you have a couple of GB of data you 
can expect hiccups of multiple seconds. It should be easy to reproduce 
this: just read the documentation of the "gc" module and write a small 
benchmark program that allocates lots of objects and then forces the most 
thorough level of mark and sweep.
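
A benchmark along those lines might look like the following. It is a rough sketch, not the program used for the original measurements: the function name time_full_collection and the object counts are arbitrary choices, and the absolute timings depend heavily on the machine and the Python version.

```python
import gc
import time


def time_full_collection(n_objects):
    """Allocate n_objects small lists, then time one full collection."""
    gc.disable()  # avoid automatic collections while allocating
    try:
        # Lists are container objects, so the collector has to visit each one.
        data = [[i] for i in range(n_objects)]
        start = time.perf_counter()
        gc.collect(2)  # force the most thorough level
        elapsed = time.perf_counter() - start
    finally:
        gc.enable()
    del data
    return elapsed


for n in (200_000, 800_000):
    pause_ms = time_full_collection(n) * 1000
    print(f"{n} live objects: full collection took {pause_ms:.1f} ms")
```

Raising the object counts until the process holds as much data as the real server gives an estimate of the worst-case pause to expect.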

What I did for our web app is disable the mark and sweep algorithm 
(gc.disable()) and break reference cycles in our code.
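
How the cycles were broken is not described above. One common way is to make back-references weak, so that no cycle forms in the first place. The sketch below assumes CPython, and the Parent and Child classes are hypothetical.

```python
import gc
import weakref


class Child:
    def __init__(self, parent):
        # Weak back-reference: it does not keep the parent alive,
        # so parent and child do not form a reference cycle.
        self._parent = weakref.ref(parent)

    @property
    def parent(self):
        return self._parent()  # None once the parent has been freed


class Parent:
    def __init__(self):
        self.children = []

    def add_child(self):
        child = Child(self)
        self.children.append(child)
        return child


gc.disable()  # from here on only reference counting reclaims memory

p = Parent()
child = p.add_child()
probe = weakref.ref(p)

del p  # last strong reference to the parent goes away
parent_freed = probe() is None

print(parent_freed, child.parent)
```

With the cycle collector disabled, the parent is still freed as soon as its last strong reference disappears, and child.parent then returns None. Where a strong back-reference is really needed, the alternative is to clear it explicitly (for example in a close() method) when the object is no longer in use.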

Bye,
		Maarten