[Twisted-Python] OT - adbapi, connection timeouts, mysql - OT

Clay Gerrard clay.gerrard at rackspace.com
Wed Aug 26 17:08:30 MDT 2009


The way that sqlalchemy implements pool_recycle is "irrespective of idle time" best I can tell from looking at it.  Keeping track of the last_used_time seems like it would have been equally reasonable, although perhaps even additional overhead.

I'm not sure I understand what you mean by "parameterize the reactor" - pass it as an argument when you initialize the (Re)ConnectionPool?  Can you point me to a twisted class that does this as an example?

I agree it would be very cool "to do the reconnecting more proactively in idle threads" - I'll have to think on that, do you know off-hand of any analogs?

Thanks for the feedback,

Clay Gerrard
Office: 210-312-3443
Mobile: 210-788-9431

-----Original Message-----
From: twisted-python-bounces at twistedmatrix.com [mailto:twisted-python-bounces at twistedmatrix.com] On Behalf Of exarkun at twistedmatrix.com
Sent: Wednesday, August 26, 2009 5:26 PM
To: Twisted general discussion
Subject: Re: [Twisted-Python] OT - adbapi, connection timeouts, mysql - OT

On 09:49 pm, clay.gerrard at rackspace.com wrote:
>I see this issue (ConnectionLost on MySQL) come up at least once a 
>month.  It's actually the reason I originally joined this discussion 
>list, but I've yet to see a wholly satisfactory solution.
>
>Since there is no one true perfect generic way to safely rerun a failed 
>interaction after ConnectionLost - the smart thing to do is just raise 
>the exception and let the application deal with (or bomb out on) what 
>*should* be a rare occurrence indicating some sort of problem.
>
>adabapi does exactly this.
>
>However, with MySQL (>=v5.0 at least) you HAVE to make plans to deal 
>with ConnectionLost (or "MySQL server has gone away" if cp_reconnect != 
>True) or your twisted app WILL be quite broken.  Unfortunately, there's 
>no obvious way to separate a real connection "problem" from the 
>perfectly normal behavior of the MySQL server closing an idle 
>connection without making your code provider specific.
>e.g.
>from MySQLdb import OperationalError
>
>I've seen (and even begrudgingly implemented) non provider specific 
>solutions that just create a new connection every time, or that qualify 
>connections returned from the pool with "good_sql" before handing them 
>over to the app.  But, that overhead is obviously not acceptable in 
>many situations.
>
>I suggest "the right way(tm)" to fix this is with an optional 
>pool_recycle kwarg that sets the maximum time that a connection can be 
>reused before it must be recycled (many ORM's provide precedent for 
>this solution - sqlalchemy, dajngo.db, etc.)

Is this irrespective of idle time?  It sounds like not, though the MySQL 
idle timeout is the motivation you mention above.
>
>reconnectionpool.py
>
>'''
>Created on Aug 26, 2009
>
>@author: clayg
>'''
>
>from time import time
>
>from twisted.enterprise import adbapi
>from twisted.python import log
>
>class ReConnectionPool(adbapi.ConnectionPool):
>    '''
>    subclass of adbapi.ConnectionPool that supports pool_recycle
>    pool_recycle disabled by default (-1)
>    '''
>
>
>    def __init__(self, *args, **kwargs):
>        self.pool_recycle = kwargs.pop('pool_recycle', -1)
>        self.conn_starttime = {} # connections starttime, hashed on 
>thread id
>        adbapi.ConnectionPool.__init__(self, *args, **kwargs)
>
>    def connect(self, *args, **kwargs):
>        # ask ConnectionPool for a connection
>        conn = adbapi.ConnectionPool.connect(self, *args, **kwargs)
>        # if pool_recycling is enabled
>        if self.pool_recycle > -1:
>            # get the start time for this connection
>            tid = self.threadID()
>            starttime = self.conn_starttime.get(tid)
>            now = time()
>            if not starttime:
>                # init starttime for new conn
>                self.conn_starttime[tid] = starttime = now
>                log.msg("Connection %s was created at %s." % (tid, now))
>
>            # if the connection is older than limit in pool_recycle
>            if (now - starttime >= self.pool_recycle):
>                self.disconnect(conn)
>                log.msg("Connection %s was recycled at %s." % (tid, 
>now))
>                conn = adbapi.ConnectionPool.connect(self, *args, 
>**kwargs)
>                self.conn_starttime[tid] = now
>
>        return conn
>
>I think it be quite a bit less messy if it was implemented inside the 
>ConnectionPool class instead of a subclass.  Someone else could 
>probably do a much better job than I, although I wouldn't mind taking a 
>crack at it.
>
>Thoughts?

This could be a cool feature.  I'm sure a lot of people would certainly 
be happy for there to be a simple answer to the monthly MySQL question 
you referred to above.

Some random thoughts:

  * you should use the reactor to keep track of time - reactor.seconds() 
should someday actually be more reliable than time.time() (but it's the 
same today); also, parameterize the reactor and use reactor.seconds() 
and the code is more easily testable.
  * this will throw reconnect overhead in front of whichever query 
happens to get issued to a thread which has a connection which needs to 
be recycled.  If it avoids a connection lost error, that's great; if it 
doesn't, then it's just overhead.  This, combined with the fact that 
queries are dispatched to basically a random connection, might not be 
too nice.  It could be cool to do the reconnecting more proactively in 
idle threads and improve the dispatch logic to give preference to fresh, 
available connections when possible.
  * tests tests tests ;)

Jean-Paul

_______________________________________________
Twisted-Python mailing list
Twisted-Python at twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python


Confidentiality Notice: This e-mail message (including any attached or
embedded documents) is intended for the exclusive and confidential use of the
individual or entity to which this message is addressed, and unless otherwise
expressly indicated, is confidential and privileged information of Rackspace.
Any dissemination, distribution or copying of the enclosed material is prohibited.
If you receive this transmission in error, please notify us immediately by e-mail
at abuse at rackspace.com, and delete the original message.
Your cooperation is appreciated.





More information about the Twisted-Python mailing list