[Twisted-Python] trac's reliability
exarkun at divmod.com
Mon Nov 27 10:07:32 EST 2006
On 12:31 am, jml at mumak.net wrote:
>The reliability of our trac instance is ludicrous. It is becoming
>extremely difficult to do any work on Twisted, particularly during the
>hours when America is asleep.
>I'm very grateful for Jp's tireless efforts in making trac work as
>much as it has. I have some idea of how busy he is, and can imagine
>how frustrating the task must be. However, we can't continue running
>our issue tracker like this.
>First question is, what is causing the outages? People on #twisted
>have commented that they haven't seen similar behaviour on their own
>tracs. The outages are so frequent that this is becoming an FAQ.
There are several problems:
Segfaults in the svn bindings - in correspondence with the trac team, I have been told (almost in so many words) that bdb-backed svn repositories are unsupported and we should switch to fsfs.
Segfaults in the SQLite bindings - likewise, in correspondence with the trac team, I have more or less been told that SQLite is not a supported database engine and that we should switch to PostgreSQL.
Deadlocks in... who knows where.
>The second question is, how can I help trac to work better? Would it
>help to throw more hardware at the problem? Should we switch to
>another tracker? (blech) Are there open tickets on trac itself that I
>could submit patches for?
It might be possible to resolve the above mentioned problems.
If we convert the repository to fsfs, we might find the segfaults from the svn bindings disappear (of course, we might not - I think we can all recognize the quality of this sort of bug stomping). Besides the conversion itself, this would involve investigating how stable the fsfs backend is in the version of Debian used on wolfwood, and then either packaging (or having someone package) a newer version, or upgrading wolfwood (however, since even edgy lacks svn 1.4, an upgrade probably isn't a useful endeavour). This may also involve recompiling several packages on pyramid to add fsfs support or remove bdb support.
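For reference, the conversion step itself can be sketched as a dump of the old repository piped into a freshly created fsfs one. This is only a rough sketch, assuming an svnadmin with fsfs support is on the PATH; the function names and the repository paths passed in are hypothetical, and nothing here has been run against the real repository.

```python
import subprocess


def conversion_commands(old_repo, new_repo):
    """Build the svnadmin command lines for a bdb -> fsfs conversion.

    old_repo and new_repo are hypothetical filesystem paths.
    """
    return (
        ["svnadmin", "create", "--fs-type", "fsfs", new_repo],
        ["svnadmin", "dump", "--quiet", old_repo],
        ["svnadmin", "load", "--quiet", new_repo],
    )


def convert(old_repo, new_repo):
    """Create new_repo as fsfs and replay old_repo's history into it."""
    create, dump, load = conversion_commands(old_repo, new_repo)
    subprocess.check_call(create)
    # Equivalent to: svnadmin dump OLD | svnadmin load NEW
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    loader = subprocess.Popen(load, stdin=dumper.stdout)
    dumper.stdout.close()  # let the dump see a broken pipe if the load dies
    load_status = loader.wait()
    dump_status = dumper.wait()
    if dump_status or load_status:
        raise RuntimeError(
            "conversion failed (dump=%r, load=%r)" % (dump_status, load_status)
        )
```

A dump and load carries over revision history only; hook scripts and the repository's conf directory would have to be copied separately, and commits would need to be blocked while the dump runs.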
If we convert the trac database to use PostgreSQL, the SQLite segfaults will hopefully go away. ;) This involves setting up a PostgreSQL server which we can use for trac (one is running on pyramid now, for the benefit of buildbot; beyond that, I don't know what state it is in). There is a tool available from Edgewall which is supposed to be capable of moving data from a SQLite database to a PostgreSQL database. The various scripts and utilities which we have (e.g., the weekly bug summary) may also need to be adjusted (I forget to what extent they are tied to SQLite). Then, as an ongoing task, someone will need to maintain the PostgreSQL server.
As for the random deadlocks which occur... I see no realistic course of action which is likely to resolve these. Perhaps, if the above changes are enacted, we will be lucky and they will just go away by themselves. That seems to be the attitude of the trac developers, anyway.
>Finally, assuming Twisted's trac isn't going to get much better any
>time soon, I would greatly appreciate being given the permissions and
>training to restore trac. I think it would also be a good idea to
>share the responsibility with someone in a European timezone.
Before I left last week I set up a cron job to take care of this. When I returned it was still doing its job, so the level of availability seen over the past week or so may be the highest we can expect until we do something else fundamental to fix the issue. However, the SSH key you gave me long ago is still in place and you should still have access to restart the server (just connecting should do it).
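A cron-driven watchdog of this kind can be sketched roughly as follows. This is not the actual job running on the server, just an illustration of the idea: probe the tracker over HTTP and restart it when it fails to answer. The URL, the restart command, and the function names are all hypothetical.

```python
import subprocess
import urllib.error
import urllib.request


def is_up(url, timeout=30):
    """Return True if url answers within timeout with a status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status.
        return error.code < 500
    except OSError:
        # Connection refused, DNS failure, timeout (a hung server), etc.
        return False


def check_and_restart(url, restart_command, probe=is_up, run=subprocess.call):
    """Restart the service if the probe fails; return True if restarted."""
    if probe(url):
        return False
    run(restart_command)
    return True


# Hypothetical usage from a script invoked by cron every few minutes:
#   check_and_restart("http://localhost:8080/trac/", ["/path/to/restart-trac"])
```

The timeout matters as much as the connection check, since a deadlocked server typically accepts the connection and then never responds.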
Personally, I am in favor of switching away from trac, as I have been since
shortly before we adopted it. ;) The only open question is when the replacement will be ready.