[Twisted-Python] major changes, release engineering, and learning cost

Sat May 22 20:26:33 EDT 2010

The nice thing about Twisted's compatibility policy is that developers, and even users, very rarely have problems when installing a new version of Twisted.  While this is a nice benefit, the current strategy of developing features in a compatible way does have a couple of costs, and I'd like to see if we can address them without giving up the benefit.  I have a suggestion for a process tweak which would hopefully mitigate some of the difficulties which arise due to the compatibility policy.

When we add a new feature that supersedes an older one, or fix a bug in Twisted that involves changing behavior, the developer fixing it has to come up with a new name.  If we have several behavior-changing bugfixes in the same subsystem, that means that developers using Twisted may have to learn about 3 different symbol names.  Since we tend to avoid just suffixing names with numbers (for good reason, I think), they won't have to learn Bicycle, Bicycle2, Bicycle3, they'll have to learn Bicycle, then Footcycle, and finally Velocipede, and somehow infer that Velocipede is the newest/best name that they should be using, by reading the (hopefully clear, concise) warnings that come out of their unit tests.

This came up again recently on a ticket about URLPath,  <http://twistedmatrix.com/trac/ticket/2625#comment:16>, where a contributor suggested that it would be better to make a whole new module because it's easier for external developers to learn about that then learn about an individual method change.  This of course raises the question: if we're going to have a whole new URL class, shouldn't it fix the (numerous) *other* bugs that we know about in URLPath?

Up until now the objection to doing things this way is that it results in gigantic branches which are intractable to review.  That's a good objection, but it leaves us with a false dichotomy; reliable software and painless upgrades with a random scattershot of new features that are hard to understand, or coherent iterations of technology which can't be effectively reviewed, and therefore can't be effectively quality controlled.

I propose that we get the best of both worlds by changing the way we do reviews slightly.  Right now every code review needs to be done on an entire feature going to trunk, and _all_ of the code going to trunk needs to be reviewed at once.  I suggest that instead, we create "integration branches" for sensible constellations of features, and have a two-stage review process.

For example, let's say I want to work on making URLPath good.  There are several tickets addressing it:

<http://twistedmatrix.com/trac/ticket/2093>
<http://twistedmatrix.com/trac/ticket/2094>
<http://twistedmatrix.com/trac/ticket/2625>

For the sake of argument, let's all pretend these are all deeply interrelated and involve changes to behavior of existing methods.  I think that is sort of true of most of these, but it would be far too verbose to talk about *how*, and get bogged down in that discussion.

First, I'd make an integration ticket, let's call it #ABCD, describing how these features are related and a brief outline of the new API I propose which resolves them.

Then I'd create an integration branch from trunk, for that ticket.  From the #ABCD branch, I'd create a branch for #2093, and put it up for review.  The reviewer would review #2093 as usual, citing any test coverage issues, documentation issues, etc.  After the usual review process, when I get an "OK to merge", I would merge #2093 *to the #ABCD branch*, not trunk.

I would repeat this process for #2094 and #2625, merging each to the #ABCD branch as they passed review.

Finally, I'd put #ABCD itself up for review.  At this point the process would differ a bit.  Reviewers would be able to assume, at this point, that the potentially large body of code in #ABCD had already been reviewed, that the test cases were good, the documentation was reasonably clear, and the logic made sense.  This final review would be a quick sanity check, to make sure the tests still pass and that there are no conflicts.

I would like to strongly emphasize that this point in the process would be an inappropriate time for reviewers to start arguing with each other over what is required for the branch to land, disputing the original specification, etc; this is just an opportunity to spot potential regressions before it lands.  Each ticket review for a component of the larger feature should be an opportunity to draw attention to the direction of the larger feature development and prompt discussion.  This *might* be an appropriate point to note that some other behavior-changing feature had been left out, though.

In the case that there *were* conflicts, this would be an opportunity to review the conflict resolution itself.

(We saw a nascent version of this approach on some stuff related to <http://twistedmatrix.com/trac/ticket/886> and it was hugely painful because nobody was really sure what the process was supposed to be.  So let's not do it like that again.)

So: thoughts?  Does this make sense as a policy change for facilitating the development of major new features or consolidating behavior-changing fixes into easier-to-understand units?