[Twisted-Python] Lore, Sphinx, and getting to the finish line (was: re: lore and tickets and other stuff)

Sat Mar 2 00:35:53 EST 2013

On Fri, Mar 1, 2013 at 4:15 PM, Glyph <glyph at twistedmatrix.com> wrote:

>
> On Mar 1, 2013, at 9:44 AM, Kevin Horn <kevin.horn at gmail.com> wrote:
>
> On Fri, Mar 1, 2013 at 2:29 AM, Glyph <glyph at twistedmatrix.com> wrote:
>
>> Jean-Paul recently closed a Lore ticket as invalid, and suggested we have
>> a discussion about Lore's future direction.  This strikes me as a very good
>> idea, and so I wrote a message which is a bit too long (for which I
>> apologize) to kick that off.
>>
>> I don't think these two paths (lore2sphinx and continuing to maintain
>> lore) are necessarily mutually exclusive.  Also I think it implies
>> something about the current state of affairs that isn't accurate - e.g.
>> that the Twisted team has agreed that Sphinx will surely replace Lore and
>> that we are making progress on that process of replacement more than we are
>> maintaining Lore itself.
>>
>> Unfortunately, I think it will be clear to anyone following its progress
>> that lore2sphinx is unmaintained and the sphinx migration effort is
>> stalled.  Nobody has committed to <
>> https://bitbucket.org/khorn/lore2sphinx> in a year and a half, about the
>> same amount of time that <
>> http://twistedmatrix.com/trac/browser/branches/sphinx-conversion-4500>
>> has been idle as well.  By contrast, <
>> http://twistedmatrix.com/trac/browser/trunk/twisted/lore> has seen
>> commits - albeit not many - within only a couple of weeks.  So,
>> empirically, we're already maintaining lore and lore2sphinx is currently
>> "obsolete"; really the question should be if we want to reverse that path.
>>
>>
> Somewhat orthogonal to your point, but this is incorrect.  lore2sphinx
> was split some time ago into "lore2sphinx-ng" and "rstgen".
>
>
> Hi Kevin!  Long time no see!  (Too long, obviously!)
>
> https://bitbucket.org/khorn/lore2sphinx-ng
> https://bitbucket.org/khorn/rstgen
>
> This was initially done as an experiment in using a more explicit
> "formatting model" for the generation for the Sphinx docs (and somewhat due
> to _your_ prodding, Glyph), and so I didn't initially make a big
> announcement or anything.
>
>
> I do remember this.  The previous output of lore2sphinx really was
> unreliable enough that it was creating a never-ending treadmill of
> irrelevant / unpredictable Lore source fixes that were really dragging the
> whole process out.  Thanks for working on improving it.
>
>
That "never-ending" series of Lore source fixes took place over the course
of a couple of weeks.  Doing things that way was not my idea, though it
seemed reasonable at the time because the idea was that we would do the
cutover at the end of it.

> Once it became apparent that it was actually going to work out better, I
> sent out some emails to those who had expressed interest in helping with
> the whole lore2sphinx project, though I don't believe I sent out anything
> to the twisted list in general, as I probably should have.  I'll point out
> that I can count people who have shown interest in moving this forward on
> one hand, though.
>
>
> More discussion on this list would almost always be better.  We are a
> *long* way from too much traffic here.  (And, this update is honestly a
> surprise to me.)
>
> And I've specifically mentioned that I had done said forking to you,
> Glyph, in IRC  ;)
> (though it's IRC after all...who remembers what happens in IRC?)
>
>
> Based on this exchange, my understanding was simply that you had started
> to try to improve lore2sphinx, but then wandered off again.
>
>
I never "wandered off".  Been here the whole time.  I've been in #twisted
almost continually for about the last 3 years, and in #twisted-dev for
about a year (I didn't realize it existed before that).  I just got tired of
(my perception) talking to myself about doing the conversion.  So I was
being quiet.  Granted, I shouldn't have been, and that's on me.  But it's
not like I'm hard to get a hold of.

> I thought I had put a notice up in the readme file in the lore2sphinx
> repo, but as it isn't there, I presume I either forgot, or never got it
> merged, or something.
>
> So, totally my bad for not communicating better, but I have NOT given up
> on converting things from Lore into Sphinx.
> (Nor do I intend to.)
>
>
> OK.  Let's move things along then.
>

Yes, let's.

> Several people showed up on IRC yesterday and voiced an interest in
> helping out, although what to do next - especially what to do next for a
> new contributor who does *not* want to try to reverse-engineer the
> conversion itself - needs to be made much, much clearer.
>
>
The last day or two have probably not been the best to try and get my
attention, especially yesterday, as I essentially worked a 14 hr day trying
to meet a deadline. But I see the conversation on IRC.  I'll note that
no one seems to have considered asking me anything about it.  Looks like it
was about 4am, though, so perhaps that wouldn't have done much good, as I
was asleep. :)

But hey...I have email!  Ask me!  I'll talk your ear off about it!

(As an aside, lore2sphinx is in no way a "broken pile of regexes".  Not to
say that it isn't broken in some really significant ways, because it is,
but it doesn't use regexes at all.  Just sayin'.)

> Thinking about it, I suppose I've been somewhat reticent to do much
> communicating about any work I do on this, as what seems to happen is that
> it just gives everyone an excuse to bring up some new objection to actually
> getting the conversion done.  I hadn't really realized
> this consciously until just now, though.
>
>
> Communicate constantly.  The biggest objection that _I_ have to getting
> the conversion done at this point is that the people working on it (well,
> okay: you) are uncommunicative, unreliable and frequently unavailable. ;-)
>  If you were just keeping us all up to date - even just to complain! - I'd
> be much more sanguine about the whole thing.  And apparently some of your
> misconceptions would have been corrected a lot earlier.
>
>
I got tired of complaining.  And arguing.

> I also have no objection if someone wants to complete the lore2sphinx
>> work, but if the lore2sphinx buildbot were to die tomorrow and go offline,
>> I wouldn't be particularly anxious to spend a lot of resources to fix it.
>>
>> My position on this was always that if someone wanted to improve the
>> documentation, they were welcome to do so, and if they wanted to use Sphinx
>> to do it, that's great too.  I just wasn't willing to tolerate any period
>> where our toolchain was broken and we couldn't generate documentation for a
>> release.  And a good thing we didn't, by the way!  If we had said "go
>> ahead, pull the trigger, whatever, it's OK to break trunk for a little
>> while!" we wouldn't have had any documentation toolchain for the last 2
>> years.
>>
>>
> And since we didn't break the toolchain, I've been in no particular hurry.
>  I've accepted that this will take approximately a billion years.  So no
> rush.
>
>
> It does not have to take a billion years.  The criteria ought to be clear
> - and if they aren't, you should have asked for clarification :).
>
>
I have asked for clarification more times than I can count about more
aspects of this than I can possibly keep track of.

> On the other hand, I have at several points been willing to make the
> "cutover", and for various different reasons, been told it wasn't happening
> until things were closer to "perfect" (for some value of "perfect") than
> they were at the time.
>
>
> Let's be specific: <http://twistedmatrix.com/trac/ticket/5312> is in need
> of some final code-review.  Despite several reviews and an apparently
> extensive final response pass, it's not currently in review, which means
> it's still in your court for some reason.  There is no reason to hold back
> on this and try to do *everything* in one big bang: this code just needs to
> be production-quality and land on trunk _before_ the ReST sources
> themselves are ready to go.
>
>
Despite numerous attempts to prod someone into responding to my requests
for clarification ;) on the ticket, I never got any response.
Specifically, I could never get an answer on whether the sphinx build tool
should require whoever was running it to specify a version or whether the
tool should guess.  The existing tools (at the time; I haven't looked at
the current state of these) do/did both, in different places.

And I admit, my impetus for immediacy kind of crashed when I had spent
several weeks (I thought) getting everything ready to switch over the docs
(in 4500) and then was told "oh we have some release stuff, we need to
have a tool for that too".  My impression prior to this was that
sphinx-build would be used to build the sphinx docs, which turned out to be
erroneous.  I didn't know that those tools (twisted.python._release)
even existed prior to that point.

Anyway, after a while it looked like fixing the lore sources would have to
be done all over again, so I started looking into whether the conversion
process itself could be improved, so that we didn't have to keep doing that.

Also, please elaborate on what you mean by "do *everything* in one big
bang".  My intention was never to do anything but get the SphinxBuilder
working on that branch.  Was there something else you thought I was doing?
Was there something else I should (or should not) have been doing?

> Probably something needs to happen to the buildbot build steps, too, since
> there's this nastiness that did an end-run around our development process
> to get checked in to the buildbot config without tests instead of into
> twisted with tests, <
> http://buildbot.twistedmatrix.com/builders/documentation/builds/2994/steps/process-docs/logs/stdio>,
> and that needs to be replaced with a command that's just like "build the
> docs, whether they be lore or sphinx or docbook or whatever".  But, Tom's
> got your back here; if you can get this done during his fellowship (see
> today's post, <
> http://labs.twistedmatrix.com/2013/03/welcome-our-new-twisted-fellow-tom.html>)
> I estimate you will see a completed reconfiguration within hours.
>

I have no idea about how the buildbots are configured.  But the linked
buildbot log looks like part of the official release process.
http://twistedmatrix.com/trac/wiki/ReleaseProcess#Buildhowtodocumentsforwebsite

> Once that's done, then it's a matter of putting <http://tm.tl/4500> into
> code-review with the output of the lore2sphinx builder.  That review can be
> somewhat expedited, and can be done in parallel by lots of people since
> there are no unit tests to be worried about, and formatting fixes can be
> done quickly by multiple people, we don't need a big formal code review.
>
> The current output of the old lore2sphinx branch is functional, though has
> a few warts (mostly extraneous spaces in the output).  These warts were
> apparently enough to block adoption.
>
>
> Let's not under-state the problem: thanks to the jaw-droppingly weird
> arbitrariness of the ReST format, "extraneous spaces" can mean "arbitrarily
> mangled output".  But no, even these "warts" were not enough to block
> adoption.  What blocked adoption is that the painstakingly hand-tweaked
> lore sources that did *not* have any more "warts" were left to languish
> (and bit-rot, and now probably require more manual fixing) while we waited
> for 2 years for someone to actually finish the sphinx development and
> release management tools and get them finalized.  As I recall we basically
> finished fixing them all up, at the time.
>

They got left alone because of the release tools hangup.  Ideally the
release tools would have been done before the whole lore-source-tweaking
process, but they weren't.  I'll admit my frustration played a part in
this, but so did the deafening silence I got when I asked for anyone to
comment on the ticket.

> There were three reasons that I personally kept pressing for a more
> thorough lore -> sphinx converter.  One is not necessarily necessary.
>
> First, and most importantly, is the bit-rot problem: people are working on
> lore docs in parallel with this effort.  And, despite this exchange, I want
> to be clear that they should keep doing so: nobody should stop working on
> docs in the meanwhile, since we have no way to tell how much longer this
> will take.  Looking at the modified docs on the sphinx buildbot is
> challenging, and keeping track of random whitespace jiggling is not
> documented on <
> http://twistedmatrix.com/trac/wiki/ReviewProcess#Reviewers:Howtoreviewachange>.
>  *I* can't even remember how to do the math to associate one of the results
> in <http://buildbot.twistedmatrix.com/builds/sphinx-html/>.  And now that
> there have been so many changes (as I predicted there might be) we have to
> figure out what's changed, and re-review to make sure that everything (or
> at least a big enough majority of everything) is OK to go to trunk.  If the
> tool itself could be verified to produce correct output for all the cases
> we've encountered where it falls over, we wouldn't have to do this manual
> verification step; we could just trust that it was right, because it has
> tests that indicate it's correct.  Of course it's possible there might be
> *some* corner-case it still doesn't handle and that we didn't find, but if
> the tool is known to be broken in a large number of cases that we just have
> to magically know to avoid, then it's likely people will keep unknowingly
> re-introducing those problems.
>

More on this below.

> Second, there are going to be some doc patches in-progress whenever the
> cutover happens.  Now, this is a bit less of a concern, because we can just
> manually translate one or two paragraphs to the new markup if necessary.
>  But it would still be nice to have a tool that does the job well enough
> that someone could grab the buildbot output for an in-progress doc fix and
> keep working on it without having to learn how to re-express everything in
> Sphinx first.
>

This is why I think (at this point) we need to build Sphinx docs for every
branch as part of the buildbot process.  More below.

> Third, the output is just hella grody right now.  Have a look here, for
> example: <
> http://buildbot.twistedmatrix.com/builds/sphinx-html/989-37334/_sources/projects/web/howto/twisted-templates.txt>.
>  *Tons* of peculiar and unnecessary vertical whitespace, and very ragged
> right edges where the word wrap doesn't seem to respect line lengths.  This
> means that every change that hits these documents is going to produce a lot
> of unnecessary delta when authors try to clean up some of this mess to make
> it nicer to edit.
>

Yep, it's ugly.  Lore2sphinx-ng does a better job, but isn't finished.
More below.

> Spot-checking some of the output now, it seems like the tool must have
> been upgraded, or we've been lucky, since I can't spot any obvious bit-rot
> (and I could swear the docs look a lot less grody; the problems I mentioned
> there).  So maybe you've already addressed these problems, or they're not
> actually that serious any more.  But, as I said in the first point,
> spot-checking isn't enough.
>
> It has been a pretty discouraging effort at times, I have to say, as I
> seem to garner agreement/support/buy-in/whatever for a particular course of
> action (e.g. getting 99% of the way there, and then fixing Sphinx markup
> manually, which was the original plan, way back when), and focusing my
> efforts in that direction.  Then when we're ready to proceed on that basis,
> had another task/challenge/set of requirements/whatever added to the work
> that needs to be done.  In fact I still think that if the Twisted community
> had actually wanted to, we could have switched over to Sphinx at the first
> PyCon Atlanta (2010?).
>
>
> By 'actually wanted to' you mean 'be willing to abandon the development
> process for this one thing'.
>
> We do not abandon the development process.  Every past attempt at doing so
> to facilitate some feature has been a road to ruin.  Although this process
> has been frustrating for you, I am still happier with the current outcome
> (Twisted has perfectly functional documentation in our downloads and on our
> website) than with the alternative (create a situation where we could not
> produce a release for two years because the tools were languishing
> unfinished while we waited for you to say something about it).
>
>
You keep saying that I wanted to "abandon the development process", and I'm
not sure what you mean by that.  My perception has been that I would say
"what do we need to do to make this happen"?  There would be some hemming
and hawing (and, at least several times, long discussions about how
documentation didn't really fit the regular UQDS process) and a sort of
plan would be invented.  I would proceed according to the plan as I
understood it.  I would then say "OK, we're ready"!  And then be told that
some other thing not in the plan needed to be done.  The cycle would then
repeat.

> I'm sorry that this has been a frustrating process for you.  And I'm not
> just saying that to be polite: I genuinely *am* sorry that our
> communication has not been clear, and that we have had wasted effort all
> around because of that.  But I am fairly sure that we have had basically
> the same requirements for this process from day one.  Let me state them
> here:
>
>
>    1. We need to have release-automation tools that allow developers to
>    produce a release, including documentation.  These tools need to be
>    subjected to the same development process as the rest of those tools, which
>    is to say the same process as for the rest of Twisted.
>
No, this was not brought up until well into the process.  I (sort of)
understand the desire for this, but it seems pretty weird to be building
what is essentially a wrapper for an existing tool, along with tests for
said wrapper.

>
>    2. The documentation itself needs to be able to be generated from any
>    version of trunk.  While one or two formatting snafus are acceptable to be
>    fixed after the fact, the documentation needs to be in a comprehensible
>    state in every revision of trunk, which means that in order to land on
>    trunk, the ReST output.
>
So...you didn't finish that sentence.  I realize you apologized for errors
at the end of your mail, but I have a feeling you were going to say
something rather important there...

:)

I'll talk more about this below (I think...depending on what you actually
mean to say here).

> Really, most of the work has been done here already.  The docs appear to
> be in a mostly-workable state.  lore2sphinx looks like maybe it's doing a
> good enough job, maybe better than the last time I looked at it.  The
> _major_ hang-up is getting the release management tools over their final
> hump and just driving the trac tickets to completion.  With Tom keeping the
> review queue basically empty right now, this is an excellent opportunity to
> get that done.
>
> It may make sense to schedule an event where we all show up on IRC,
> everyone claims a documentation component, and we all do a final review
> pass to make sure that the formatting problems aren't too bad before going
> to trunk with the cut-over.  This pre-supposes that the release/building
> tools are done and on trunk though.
>
> Anyway, I'm not giving up.  If nothing else, I'll end up with a nice
> reStructuredText-generating library.  And if Twisted never ends up adopting
> Sphinx as a doc tool, eventually I'll still be able to read the Twisted
> docs in a format that I can navigate and doesn't hurt my eyes to look at. :)
>
> But I'd really rather see Twisted adopt Sphinx, and get rid of Lore.
>
> Help accepted.
>
>
> All right!  I hope this exchange has gotten some people fired up to cross
> the finish line.  It's surprisingly close!  Thanks for updating us, Kevin -
> better late than never :).
>
>
Experience shows that it's unlikely to be surprisingly close.  I like your
optimism though.

> -glyph
>
> P.S.: apologies for any errors.  I didn't even really have the time to
> write this email, let alone copy-edit it.
>
>
Now that I've replied to all of that, let me give you a rundown of what
I've been thinking and planning, so that you have an idea of where I'm
coming from.

Here are the various things that I have perceived to be necessary/required
in order to get the conversion to happen:

a) The conversion process needs to be able to be run concurrently with Lore
for an extended period of time.  In other words, Lore would be the
"official" version of the docs, and the Sphinx docs would be built in some
form of automated fashion until everyone was happy with them and/or ready
to deprecate/abandon Lore.
b) Because of a), there needs to be tooling to run lore2sphinx (or
whatever) on a regular basis.  (This was sort of being done via the
Sphinx-building buildbot, but in a very ad-hockery sort of way, which was
brittle, broke a couple of times, and needed to be improved.)
c) There needs to be release management tooling to build the Sphinx docs
from ReST into whatever formats we want to publish (HTML and PDF to start,
maybe others later on)
d) Convert the Lore sources to better ReST documents without all the
problems that the current lore2sphinx output has.  I at one time thought
this was pretty impractical.  My first attempt at a conversion tool tried
to use an intermediate object model, but I ran into trouble when trying to
combine the various objects.  So I abandoned the effort and created what
became lore2sphinx, which basically just combined a bunch of strings.  I
then figured out a way of making the intermediate object thing work, and
that was lore2sphinx-ng.  Then it became convenient to split out the
intermediate object model from the document processing code, so I put all
of that into a library and that became rstgen.

(For anyone who is curious, the lore2sphinx-ng repo is forked off from the
lore2sphinx repo, primarily because I didn't want to break the Sphinx
buildbot by making drastic changes.)

Here's what my plan was prior to this whole discussion getting started
again.

1) Finish rstgen, where "finished" in this instance is defined as "is
capable of generating all the vanilla docutils and sphinx-specific ReST
elements that we need for converting the Twisted documentation".
2) Finish lore2sphinx-ng (which would probably have ended with merging it
back into the lore2sphinx repo), where "finished" means that it would be
capable of processing all the XHTML Lore tags that were defined in the Lore
documentation and used in the Twisted documentation, and generating a tree
of rstgen elements, which could then be rendered into ReST.
(This would also serve to satisfy b) above, as the CLI in lore2sphinx-ng is
less...well, let's just call it broken than lore2sphinx's was/is.)
3) Go back and finish SphinxBuilder (release tooling for building a sphinx
project, which is basically a wrapper for sphinx-build, plus some vague
"version feature").
4) Get someone to use something less hackish than what's currently building
the Sphinx docs on the buildbot, and preferably in such a way that the
results of those builds could be published somewhere and have persistent
links.  Currently the results of what the Sphinx buildbot does are stored
for a time, and then go away, so you'll see links to build results in some
trac tickets that go nowhere, which is decidedly unhelpful.  My plan was
that we'd set up something where the Sphinx docs would get generated and
published someplace for every buildbot build so that we could always have
the current results for the lore to sphinx conversion for the tip of each
branch.  I have no idea whether this is actually feasible or practical, but
it seemed like it would be useful.
5) Proceed with Sphinx docs being built from lore sources, making tweaks as
necessary to lore2sphinx(ng) for as long as it took for the generated docs
to be good enough to justify switching to Sphinx entirely.
6) Switch to Sphinx entirely.
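
(Illustration for item 3: a rough sketch of what "a wrapper for
sphinx-build plus a version feature" could look like.  The function names
are hypothetical and this is not the SphinxBuilder code from the ticket; it
only assumes sphinx-build's standard -b (builder) and -D (override a
conf.py setting) options, and that overriding "version" and "release" is
what the version feature would amount to.)

```python
import subprocess


def sphinx_build_command(source_dir, output_dir, version=None, builder="html"):
    """Return the argv list for one sphinx-build run (hypothetical helper)."""
    command = ["sphinx-build", "-b", builder]
    if version is not None:
        # Assumption: the caller supplies the version explicitly; -D
        # overrides the corresponding conf.py setting for this run only.
        command += ["-D", "version=" + version, "-D", "release=" + version]
    command += [source_dir, output_dir]
    return command


def build_docs(source_dir, output_dir, version=None):
    """Run sphinx-build; raises CalledProcessError if the build fails."""
    subprocess.check_call(sphinx_build_command(source_dir, output_dir, version))


# Inspect the command without actually running Sphinx.
print(sphinx_build_command("docs", "docs/_build/html", version="13.0.0"))
```

Keeping the command construction separate from the subprocess call means
the wrapper can be unit-tested without Sphinx installed, which is roughly
the kind of test coverage the release tools are expected to have.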

I really wasn't planning on trying to get people excited about switching to
Sphinx again until 1) and 2) were at least "mostly" done (for certain
values of done) and I had gone back to finish 3).

So.  I guess at this point the question is whether to try and go with
what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng +
rstgen).  I think 3-6 in my above plan need to happen in any case, and I
think those will be much easier with lore2sphinx-ng+rstgen.

I think I have some changes to lore2sphinx and rstgen which I haven't
pushed yet.  I'll try to get those out there soonish (sometime over the
weekend) in case people want to look at them.

IIRC, rstgen has support for most of the vanilla docutils elements, with
the notable exception of tables (and maybe definition lists...can't recall
whether I finished those).  It has a basic level of test coverage (of
course you can never have too many tests) for rendering the elements
individually, and some tests for elements in combination (particularly
nested lists).  Footnotes and citations I think also need some work, which
I have a plan for, but haven't implemented yet (I don't think).

The "new" lore2sphinx CLI tool needs more work, but is relatively
straightforward.  Like the old tool, it's basically an elementtree
processor, except instead of spitting out strings that get joined together
(which granted was an unholy mess), it generates rstgen elements, which all
have a .render() method.  After processing a Lore document, you should end
up with an rstgen.Document object.  You call its render() method, which
calls its children's render() methods, etc. and it's turtles all the way
down.
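
(Illustration: a minimal sketch of the "elements with a render() method"
idea.  The classes below are made up for this example and are not the real
rstgen API; only the convention that a document renders itself by asking
its children to render comes from the description above.)

```python
class Text:
    """Leaf node: plain text, rendered as-is."""

    def __init__(self, value):
        self.value = value

    def render(self):
        return self.value


class Emphasis:
    """Inline node: wraps its rendered children in ReST emphasis markers."""

    def __init__(self, *children):
        self.children = children

    def render(self):
        return "*" + "".join(child.render() for child in self.children) + "*"


class Paragraph:
    """Block node: concatenates its inline children."""

    def __init__(self, *children):
        self.children = children

    def render(self):
        return "".join(child.render() for child in self.children)


class Document:
    """Root node: one place decides the spacing between blocks."""

    def __init__(self, *children):
        self.children = children

    def render(self):
        return "\n\n".join(child.render() for child in self.children) + "\n"


doc = Document(
    Paragraph(Text("Deferreds are "), Emphasis(Text("not")), Text(" magic.")),
    Paragraph(Text("Second paragraph.")),
)
print(doc.render())
```

Because whitespace between blocks is decided in exactly one place
(Document.render here), this structure avoids the stray blank lines that
come from joining independently generated strings.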

The framework is there for the new CLI tool; it's mostly a matter of
writing a bunch of short methods that take elementtree elements as input
and return appropriate rstgen objects.
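
(Illustration: a sketch of what "short methods that take elementtree
elements as input and return element objects" might look like.  The
LoreConverter class, its handle_* naming scheme, and the stand-in element
classes are all hypothetical, not code from lore2sphinx-ng or rstgen.  It
also ignores XML namespaces and whitespace normalization, which a real
converter would have to deal with.)

```python
import xml.etree.ElementTree as ET


class Text:
    """Stand-in leaf element (hypothetical, not the rstgen class)."""

    def __init__(self, value):
        self.value = value

    def render(self):
        return self.value


class Container:
    """Stand-in element that renders its children between two markers."""

    prefix = suffix = ""

    def __init__(self, children):
        self.children = children

    def render(self):
        body = "".join(child.render() for child in self.children)
        return self.prefix + body + self.suffix


class Paragraph(Container):
    pass


class Emphasis(Container):
    prefix = suffix = "*"


class LoreConverter:
    def convert(self, node):
        # Dispatch on the tag name: <p> goes to handle_p, <em> to handle_em.
        handler = getattr(self, "handle_" + node.tag, self.handle_unknown)
        return handler(node)

    def children(self, node):
        # ElementTree stores text before the first child in .text and text
        # after each child in that child's .tail; keep document order.
        result = []
        if node.text:
            result.append(Text(node.text))
        for child in node:
            result.append(self.convert(child))
            if child.tail:
                result.append(Text(child.tail))
        return result

    def handle_p(self, node):
        return Paragraph(self.children(node))

    def handle_em(self, node):
        return Emphasis(self.children(node))

    def handle_unknown(self, node):
        # Fail loudly so unsupported Lore tags are noticed, not dropped.
        raise ValueError("no handler for <%s>" % node.tag)


tree = ET.fromstring("<p>Use <em>callbacks</em> here.</p>")
print(LoreConverter().convert(tree).render())
```

Supporting another Lore tag is then one more short handle_* method, and
each method can be tested on its own with a small XML snippet.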

Obviously these tools aren't finished, but they produce much better output
than the old version of lore2sphinx w.r.t. whitespace handling, paragraph
wrapping, etc.

Some of the code is still pretty messy, but nowhere near the train wreck
that the current/old version of lore2sphinx is.  By which I mean it _can_
be cleaned up, it just hasn't been yet.  In particular there are some places
in rstgen where the API is (to me at least) obviously awful, but I haven't
gotten around to fixing it yet.

Please review the code.  Please feel free to ask questions if you're
interested.

Personally, I've gotten over being in a hurry about all this, and I think a
robust tool is more likely to succeed in the long run, though finishing it
may make the run a bit longer.  So I'm for finishing lore2sphinx-ng+rstgen.

What are others' opinions?  Make the "old" tool work?  Or make the "new"
tool work?

Damn.  Talk about long emails.

--
Kevin Horn