[Twisted-Python] Lore and generating reStructuredText (Lore2Sphinx)

Kevin Horn kevin.horn at gmail.com
Thu Mar 21 08:17:01 MDT 2013


tl;dr  A Lore plugin won't work for generating Sphinx source files, at
least not by itself.

Itamar posted some notes from the Twisted BoF session that was held at
PyCon last weekend, and one of the things in it was the following line:

- lore output plugin that generates ReST via docutils parse tree
objects, then write code to run sphinx on this output


I wasn't there, so I don't know the exact context that this was referring
to, but let me try to explain a little bit about why this won't work (at
least not as written).

reStructuredText (or rather docutils, which implements it), as some of you
may know, creates its output by first creating an intermediate
representation of a document called a "node tree", which is a tree of
"nodes" that represent the various elements in a document (text,
paragraphs, lists, list items, etc.).  reStructuredText also has a
construct called a "directive", which is some markup that tells the
docutils reST parser to create a bunch of these nodes.

Directives are awesome and are a big reason why reStructuredText is so much
more powerful than other lightweight markup languages like Markdown,
Textile, etc.  They serve as an extension point and allow users to create
their own markup constructs without changing the actual parser.

The key thing is that a directive is not itself a type of node.  Rather,
it's just a markup construct.  This means that once a reStructuredText
document goes through the docutils parser, the information about the
directives is lost, because they have been transformed into a bunch of
nodes.

For example there's a container directive, which looks like this:

Title
=====

.. container::

    I'm a content paragraph!  Yay!

When processed, this creates a node tree that looks something like this (in
docutils "pseudoxml" representation):

<document ids="title" names="title" source="test.rst" title="Title">
    <title>
        Title
    <container>
        <paragraph>
            I'm a content paragraph!  Yay!

It is entirely coincidental that the container directive and the
<container> node are named the same thing.  Don't let this confuse you.
The point is that the directive goes away and is replaced by a bunch of
nodes (more specifically, the node tree is transformed in some way...I
suppose a directive could remove nodes, but I don't think I've ever seen
that done).
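
If you'd like to check this yourself, here is a minimal sketch, assuming
docutils is installed.  publish_doctree() and pformat() are real docutils
calls; show_node_tree and CONTAINER_SOURCE are just names I made up for
this example.

```python
CONTAINER_SOURCE = """\
Title
=====

.. container::

    I'm a content paragraph!  Yay!
"""


def show_node_tree(source):
    """Parse reST text and return the resulting node tree as pseudo-XML."""
    # Imported here to make the dependency obvious: docutils is third-party.
    from docutils.core import publish_doctree

    # Runs the reST parser plus the standard transforms and returns the
    # root "document" node of the node tree.
    document = publish_doctree(source)
    # pformat() gives the indented pseudo-XML text for that tree.
    return document.pformat()
```

Calling print(show_node_tree(CONTAINER_SOURCE)) dumps the tree.  The thing
to look for is that the text ".. container::" appears nowhere in it; only
the nodes the directive produced are left.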

We can see this using another example:

Here's some markup:

Title
=====

.. warning::

    I'm a content paragraph!  Yay!

and here's the pseudoxml representation of the nodetree:

<document ids="title" names="title" source="test.rst" title="Title">
    <title>
        Title
    <warning>
        <paragraph>
            I'm a content paragraph!  Yay!

Notice that the two node trees have exactly the same shape; only the name
of the wrapping node differs.  Here the node name again happens to match
the directive name, so you could guess which directive produced it, but the
node tree doesn't promise that.  A directive can just as well produce
generic nodes, or several nodes, or (like the include directive) nothing
but the nodes of the content it pulled in.  In those cases the directive
itself is GONE and you have no real way of recreating it from the node
tree.

I think this problem also happens with custom text roles, which are another
extension mechanism in reST, but I haven't looked too deeply into that.

Since you really, really want to have directives in your output (in fact
you have to have them if you want to use Sphinx, which makes heavy use of
them), you can't really generate Sphinx-capable source files using _only_
the nodetree representation.

I suppose you might be able to do something where you try to detect where
the directive _should_ go and try to insert it during the rendering step,
but such a thing would be an egregious kludge, would take a lot of effort,
and I can't imagine it would work very well, if at all.

Another option would be to fork the docutils parser and change it so that
it could create "directive nodes" or something, but I certainly would not
recommend such a course. (If you think maintaining Lore is a pain, you
ain't seen nothin' yet.  And one thing this project has driven home to me
is that no software only needs to be maintained "for a little while".)

I'm not saying that the proposed plugin for lore is a bad idea...I think it
would be pretty cool.  You'd be able to send lore out to all of the various
formats supported by docutils, and who doesn't want to write their next s5
presentation in Lore, right? :)  But it won't do the job that it was being
put forward for in the note Itamar mentioned.

So what about building some software that generates some other
representation of the source document, and then renders that as
reStructuredText?  Well this is the best idea I've come up with (or heard)
and is in fact exactly what lore2sphinx-ng_ (which is not intended to be a
separate thing, it's just an experimental fork of lore2sphinx) and rstgen_
do.  lore2sphinx-ng creates the representation from lore sources (which is
also a tree of "nodes", though they aren't called that), and rstgen defines
the nodes, and renders them into reStructuredText source.
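
To make the idea concrete, here is a rough sketch of that kind of
representation.  It is NOT the actual rstgen API; every class name below
(Document, Title, Paragraph, Directive) is hypothetical and only meant to
show the shape of the approach: the directive is a first-class node in the
tree, so it is still there when the tree is rendered as reStructuredText.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    text: str

    def render(self) -> str:
        return self.text


@dataclass
class Title:
    text: str
    underline: str = "="

    def render(self) -> str:
        # The underline must be at least as long as the title text.
        return f"{self.text}\n{self.underline * len(self.text)}"


@dataclass
class Directive:
    """Hypothetical node that remembers which directive it stands for."""
    name: str
    children: List[object] = field(default_factory=list)
    argument: str = ""

    def render(self) -> str:
        head = f".. {self.name}::"
        if self.argument:
            head += f" {self.argument}"
        body = "\n\n".join(child.render() for child in self.children)
        if not body:
            return head
        # Directive content is indented under the ".." marker; blank
        # lines are left empty so no trailing whitespace is produced.
        indented = "\n".join(
            ("    " + line) if line else "" for line in body.split("\n")
        )
        return f"{head}\n\n{indented}"


@dataclass
class Document:
    children: List[object] = field(default_factory=list)

    def render(self) -> str:
        # Block-level elements are separated by one blank line.
        return "\n\n".join(child.render() for child in self.children) + "\n"


doc = Document([
    Title("Title"),
    Directive("warning", [Paragraph("I'm a content paragraph!  Yay!")]),
])
```

Here doc.render() gives back the warning example from earlier in this
message, directive and all, which is exactly the part a docutils node tree
can't give you.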

The only problem is that these aren't done yet, though the work done so far
looks very promising (in terms of actually being able to do the job
reliably someday).  If anyone has bothered to read this far and is
interested in helping out, please feel free to fork the repos and lend a
hand.  Also feel free to contact me either on this list or directly if you
have any questions.  I apologize in advance for the current state of the
code, which is a bit messy (especially lore2sphinx-ng, which still has a
bunch of cruft from the "old"/"current" version that I haven't gotten
around to removing yet).



.. _lore2sphinx-ng: https://bitbucket.org/khorn/lore2sphinx-ng
.. _rstgen: https://bitbucket.org/khorn/rstgen

--
Kevin Horn