[Twisted-Python] Lore and generating reStructuredText (Lore2Sphinx)

Kevin Horn kevin.horn at gmail.com
Thu Mar 21 11:52:17 EDT 2013


On Thu, Mar 21, 2013 at 9:17 AM, Kevin Horn <kevin.horn at gmail.com> wrote:

> tl;dr  A Lore plugin won't work for generating Sphinx source files, at
> least not by itself.
>
> Itamar posted some notes from the Twisted BoF session that was held at
> PyCon last weekend, and one of the things in it was the following line:
>
> - lore output plugin that generates ReST via docutils parse tree objects, then write code to run sphinx on this output
>
>
> I wasn't there, so I don't know the exact context that this was referring
> to, but let me try to explain a little bit about why this won't work (at
> least not as written).
>
> reStructuredText, as some of you may know, creates its output by first
> creating an intermediate representation of a document called a "node tree",
> which is a tree of "nodes" which represents the various elements in a
> document (text, paragraphs, lists, list items, etc.).  reStructuredText
> also has a construct called a "directive", which is some markup which
> tells the docutils reST parser to create a bunch of these nodes.
>
> Directives are awesome and are a big reason why reStructuredText is so
> much more powerful than other lightweight markup languages like markdown,
> textile, etc.  They serve as an extension point and allow users to create
> their own markup constructs without changing the actual parser.
>
> The key thing is that a directive is not itself a type of node.  Rather
> it's just a markup construct.  This means that once a reStructuredText
> document goes through the docutils parser, the information about the
> directives is lost, because they have been transformed into a bunch of
> nodes.
>
> For example there's a container directive, which looks like this:
>
> Title
> =====
>
> .. container::
>
>     I'm a content paragraph!  Yay!
>
> When processed, this creates a node tree that looks something like this (in
> docutils "pseudoxml" representation):
>
> <document ids="title" names="title" source="test.rst" title="Title">
>     <title>
>         Title
>     <container>
>         <paragraph>
>             I'm a content paragraph!  Yay!
>
> It is entirely coincidental that the container directive and the
> <container> node are named the same thing.  Don't let this confuse you.
> The point is that the directive goes away and is replaced by a bunch of
> nodes (more specifically, the node tree is transformed in some way...I
> suppose a directive could remove nodes, but I don't think I've ever seen
> that done).
>
> We can see this using another example:
>
> Here's some markup:
>
> Title
> =====
>
> .. warning::
>
>     I'm a content paragraph!  Yay!
>
> and here's the pseudoxml representation of the nodetree:
>
> <document ids="title" names="title" source="test.rst" title="Title">
>     <title>
>         Title
>     <container>
>         <paragraph>
>             I'm a content paragraph!  Yay!
>
> Notice that the node trees look exactly the same.  Now this is not quite
> true, as there's probably some attributes on the actual Python nodes that
> might be used to distinguish them when writing output which aren't
> displayed here...they certainly get rendered into HTML differently.  But
> the point is that the directive itself is GONE and you have no real way of
> recreating it from the node tree.
>
> I think this problem also happens with custom text roles, which is another
> extension mechanism in reST, but I haven't looked too deeply into that.
>
> Since you really, really want to have directives in your output (in fact
> you have to have them if you want to use Sphinx, which makes heavy use of
> them), you can't really generate Sphinx-capable source files using _only_
> the nodetree representation.
>
> I suppose you might be able to do something where you try to detect where
> the directive _should_ go and try to insert it during the rendering step,
> but such a thing would be an egregious kludge,  would take a lot of effort,
> and I can't imagine it would work very well, if at all.
>
> Another option would be to fork the docutils parser and change it so that
> it could create "directive nodes" or something, but I certainly would not
> recommend such a course. (If you think maintaining Lore is a pain, you
> ain't seen nothin' yet.  And one thing this project has driven home to me
> is that no software only needs to be maintained "for a little while".)
>
> I'm not saying that the proposed plugin for lore is a bad idea...I think
> it would be pretty cool.  You'd be able to send lore out to all of the
> various formats supported by docutils, and who doesn't want to write their
> next s5 presentation in Lore, right? :)  But it won't do the job that it was
> being put forward for in the note Itamar mentioned.
>
> So what about building some software that generates some other
> representation of the source document, and then renders that as
> reStructuredText?  Well this is the best idea I've come up with (or heard)
> and is in fact exactly what lore2sphinx-ng_ (which is not intended to be a
> separate thing, it's just an experimental fork of lore2sphinx) and rstgen_
> do.  lore2sphinx-ng creates the representation from lore sources (which is
> also a tree of "nodes", though they aren't called that), and rstgen defines
> the nodes, and renders them into reStructuredText source.
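
To make the idea in the paragraph above a bit more concrete, here is a
minimal sketch of a tree in which a directive is a first-class node that
knows how to render itself as reStructuredText source.  The class names
(Paragraph, Directive, Section) are made up for illustration and are not
rstgen's actual API; this only shows the general shape of the approach.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Paragraph:
    """A plain block of text (hypothetical node type)."""
    text: str

    def render(self) -> str:
        return self.text


@dataclass
class Directive:
    """A directive kept as its own node, so its name survives to output."""
    name: str
    argument: str = ""
    children: List["Node"] = field(default_factory=list)

    def render(self) -> str:
        # ".. name:: argument" header, then the body indented by 4 spaces.
        header = f".. {self.name}:: {self.argument}".rstrip()
        body = "\n\n".join(child.render() for child in self.children)
        indented = "\n".join(
            ("    " + line) if line else "" for line in body.splitlines()
        )
        return f"{header}\n\n{indented}" if indented else header


@dataclass
class Section:
    """A titled section underlined with '=' (hypothetical node type)."""
    title: str
    children: List["Node"] = field(default_factory=list)

    def render(self) -> str:
        parts = [f"{self.title}\n{'=' * len(self.title)}"]
        parts.extend(child.render() for child in self.children)
        return "\n\n".join(parts) + "\n"


Node = Union[Paragraph, Directive, Section]

# Build the warning example from this message and render it back to reST.
doc = Section(
    "Title",
    [Directive("warning", children=[Paragraph("I'm a content paragraph!  Yay!")])],
)

if __name__ == "__main__":
    print(doc.render())
```

Because the Directive object is still present when rendering happens, the
".. warning::" line can be written out directly instead of being guessed
from whatever nodes it would have expanded into.
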
>
> The only problem is that these aren't done yet, though the work done so
> far looks very promising (in terms of actually being able to do the job
> reliably someday).  If anyone has bothered to read this far and is
> interested in helping out, please feel free to fork the repos and lend a
> hand.  Also feel free to contact me either on this list or directly if you
> have any questions.  I apologize in advance for the current state of the
> code, which is a bit messy (especially lore2sphinx-ng, which still has a
> bunch of cruft from the "old"/"current" version that I haven't gotten
> around to removing yet).
>
>
>
> .. _lore2sphinx-ng: https://bitbucket.org/khorn/lore2sphinx-ng
> .. _rstgen: https://bitbucket.org/khorn/rstgen
>
> --
> Kevin Horn
>

I screwed up the example above, due to misnaming a file and running
rst2pseudoxml.py on the wrong thing.

It should actually look something like this:

<document ids="title" names="title" source="test.rst" title="Title">
    <title>
        Title
    <warning>
        <paragraph>
            I'm a content paragraph!  Yay!

and this:

<document ids="title" names="title" source="test.rst" title="Title">
    <title>
        Title
    <admonition classes="admonition-hooray">
        <title>
            hooray!
        <paragraph>
            I'm a content paragraph!  Yay!

But the point still holds.  Directive info goes away after parsing.
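
For anyone who wants to check this themselves, here is a minimal sketch
that parses the warning example with docutils and prints the resulting
node tree.  It assumes docutils is installed; publish_doctree and the
node tree's pformat() method are docutils calls, while the helper name
parse_to_pseudoxml is just something made up for this example.

```python
from docutils.core import publish_doctree

SOURCE = """\
Title
=====

.. warning::

    I'm a content paragraph!  Yay!
"""


def parse_to_pseudoxml(source):
    """Parse reST source and return its node tree as pseudoxml text."""
    doctree = publish_doctree(source)
    return doctree.pformat()


if __name__ == "__main__":
    print(parse_to_pseudoxml(SOURCE))
```

The printed tree contains a warning node with a paragraph inside it, but
nothing that records the ".. warning::" markup that produced it.
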

--
Kevin Horn