Opened 3 years ago

Last modified 14 months ago

#5379 enhancement new

Create new Resource traversal model

Reported by: itamar Owned by: acapnotic
Priority: normal Milestone:
Component: web Keywords:
Cc: exarkun, glyph, zooko@…, keturn@…, jonathanj, christian@…, nejucomo@… Branch: branches/new-resource-5379-2
(diff, github, buildbot, log)
Author: itamarst Launchpad Bug:

Description (last modified by itamar)

This is step 1 from WebResourceImprovements.

New resource model allowing consuming multiple path segments. In particular, return a Response object from render that has response code, headers, and IBodyProducer. The new resource will be default for twisted.web, and old resources will have backwards compatibility layer.

This will have the minimal possible feature set; fancier features like render_* dispatching, OPTIONS support, etc. can be done as separate tickets.

Change History (35)

comment:1 Changed 3 years ago by itamarst

  • Author set to itamarst
  • Branch set to branches/new-resource-5379

(In [33109]) Branching to 'new-resource-5379'

comment:2 Changed 3 years ago by itamar

  • Owner set to itamar

comment:3 Changed 3 years ago by itamar

  • Cc exarkun glyph added

Here is another sketch, a result of long discussion on IRC the gist of which is that the API should involve immutable path objects if at all possible:

class DateBasedThing(Resource):
    def traverseChild(self, traversal):
        y = traversal
        m = y.next()
        d = m.next()
        return d.use(SomeDateThing(y.segment, m.segment, d.segment))

comment:4 Changed 3 years ago by itamar

  • Description modified (diff)
  • Summary changed from Create new Resource model based on web2's model to Create new Resource traversal model

comment:5 Changed 3 years ago by itamar

  • Description modified (diff)

comment:6 Changed 3 years ago by itamar

I've fleshed out the proposed API into a full-fledged code skeleton, enough to make the implementation and usage fairly obvious. I'd appreciate some comments at this point.

comment:7 Changed 3 years ago by glyph

Even if you don't think that this is ready to go, i think it might be fair for you to put the ticket into review - enough work has certainly gone into it that you deserve the points ;).

comment:8 Changed 3 years ago by itamar

There's no actual code written, just a non-functional skeleton. I suppose I could ask for design review, but I can also do that just by posting a comment here :)

comment:9 follow-up: Changed 3 years ago by itamar

The final API we've ended up with is ... just like locateChild, except much more verbose. I'm not sure it's worth the verbosity, and I'm tempted to abandon it for the locateChild API with some cleanups. Disagreements?

comment:10 in reply to: ↑ 9 Changed 3 years ago by glyph

Replying to itamar:

The final API we've ended up with is ... just like locateChild, except much more verbose. I'm not sure it's worth the verbosity, and I'm tempted to abandon it for the locateChild API with some cleanups. Disagreements?

I certainly disagree pretty strongly.

For one thing, locateChild doesn't have traversal history, which this does. Adding traversal history to locateChild involves a bunch of wacky heuristics and strange edge cases, because locateChild's interface may implicitly change the URI being traversed at any time by adding or removing segments which weren't in the URL in the first place, whereas the interface as described here just lets you traverse the URL as presented by the request. Based on experience with the locateChild API, it's subtle and easy to forget to do the right thing, like accidentally returning a resource instead of a 2-tuple, or reversing the order of the 2-tuple, or returning the same path over and over again (which can then turn into a confusing infinite loop). Finally, this adds a nice, discrete, top-level API for kicking off a traversal, which (while you could of course have that with something locateChild-based) would be an additional thing on top of locateChild.

But, most compelling to me, is to consider the trivial default use-case, which is that you want to look up a resource with one key, and return a resulting resource. Here's how you do it with each API:

def locateChild(self, request, segments):
    return (lookup(segments[0]), segments[1:])
def traverse(self, request, path):
    return path.traverseUsing(lookup(path.segmentName))

They can each be expressed as a 1-liner, but despite the fact that I don't like the names at all in the traverse example, it took me only a second to write it and it was very obvious to me that it was correct; I had to think about the locateChild version for a minute, and it's much more subtle. Now, let's throw a Deferred into the mix:

def locateChild(self, request, segments):
    return lookup(segments[0]).addCallback(lambda rsrc: (rsrc, segments[1:]))
def traverse(self, request, path):
    return lookup(path.segmentName).addCallback(path.traverseUsing)

... or, better yet, make path.traverseUsing support a Deferred as input, and magically, the "synchronous" version of traverse presented above is still correct for both cases! (If you want to fix locateChild in this way, you hopelessly pollute the interface: it's probably bad to have a result which is a 2-tuple of a Deferred and a path or a Deferred which fires with a 2-tuple of a Resource and a path, because what's next? A Deferred firing a 2-tuple of 2 Deferreds? et cetera.)

If your request wants to inspect the current state of the traversal (to generate URIs to sibling resources, perhaps), locateChild still depends on the highly ill-conceived request.prepath and request.postpath attributes, not to mention some other shenanigans to make these actually line up (see _initialprepath if you dare). There's no separate notion of the traversal, just a couple of segments you haven't consumed yet.

(Also, I'm curious, what motivated the weirdness with INDEX? It looks like "" would do that job just fine.)

In conclusion: yes, this API is already a lot better than locateChild, and it can be even better than it is if we can maybe come up with a few better names, and add integration with a real structured URI class. I would be very disappointed if we reverted to the earlier API.

comment:11 follow-up: Changed 3 years ago by itamar

Well, one could have a version of locateChild that requires all returned segment tuples to be sub-pathes of the current path, and then provide a traversal history and API. So that doesn't seem like a strong argument for current API, just a strong argument for a better locateChild implementation.

It's true though that for single segment case and Deferreds the new API is better. And probably I can come up with a way to multi-segment consumption less verbose, and you're right that path.traverseUsing(aDeferredResource) is actually pretty nice.

I want INDEX because, in the HTTP context, that's what "" tends to mean -- "/foo/bar/" is usually rendered by showing index.html or looking up index method on bar resource or something. getChild("") or the equivalent in the new API is just stupid.

As far as URIs - there's a ticket for that, you can work on that in parallel if you'd like :)

comment:12 in reply to: ↑ 11 Changed 3 years ago by glyph

Replying to itamar:

Well, one could have a version of locateChild that requires all returned segment tuples to be sub-pathes of the current path, and then provide a traversal history and API. So that doesn't seem like a strong argument for current API, just a strong argument for a better locateChild implementation.

Right, but that's a whole additional dimension of errors for which there would have to be machinery to check on in locateChild and is just implicit (you can't express the error in the first place) in this new API. It's possible to do all the same stuff with locateChild (and in fact we have); it would just be nice to make it easier to express the right thing and harder to express the wrong thing as long as we're not going to make it 100% compatible anyway.

It's true though that for single segment case and Deferreds the new API is better. And probably I can come up with a way to multi-segment consumption less verbose, and you're right that path.traverseUsing(aDeferredResource) is actually pretty nice.

For the yyyy/mm/dd case, I still kinda like the idea of 'return DateThing(map(int, traversal.consume(3)))', but I haven't figured out how to do anything similar without requiring that the traversal be mutable. (I still don't particularly have a problem with that, as we don't necessarily need to pass the same traversal to every resource.)

Possibly we should write down other multi-segment-consumption cases. Is there any variable-width number of cases. That one seems to be the only one we ever really talk about.

I want INDEX because, in the HTTP context, that's what "" tends to mean -- "/foo/bar/" is usually rendered by showing index.html or looking up index method on bar resource or something. getChild("") or the equivalent in the new API is just stupid.

Okay, so "INDEX = "? The extra code really doesn't seem to do anything, and while '' isn't the clearest possible thing to see when debugging, it's still substantially clearer than <object object at 0x10f7a9090>.

As far as URIs - there's a ticket for that, you can work on that in parallel if you'd like :)

Yeah perhaps :).

I'm quite glad you posted your temptation to abandon this, because you prompted me to write up a bunch of things that I really should have written down in the first place rather than just describing in person.

comment:13 follow-up: Changed 3 years ago by itamar

The code is not done, which is why repr(INDEX) is so lame. Probably I will either make INDEX a singleton of a subclass of unicode, or maybe just equal to u''; regardless I want people to use a symbolic constant instead of == u''.

How about Path.descend(depth), which returns a pair of (consumed segments tuple, remaining Path)? Doesn't allow 1-liners, but still less verbose than current API.

comment:14 in reply to: ↑ 13 Changed 3 years ago by glyph

Replying to itamar:

The code is not done, which is why repr(INDEX) is so lame. Probably I will either make INDEX a singleton of a subclass of unicode, or maybe just equal to u''; regardless I want people to use a symbolic constant instead of == u''.

If it's going to be handled exactly like u"" absolutely everywhere, then why not just document it as a symbolic constant (I do like that idea, don't get me wrong), but not special-case it any further than that?

Maybe I'm just weird, but 'if not segment' makes total sense to me for index detection; despite being tripped up numerous times by nuances of locateChild, that one has just never been a problem for me.

How about Path.descend(depth), which returns a pair of (consumed segments tuple, remaining Path)? Doesn't allow 1-liners, but still less verbose than current API.

That's a good start.

Certainly 1-liners aren't a hard requirement, I just think they're indicative of unnecessary additional complexity. (But I can also see the benefits of avoiding mutable state in the request, so I don't want to push too hard in that direction. Certainly I'd rather have a little more verbosity in this go-round of the API than another pile of shared mutable state working its way through the request and making its behavior non-deterministic and hard to replicate for testing.)

comment:15 Changed 2 years ago by zooko

  • Cc zooko@… added

I'm interested in this ticket because itamar said after he does this one, he'll work on #288 which is a prerequisite for https://tahoe-lafs.org/trac/tahoe-lafs/ticket/320 . Is the state of this ticket that the ball is in itamar's court?

comment:16 Changed 2 years ago by glyph

As always, the ball is in the community's court :). Specifically on this ticket though, itamar's code is checked into the branch, and the comments on this ticket pretty well reflect my current thinking on the issue. If you wanted to take the ball a little further down the field I'm sure that nobody would object.

comment:17 Changed 2 years ago by glyph

Sorry, got too metaphorical there: please feel free to do some commits on the branch to address some of the feedback here. I'm sure Itamar will appreciate the help when he next has some time to look at it. Also, feel free to add your own perspective to the potential API, and let me know if my previous comments make sense to you.

comment:18 Changed 2 years ago by EDevil

  • Cc andre.cruz@… added

comment:19 Changed 18 months ago by acapnotic

  • Cc keturn@… added
  • Component changed from core to web

Nudging this along a little bit at https://github.com/keturn/twisted/tree/new-resource-5379-2/twisted/web

You mentioned the need for traversal history. Does that mean that the child returned by Path.descend() should have a pointer of some kind to its parent?

comment:20 Changed 18 months ago by acapnotic

well, I implemented tests that were skeletons, and made tests pass.

There's still _newresource.traverse(request, path, rootResource) with no implementation, and while there is a Resource.traverse implementation with some non-empty test that passes, but I expect it ought to be doing more.

comment:21 Changed 18 months ago by acapnotic

  • Keywords review added

Okay, I implemented newresource.traverse:

https://github.com/keturn/twisted/compare/trunk...new-resource-5379-2

Some notes:

  • The use of the verb "traverse" in both Resource.traverse(path) and Path.traverseUsing(resource) was hard for me to wrap my head around.
  • Should Path.traverseUsing return a Deferred(path, resource) instead of a (path, Deferred(resource))?
  • Is newresource.traverse() ever called with a Path(()), or does a request always have at least Path((INDEX,))?

comment:22 follow-up: Changed 18 months ago by itamar

I've come up with a potential different API, and dreid wants me to look at webmachine, so I suggest pausing at this point on this ticket; I'll add some notes soon.

comment:23 in reply to: ↑ 22 Changed 18 months ago by glyph

Replying to itamar:

I've come up with a potential different API, and dreid wants me to look at webmachine, so I suggest pausing at this point on this ticket; I'll add some notes soon.

You are welcome to add notes, but I object to the notion of "pausing" on this ticket. Given that this is blocking a ticket filed almost 10 years ago, if somebody wants to review Kevin's contribution here and finds it acceptable, then we should move forward. You have had a decade to propose a potential different API :).

comment:24 Changed 17 months ago by jonathanj

  • Cc jonathanj added

comment:25 Changed 17 months ago by glyph

Let the record reflect that it has now been 2 weeks since itamar said he would add some notes "soon".

It is always a bad idea to pause work on any ticket, if someone has time to work on it :).

comment:26 Changed 17 months ago by itamar

I guess I agree that we should just finish the thing, perfect or not.

comment:27 Changed 17 months ago by glyph

Nobody's done any commits in the last two weeks either, despite my exhortation. So the other lesson is that you don't need to tell people to pause work, you can just attach comments at your leisure, since work has a tendency to pause on its own ;).

comment:28 Changed 16 months ago by chris-

  • Cc christian@… added

comment:29 Changed 16 months ago by EDevil

  • Cc andre.cruz@… removed

comment:30 Changed 16 months ago by glyph

I really want to review this branch, but I can't, because git is a pile of crap.

Kevin decided to make some commits to github, so I am trying to import those commits into SVN so I can force builds and such.

However, after following the instructions, git svn fetch (eventually) fails like this:

Filesystem has no item: Working copy path 'branches/glyph' does not exist in repository at /usr/libexec/git-core/git-svn line 5670

So I don't have the required revisions for the history of Kevin's branch, so I can't dcommit it.

Squashing the whole thing produces a bunch of conflicts, because only half the branch's commits are in svn.

I will keep you all posted of any progress.

comment:31 Changed 16 months ago by glyph

  • Author changed from itamarst to itamarst, glyph
  • Branch changed from branches/new-resource-5379 to branches/new-resource-5379-2

(In [38348]) Branching to 'new-resource-5379-2'

comment:32 Changed 16 months ago by glyph

  • Author changed from itamarst, glyph to itamarst

Removing myself as author since I just cherry-picked those commits into a new branch.

comment:33 Changed 16 months ago by glyph

Thanks for pushing this forward.

Got a few initial review comments:

  1. Please don't ever make me interact with git-svn again. Trying to get these commits into svn was miserable. If you're a committer, please make getting the commits into an SVN branch your own responsibility. If you don't have commit it's OK to leave your code on github - maybe even slightly nicer than submitting a patch - but if you do then please figure out a workflow that leaves some build results available.
  2. Speaking of which, here are some build results.
  3. Within which, there are some pyflakes errors.
  4. As well as some new epytext reference errors (although I'm not entirely sure those are this branch's fault?)
  5. And also some new twistedchecker errors.

Hopefully some more substantive ones in a moment.

comment:34 Changed 16 months ago by glyph

  • Keywords review removed
  • Owner changed from itamar to acapnotic
  1. About half of the test docstrings are in the wrong style, starting the text on the same line as the """.
  2. The "XXX" question posed by test_traverseEmptyPath ought to be deleted. As to answering that question: the path to / definitely does result in a path containing INDEX, but that doesn't make it an error to call traverse with Path.leaf(). One of the main goals here is to have an API that can be used as an API, rather than just happening to work when invoked from HTTP parsing internals.
  3. Path.segments ought to be a list, not a tuple. I realize that immutability is alluring, but tuples are really for heterogenous data, and lists are for homogenous data.
  4. The new IResource.traverse should be a lot clearer in its purpose. What "remaining" path segments? What sort of object?
  5. Remember that thing I said about tuples being for heterogenous data? Well, forget that: tuples are just a crappy data structure. I was going to say: "I think that everywhere that (path, resource) appears, there should be a specific type; some kind of namedtuple, perhaps, that represents that sort of value." But... it looks like you've already done that, with _TraversalStep, it just has a crappy repr and a private constructor. Make this part of the public API, and make it a named tuple so you can access attributes rather than indexes.
  6. inlineCallbacks makes things unnecessarily opaque. For example: how can we debug what step of a traversal is "stuck"? It's hard to see inside the generator without adding print statements to it. i think it would be nicer to have an object that does the traversal in discrete, inspectable steps, so we can add methods to it for interrogating its state.
  7. I realize that this is a draft of a new API, but it deserves a draft of some new narrative documentation. Maybe just add some that isn't linked from the index, with a "note" up at the top warning that the API's not public yet?
  8. fromString is a bad method name. fromBytes, fromText. Also, it needs a @return, @rtype, @param...
  9. Consider some type-checking on Path.descend (and, thereby, Path.child). It would be nice to get early encoding errors rather than waiting until something happened with the path.
  10. implements(IBodyProducer) needs to change to be implements(IProducer). On that note, since this is totally new code, please add it to the python 3 tests and make sure it passes there.
  11. setBody should refer to the type it takes as bytes. Except, it shouldn't take bytes. If you want to have a convenience method, make it a convenience method; the type of body really ought to be consistent. Also, why bother having a set method, when you could just make it a property? In fact, I don't think you ought to be able to set the body after the fact: "or None", especially on an attribute, is a code smell.
  12. As to the question about Deferred(path, resource) vs. (path, Deferred(resource)); having Deferreds as attributes of things is kind of bad. You always want to have a method that vends a new Deferred to anyone who asks. I'd be happier to go with a Deferred firing some value. (Is it even really possible to know the right path value to return synchronously?)
  13. traverseUsing's @param ought to be a @type (with a @param that describes the _role_ of the value, not its type); it also needs an @return and an @rtype.
  14. I think you're right that traverseUsing is a bit weird, name wise. Hrm. If a resource is the thing which traverses a path, and in so doing, produces a new resource... then a path is a thing which is traversed. How would you feel about changing traverseUsing to asTraversedBy?
  15. Speaking of names: trees have leaves; paths have ends, or beginnings, or something.

Again, thanks a ton for keeping this going. I will endeavor to be more responsive to updates to this ticket; if you want to have some discussions online, feel free to ping me and we can discuss it some more.

Regarding point 19... I also want an API that calls descend or child as well as traverseUsing (or whatever we decide to call it) that looks something like this:

def traverse(self, request, path):
    return path.descent(3, lambda year, month, day: DateResource(year, month, day))
def traverse(self, request, path):
    return path.withNextSegment(lambda segmentName: ChildResource(segmentName))

as you may recognize that that could be simplified to:

def traverse(self, request, path):
    return path.descent(3, DateResource)
def traverse(self, request, path):
    return path.withNextSegment(ChildResource)

Maybe those should be functions instead of methods; maybe that's a job for a separate ticket. Anyway, keep up the good work!

comment:35 Changed 14 months ago by nejucomo

  • Cc nejucomo@… added
Note: See TracTickets for help on using tickets.