Opened 11 years ago

Closed 11 years ago

#1775 enhancement closed fixed (fixed)

Attach URI to a resource during lookup

Reported by: Wilfredo Sánchez Vega Owned by:
Priority: high Milestone:
Component: web2 Keywords:
Cc: Wilfredo Sánchez Vega, David Reid, Jean-Paul Calderone Branch:


locateChild() should attach an attribute to the resource object it returns which contains the URI at which the resource was found.

Currently, it is quite difficult to know what the URI of a resource is.

Change History (16)

comment:1 Changed 11 years ago by Glyph

I disagree. We avoided such an API in web1 on purpose. Perhaps you can attach an attribute to the request, but there's already enough information there to compute it (see nevow.url).

Here are some unresolved questions from that attempt:

  • What if the same resource is present at multiple URLs? Then you have to add to a list.
  • What if it is present at a URL temporarily? Then overwriting is correct.
  • What if it is present at an arbitrary number of computed URLs? Then adding to a list is correct, but a possible severe efficiency problem.
  • What about security? Some URLs are intended to be secret for security purposes and having them attach to a resource object is not a good idea.
  • How do you properly generate an e-mail notification for a user to view that URL? How will it know which URL to choose if it's present at multiple URLs? If the resource hasn't been viewed yet, it will be wrong, but then later it will magically be right when someone hits that page in that server run.

I believe when we abandoned trying to find a solution for web1/nevow we were converging this: only statically-generated resource trees would have URLs associated with them, that the API would be optional, and that it would happen at putChild time rather than at getChild time. There was another idea that it would attempt to adapt each resource to IURLAware as the resource hierarchy traversal went along, but I think that has many of the same problems I just mentioned.

comment:2 Changed 11 years ago by David Reid

I think an important point that is missing from the description of this ticket is that the functionality trying to be acheived is not to find the uri of the current resource, but to find the uri of a resource that could exist at any point in the tree.

The actual use case is for WebDAV ACL Principals (A resource at a uri that represents a user and contains certain properties necessary for authentication and authorization) Unfortunately we need a method that allows you to traverse _all_ (including non-dav) resources on the way to locating the principal resource and figuring out it's URI.

comment:3 Changed 11 years ago by Wilfredo Sánchez Vega


  • If a resource is at multiple URLs, then the URL you get back will be the one it was last looked up as. For my purposes, that's good enough; the URL you get will be a correct one, even if it might not be the only one, and even if it's different later.
  • If a resource is at a URL temporarily, then that's maybe an argument to hang this information on the request instead of the resource, assuming that the resource will live beyond the request. I'm not sure what use case you have in mind... On the other hand, if it's later at another URL, then it should be found at that URL next time it is looked up, so I don't know if that's a problem in any case
  • If there are an arbitrary number of computed URLs, only those by which is it looked up by would be registered, and in fact, I was only planning to register the last one, not a list.
  • I'm having a hard time buying the security concern. You're saying the server is keeping secrets from itself? The presence of an attribute containing the URI doesn't necessarily mean that the URI will be exposed to the client.
  • Looking for the resource's URL is only valid in the context of a request. I am not proposing that we provide access to every URL that a resource might be at; simply a valid URL for the resource on the server.

comment:4 Changed 11 years ago by Wilfredo Sánchez Vega

It seems like some of these multiple URI problems can be alleviated by sticking this information in the request instead of on the resource, which is potentially long-lived. We might, for example, use a weak ref dictionary to relate resources and URIs in the context of the current request.

comment:5 Changed 11 years ago by Wilfredo Sánchez Vega

Created branch remember-uris-1775 for this issue.

comment:6 Changed 11 years ago by Wilfredo Sánchez Vega

Priority: normalhigh

comment:7 Changed 11 years ago by Jean-Paul Calderone

Hierarchy traversal is strictly ordered and unbranching. How about storing this information in a simple list? eg, request.mumbleParentsMumble might look like:

    [('', rootResource),
     ('foo', resourceAtSlashFoo),
     ('bar', resourceAtSlashFooSlashBar)]

and be built up as children are traversed.

comment:8 Changed 11 years ago by Wilfredo Sánchez Vega

So my purposes, the lookup would he harder here. I want, given a resource, it's URI. I'd have to search this whole tree to find the resource I want.

comment:9 Changed 11 years ago by Wilfredo Sánchez Vega

I've committed an implementation on the branch using a WeakKeyDictionary. request.urlForResource(resource) should be able to give me a valid URI if the resource was found by locating it via the request.

comment:10 Changed 11 years ago by Wilfredo Sánchez Vega

Keywords: review added

Added test cases: twisted.web2.test.test_server.RememberURIs

comment:11 Changed 11 years ago by Wilfredo Sánchez Vega

Ready for review on branch remember-uris-1775

comment:12 Changed 11 years ago by Wilfredo Sánchez Vega

Owner: changed from Wilfredo Sánchez Vega to David Reid

comment:13 Changed 11 years ago by David Reid

Keywords: review removed
Owner: changed from David Reid to Wilfredo Sánchez Vega

twisted.web2.test.test_server.RememberURIs.test_requestedResource You should use the mechanisms already available for testing through the chanRequest (i.e. self.getResourceFor) and then just use a reference to the resource to verify that you can find it from the request in the body of your test case. - Blocker

+        # Request hostname and destination hostname have to be the same.

Good job keeping locateResource from traversing a NameVirtualHost, that would have been bad. I'd update the comment to explain why this is necessary it was not immediately apparent to me why it was doing it, and then we can make sure it doesn't mysteriously disappear and cause a gaping security hole. Also a unittest that exercises this would be a good idea. - Blocker

+        segments = path.split("/")
+        if segments[0] != "":
+            raise AssertionError("URL path didn't begin with '/': %s" % (path,))

Is that even possible? - WTF?

And last, a probably premature optimization, if we're requesting a very deep resource (or just have a lot of '/' in the path as mentioned in #1570) would it be beneficial to attempt to use self._resourcesByURL as a cache for locateResource? - Idea

comment:14 Changed 11 years ago by Wilfredo Sánchez Vega

  • Fixed the test to use getResourceFor()
  • Clarified the request host matching and the segments thing. These are both assertions to make sure the code is sane. Neither should ever happen. The code is a little clearer in its intent now, I think.
  • You'd want to invert the keys and values in self._resourcesByURL dictionary to use if as a location cache. I was deliberately avoiding caching objects that aren't specifically looked up just to keep the dictionary smaller. Also, we are using weak references in order to avoid keeping objects in memory unless someone else is holding on to them. (If you don't have a reference to it, you can't ask the request for it's URI anyway.) So a location cache would also want to use a regular dictionary with strong references. I don't know if whether want to keep those objects in memory. At the very least, that seems like a different issue, though I agree a similar scheme would work.

comment:15 Changed 11 years ago by Wilfredo Sánchez Vega

Resolution: fixed
Status: newclosed

comment:16 Changed 6 years ago by <automation>

Owner: Wilfredo Sánchez Vega deleted
Note: See TracTickets for help on using tickets.