Opened 8 years ago

Closed 8 years ago

#1775 enhancement closed fixed (fixed)

Attach URI to a resource during lookup

Reported by: wsanchez Owned by:
Priority: high Milestone:
Component: web2 Keywords:
Cc: wsanchez, dreid, exarkun Branch:
Author: Launchpad Bug:

Description

locateChild() should attach an attribute to the resource object it returns which contains the URI at which the resource was found.

Currently, it is quite difficult to know what the URI of a resource is.

Change History (16)

comment:1 Changed 8 years ago by glyph

I disagree. We avoided such an API in web1 on purpose. Perhaps you can attach an attribute to the request, but there's already enough information there to compute it (see nevow.url).

Here are some unresolved questions from that attempt:

  • What if the same resource is present at multiple URLs? Then you have to add to a list.
  • What if it is present at a URL temporarily? Then overwriting is correct.
  • What if it is present at an arbitrary number of computed URLs? Then adding to a list is correct, but a possible severe efficiency problem.
  • What about security? Some URLs are intended to be secret for security purposes and having them attach to a resource object is not a good idea.
  • How do you properly generate an e-mail notification for a user to view that URL? How will it know which URL to choose if it's present at multiple URLs? If the resource hasn't been viewed yet, it will be wrong, but then later it will magically be right when someone hits that page in that server run.

I believe when we abandoned trying to find a solution for web1/nevow we were converging this: only statically-generated resource trees would have URLs associated with them, that the API would be optional, and that it would happen at putChild time rather than at getChild time. There was another idea that it would attempt to adapt each resource to IURLAware as the resource hierarchy traversal went along, but I think that has many of the same problems I just mentioned.

comment:2 Changed 8 years ago by dreid

I think an important point that is missing from the description of this ticket is that the functionality trying to be acheived is not to find the uri of the current resource, but to find the uri of a resource that could exist at any point in the tree.

The actual use case is for WebDAV ACL Principals (A resource at a uri that represents a user and contains certain properties necessary for authentication and authorization) Unfortunately we need a method that allows you to traverse _all_ (including non-dav) resources on the way to locating the principal resource and figuring out it's URI.

comment:3 Changed 8 years ago by wsanchez

glyph:

  • If a resource is at multiple URLs, then the URL you get back will be the one it was last looked up as. For my purposes, that's good enough; the URL you get will be a correct one, even if it might not be the only one, and even if it's different later.
  • If a resource is at a URL temporarily, then that's maybe an argument to hang this information on the request instead of the resource, assuming that the resource will live beyond the request. I'm not sure what use case you have in mind... On the other hand, if it's later at another URL, then it should be found at that URL next time it is looked up, so I don't know if that's a problem in any case
  • If there are an arbitrary number of computed URLs, only those by which is it looked up by would be registered, and in fact, I was only planning to register the last one, not a list.
  • I'm having a hard time buying the security concern. You're saying the server is keeping secrets from itself? The presence of an attribute containing the URI doesn't necessarily mean that the URI will be exposed to the client.
  • Looking for the resource's URL is only valid in the context of a request. I am not proposing that we provide access to every URL that a resource might be at; simply a valid URL for the resource on the server.

comment:4 Changed 8 years ago by wsanchez

It seems like some of these multiple URI problems can be alleviated by sticking this information in the request instead of on the resource, which is potentially long-lived. We might, for example, use a weak ref dictionary to relate resources and URIs in the context of the current request.

comment:5 Changed 8 years ago by wsanchez

Created branch remember-uris-1775 for this issue.

comment:6 Changed 8 years ago by wsanchez

  • Priority changed from normal to high

comment:7 Changed 8 years ago by exarkun

Hierarchy traversal is strictly ordered and unbranching. How about storing this information in a simple list? eg, request.mumbleParentsMumble might look like:

    [('', rootResource),
     ('foo', resourceAtSlashFoo),
     ('bar', resourceAtSlashFooSlashBar)]

and be built up as children are traversed.

comment:8 Changed 8 years ago by wsanchez

So my purposes, the lookup would he harder here. I want, given a resource, it's URI. I'd have to search this whole tree to find the resource I want.

comment:9 Changed 8 years ago by wsanchez

I've committed an implementation on the branch using a WeakKeyDictionary.
request.urlForResource(resource) should be able to give me a valid URI if the resource was found by locating it via the request.

comment:10 Changed 8 years ago by wsanchez

  • Keywords review added

Added test cases: twisted.web2.test.test_server.RememberURIs

comment:11 Changed 8 years ago by wsanchez

Ready for review on branch remember-uris-1775

comment:12 Changed 8 years ago by wsanchez

  • Owner changed from wsanchez to dreid

comment:13 Changed 8 years ago by dreid

  • Keywords review removed
  • Owner changed from dreid to wsanchez

twisted.web2.test.test_server.RememberURIs.test_requestedResource You should use the mechanisms already available for testing through the chanRequest (i.e. self.getResourceFor) and then just use a reference to the resource to verify that you can find it from the request in the body of your test case. - Blocker

+        # Request hostname and destination hostname have to be the same.

Good job keeping locateResource from traversing a NameVirtualHost, that would have been bad. I'd update the comment to explain why this is necessary it was not immediately apparent to me why it was doing it, and then we can make sure it doesn't mysteriously disappear and cause a gaping security hole. Also a unittest that exercises this would be a good idea. - Blocker

+        segments = path.split("/")
+        if segments[0] != "":
+            raise AssertionError("URL path didn't begin with '/': %s" % (path,))

Is that even possible? - WTF?

And last, a probably premature optimization, if we're requesting a very deep resource (or just have a lot of '/' in the path as mentioned in #1570) would it be beneficial to attempt to use self._resourcesByURL as a cache for locateResource? - Idea

comment:14 Changed 8 years ago by wsanchez

  • Fixed the test to use getResourceFor()
  • Clarified the request host matching and the segments thing. These are both assertions to make sure the code is sane. Neither should ever happen. The code is a little clearer in its intent now, I think.
  • You'd want to invert the keys and values in self._resourcesByURL dictionary to use if as a location cache. I was deliberately avoiding caching objects that aren't specifically looked up just to keep the dictionary smaller. Also, we are using weak references in order to avoid keeping objects in memory unless someone else is holding on to them. (If you don't have a reference to it, you can't ask the request for it's URI anyway.) So a location cache would also want to use a regular dictionary with strong references. I don't know if whether want to keep those objects in memory. At the very least, that seems like a different issue, though I agree a similar scheme would work.


comment:15 Changed 8 years ago by wsanchez

  • Resolution set to fixed
  • Status changed from new to closed

comment:16 Changed 4 years ago by <automation>

  • Owner wsanchez deleted
Note: See TracTickets for help on using tickets.