Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#2235 defect closed duplicate (duplicate)

WebDAV copy fails if the name of the destination folder contains whitespace

Reported by: vincentk Owned by:
Priority: normal Milestone:
Component: web2 Keywords:
Cc: wsanchez, exarkun Branch:
Author: Launchpad Bug:

Description

DESCRIPTION:

If the destination uri contains whitespace,
d = request.locateResource(destination_uri)
in twisted.web2.dav.method.copymove.prepareForCopy
returns None, in which case
twisted.web2.dav.method.copymove.prepareForCopy._prepareForCopy
raises an HTTPError(StatusResponse(responsecode.CONFLICT, "No parent collection.").

ANALYSIS:
Request.locateResource() in twisted/web2/server.py accepts a URI
and returns the corresponding resource if one exists. However, it does not unquote the path component of the URI, and correspondingly fails if the path contains quoted characters.

FIX:
replace
segments = path.split("/")
with
segments = unquote(path).split("/")

Attachments (1)

server_patch.txt (957 bytes) - added by vincentk 8 years ago.
patch consisting of the above unquote, as well as the removal of a nearby unused inner function

Download all attachments as: .zip

Change History (10)

Changed 8 years ago by vincentk

patch consisting of the above unquote, as well as the removal of a nearby unused inner function

comment:1 Changed 8 years ago by exarkun

  • Cc wsanchez exarkun added
  • Keywords review removed

I think unquoting like this is incorrect. For example:

>>> urllib.unquote('foo%2fbar/baz')
'foo/bar/baz'

I'm not very familiar with this code, but I wonder what the point of all the URL parsing code in _parseURL is if locateResource just goes and misparses an URL on its own.

In any case, this looks like it needs a ton more test coverage before any changes are made.

comment:2 Changed 8 years ago by vincentk

  • Priority changed from highest to normal

well, it seems like _parseURL operates on members of the enclosing Request object, and is therefore perhaps only marginally suitable to simply parse the destination url. maybe it would be useful to provide a function with the functionality of _parseURL, but the signature of urlsplit, i.e. one that consumes a url and produces (scheme, host, path, ...)?

anyhow, given that changes are likely not imminent, "highest" priority is probably overrated.

thanks for your feedback.

comment:3 Changed 8 years ago by wsanchez

Just FYI, #2200 fixes the quoting issue in locateResource(), but I agree that a url parsing routing that it's bound to the request and the request's URI may be useful. What we really need is a URL class that is akin to FilePath.

comment:4 Changed 8 years ago by wsanchez

And, by the way, the test cases is #2200 do try URLs with %20 in the names.

comment:5 Changed 8 years ago by vincentk

#2200 indeed solves my problem. thanks! as such, i guess this ticket can be safely resolved as "duplicate" (?)

i do not think i understand your reference to FilePath, though. it is my understanding that the main objective of FilePath is the safe access of files. on the other hand, URI's map to resources in an in principle unspecified way, so i fail to see what additional functionality one might desire for a URI in this context other than a safe mapping to a FilePath instance.

this to me seems readily accomplished by providing just two (stateless) procedures: one that maps a URI to path segments, and possibly other stuff such as the scheme, etc. . This could likely be a synthesis of what is currently done in _parseURL() and the locateResource() method you provided. A second function could map these path segments to a FilePath instance. this could basically correspond to a slimmed down version of locateResource, accepting a FilePath instance as a starting point (e.g. the site root) and an iterable of path segments and returning the corresponding FilePath instance.

a third (shorthand) might wrap these two up and therefore provide a mapping of URI -> FilePath.

comments?

comment:6 Changed 8 years ago by exarkun

FilePath, or a FilePath-like object tailored for use with URLs (eg, URLPath), is what the procedure mapping an URI to segments should return. Strings are bad data structures, and lists of strings are only marginally better in this case. You really want a structured object with methods to perform common operations so that you don't make mistakes as you reimplement those operations over and over again.

At least, I think that's what wsanchez meant when he mentioned FilePath. :)

comment:7 Changed 8 years ago by wsanchez

  • Resolution set to duplicate
  • Status changed from new to closed

Yeah, that's what I meant. Strings bite us over and over again. That's probably worth it's own ticket. Marking as dup.

comment:8 Changed 8 years ago by vincentk

thank you for this clarification. the motivation of why such a class might be interesting is now clear to me.

on the other hand, i am still quite unclear about what kinds of functionality such a class might implement, other than producing a FilePath object corresponding to the URI path (segments). perhaps a subclass of FilePath that associates a URI with a FilePath object and provides the appropriate operations for parent()'s? this, i seem to remember, is pretty much how zope does it.

i'm just worried that if such an object were to become too heavy-weight to be instantaneously understandable, but still closely resemble FilePath, the situation could become quite confusing. i already find myself spending more time differentiating between URI's, Resource instances, FilePath objects and (finally) the underlying actual files, than i think i should in an ideal world. though i tend to always spend more time on anything, than i think i should ;-)

i agree that strings can be a pain. but so can complex object hierarchies. though perhaps that's just a matter of taste (or inexperience on my part).

hmm. thinking of which, a URI is not a map to a file path, but rather a map to a resource --> a URIPath should definitely return a resource. or a collection thereof, including all parent resources. again, i think this is more or less how zope does it.
a zope object is presented with an environment, which contains a list named URL, which contains all path segments (as strings), and a list named PARENT, which contains all parent resources.

if you think this is worth a ticket, i would be happy to work on it.

cheers,

v.

comment:9 Changed 4 years ago by <automation>

  • Owner jknight deleted
Note: See TracTickets for help on using tickets.