Opened 4 years ago

Last modified 4 years ago

#4703 enhancement new

twisted.web.client.Agent.request 'method' argument should default to 'GET', somehow

Reported by: glyph Owned by:
Priority: normal Milestone:
Component: web Keywords:
Cc: itamar@… Branch:
Author: Launchpad Bug:

Description

getPage(url) implies the GET method. Agent.request should do the same. It is tedious and error-prone to have to type out 'GET' over and over again if you want to fetch a bunch of URLs, especially in examples.

(Generally speaking, Agent should strive to be as convenient as possible when compared with getPage, and should not give developers any reasons to cleave to the old and crappy way of doing web client stuff.)

Change History (8)

comment:1 Changed 4 years ago by glyph

This may be mistitled; it would be equally good (better, maybe) if we just had a get() convenience method, rather than mucking with the signature of request().

comment:2 follow-up: Changed 4 years ago by itamar

  • Cc itamar@… added

The current API is supposed to be low level and general. We should be building higher level wrappers, and that is probably where such conveniences should reside.

comment:3 in reply to: ↑ 2 Changed 4 years ago by glyph

Replying to itamar:

The current API is supposed to be low level and general.

"low level" does not necessarily imply "bad and inconvenient defaults". For example, why do we allow the user not to pass bodyProducer? It only makes sense to have a default (empty) body in the case of GET, HEAD, and maybe DELETE methods. If you're doing a PUT or a POST, a body is implicitly required. The reason I suggested get() is that we probably shouldn't change request() for compatibility reasons, not that there's a whole discrete layer.

We should be building higher level wrappers, and that is probably where such conveniences should reside.

A higher-level wrapper might be the thing that transformed the Response object into a str for you, or something like that. I'm talking really talking about a convenience in the same level of the API, just correcting for the fact that different methods have different reasonable defaults. Although such a wrapper would be OK, I suppose, if it could somehow implicitly create the Agent object for you. (What I'd like to try to avoid here is needing to import N different names from N different modules and then assemble your own 'actually pleasant to use' API object out of various bits of detritus; it's hard to document that, and it's considerably more tedious than from twisted.web.client import getPage; getPage().)

comment:4 follow-up: Changed 4 years ago by exarkun

A motivation to keep the Agent interface small is that there should eventually be quite a few implementations of it (I have a couple outside of Twisted already).

Since, as far as I can tell, a set of get(), post(), etc methods can all be implemented purely in terms of request(), it would be nice to implement them once and have them be re-usable with any implementation of the Agent interface. Of course, they could be a mixin, but subclassing is bad.

The API that will replace getPage should probably be either:

from twisted.web.client import theClient
theClient.get(...)

Or

from twisted.internet import reactor
from twisted.web.client import niceClientFactory
aClient = niceClientFactory(reactor)
aClient.get(...)

Clearly the former is less typing, but also implies some global mutable state (that client is going track cookies, keep a connection pool, etc). The latter is really implied by the former, but makes the application create the instance instead of providing a global instance for everyone to share.

niceClientFactory should assemble whatever combination of objects people will typically want (cookies, connection pools, authentication, proxying based on de facto standard environment variables, etc) and can also wrap the result with something providing get() et al.

comment:5 Changed 4 years ago by exarkun

sorry about the formatting fail

comment:6 in reply to: ↑ 4 Changed 4 years ago by glyph

Replying to exarkun:

A motivation to keep the Agent interface small is that there should eventually be quite a few implementations of it (I have a couple outside of Twisted already).

OK, I can see that. Maybe new methods aren't the way to go. I was just thinking those would be easier for compatibility, but we don't have a good compatibility story for interfaces that 3rd-party code is supposed to implement. (I will note, however, that Agent isn't actually an interface! Should it be one?) Here's a random smattering of issues with the current interface which are sort of related to this one:

  • The URL should be the first parameter, not the method. It's really the main event, and this would also let the method default to 'GET' normally.
  • The URL parameter should be a structured object, not a string. If it's a string, it should be parsed or adapted to the structured object for internal use. Right now every layer has to parse it and then put it back together as a string if they want to delegate it.
  • The 'method' parameter should be an object. Perhaps something a-la #4671. Right now it's still possible to typo 'HeAD' and send out some potentially corrupt data on the wire, in a sort of mutant mirror of #446.
  • With the default parameters and everybody-has-to-parse-the-URI-themselves structure that we already have, it's kind of hard to implement delegation correctly. (This is excessively nitpicky, but the current API documentation doesn't call out None as the default value: it just makes reference to 'if not passed...'.) The implementation even says, in a comment, "This is a lot of copying". But I can see that HTTP11ClientProtocol has a 1-argument version which just takes a (private) Request object. Passing that along and mutating it at each stage, to add headers or what-have-you (or wrapping it in something interface-compatible to accomplish the same) would be a much easier approach to delegation. Would making that public be a good idea? (Since it isn't public yet, we could re-order its constructor to be in the order that I like, or add a 'fromURI' method.)

If you think any of these should be fixed, how should we evolve this interface without disturbing third parties who might be starting to use it? Should we target all of this for a different layer?

Since, as far as I can tell, a set of get(), post(), etc methods can all be implemented purely in terms of request(), it would be nice to implement them once and have them be re-usable with any implementation of the Agent interface. Of course, they could be a mixin, but subclassing is bad.

True. We should invent something that's like subclassing, but good.

The API that will replace getPage should probably be either: (...) Or: (...)

Clearly the former is less typing, but also implies some global mutable state (that client is going track cookies, keep a connection pool, etc). The latter is really implied by the former, but makes the application create the instance instead of providing a global instance for everyone to share.

Global mutable state sucks or whatever, but if we advise libraries to each create their own new niceClientFactory, then each library will have its own view of cookies, authentication, etc, and it strikes me that a major point of point of persisting that information would be to be able to conveniently share it. The phrase 'dependency injection' is floating around the back of my mind like some desperate phantasm, screaming to get out, now. But, all the other solutions I know of are just like global variables but way more complex: ZCML, anyone?

niceClientFactory should assemble whatever combination of objects people will typically want (cookies, connection pools, authentication, proxying based on de facto standard environment variables, etc) and can also wrap the result with something providing get() et al.

+1. (Except then we should probably have some kind of easy-to-access test fixture, too.)

Sorry for the somewhat meandering turn that this ticket discussion has taken, but I think it's good to have this written down somewhere.

comment:7 Changed 4 years ago by <automation>

  • Owner exarkun deleted

comment:8 Changed 4 years ago by lewq

Might be somewhat helpful to have this simple and mostly unfinished code as a starting point for the post() implementation: https://gist.github.com/846545

Note: See TracTickets for help on using tickets.