Opened 3 years ago

Last modified 16 months ago

#5405 enhancement new

Update twisted.web.client.getPage to use Agent and friends.

Reported by: amberite Owned by:
Priority: low Milestone:
Component: web Keywords:
Cc: jknight, ldanielburr@…, dan.bornside@… Branch:
Author: Launchpad Bug:

Description

Now that twisted.web.client.Agent supports redirection and cookies, porting the getPage functionality to run atop Agent would seem to be a win in terms of reducing the amount of code to be maintained.

Attachments (3)

agentGetPageSketch.py (2.8 KB) - added by amberite 3 years ago.
A crappy sketch of how to implement getPage in terms of Agent
agentGetPageSketch.2.py (2.4 KB) - added by amberite 3 years ago.
Updated sketch using FileBodyProducer
compatAgentGetPage_v1.patch (3.9 KB) - added by dbornside 16 months ago.
get page implementation based on Agent

Download all attachments as: .zip

Change History (14)

comment:1 Changed 3 years ago by DefaultCC Plugin

  • Cc jknight added

Changed 3 years ago by amberite

A crappy sketch of how to implement getPage in terms of Agent

comment:2 Changed 3 years ago by exarkun

We have FileBodyProducer now. FileBodyProducer(StringIO(formdata)) can replace RequestBodyProducer(formdata).

Changed 3 years ago by amberite

Updated sketch using FileBodyProducer

comment:3 Changed 3 years ago by amberite

Still crappy, but a little closer.

comment:4 Changed 3 years ago by amberite

  • Cc ldanielburr@… added

comment:5 Changed 3 years ago by exarkun

Awesome, amberite. :) Thanks for pushing this on. Have you tried dropping this getPage into twisted.web.client and discovering what the unit tests think of it?

comment:6 Changed 3 years ago by amberite

I've progressed this to the point where it passes the tests up until WebClientSSLTestCase, where it starts to throw some errors, and then hangs outright on test_getPageBrokenDownload.

I'll attach another stab at this once I've got some more tests passing, or if I can't figure out why the tests fail.

comment:7 Changed 3 years ago by cyli

I'm not sure if this is would comply with the twisted coding standards, but it'd be nice if the new getPage could take an IBodyProducer as well as just postdata.

I know that the current docstring says to look at the arguments from HTTPClientFactory, and I guess compatibility needs to be maintained, so a postdata argument needs to be there, but it'd be convenient if, for instance, I wanted to upload a file for me to just provide the FileBodyProducer for that file, rather than read the whole thing into memory and then pass it as postdata, which then creates a file like object to pass to FileBodyProducer anyway.

Sorry to be a pain. :) Just seems like it might be a convenience.

comment:8 Changed 3 years ago by exarkun

The getPage API has a lot of problems. Agent is the future. We should definitely have an API that makes it easy to post forms and easy to upload files, though - but I'd lean away from pushing that into getPage and think more about something new.

comment:9 Changed 3 years ago by amberite

Yeah, I believe there was some talk on IRC a while back about designing a better API atop Agent, and doing the minimum necessary to move getPage/downloadPage atop Agent. That said, if there is some powerful desire to add backwards-compatible improvements to getPage/downloadPage, I'm willing to work on them after I actually get the basics working.

comment:10 Changed 3 years ago by glyph

See also: Agent should probably process relative URI references when it deals with a redirect.

Changed 16 months ago by dbornside

get page implementation based on Agent

comment:11 Changed 16 months ago by dbornside

  • Cc dan.bornside@… added

This patch is a pretty rough cut of a t.w.c.getPage implemntation; it still fails on some tests; the ones relating to redirects and timeouts. Also missing is a related change for downloadPage;

Note: See TracTickets for help on using tickets.