[Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

Glyph glyph at twistedmatrix.com
Thu Jul 11 01:19:14 MDT 2019


Hi Jarosław!

> On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz <jaroslaw.fedewicz at gmail.com> wrote:
> 
> I have written a simple service which takes data from network, massages it until it's useful enough, and sends the results out periodically via HTTP to an API.

A reasonable start :-).

> It all works for a while, then I get an error like this approximately 40 minutes into the service's uptime:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.ZeroReturnError: >]
> 
> Then a couple more like this:
> 
> ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
> 
> Then it ends with
> 
> TimeoutError: User timeout caused connection failure.
> 
> Then every request results in the same TimeoutError. I don't know if using HTTPS is important in this case.

I'm pretty sure the presence of an OpenSSL.SSL error indeed means that HTTPS is important.

> Restarting the whole service, of course, makes the problem go away for a while. The other side is the Slack API, so I rather assume it's not very much to blame; it can be demonstrated to work rather reliably, all its criticisms notwithstanding.

It does seem likely that the clustering of errors you're seeing is a local problem with Twisted.

> I cannot yet tell if this bug is a function of uptime, or the number of requests made.

My personal guess is that it has something to do with the number of TCP connections; or, specifically, the number of pyOpenSSL 'Connection' objects.

> I have tried to work around the problem by discarding the agent object, and using an HTTPConnectionPool with persistent=False, but it didn't help at all. I think it made the problem worse because the framework seems to refer to some objects the Agent creates, and the process becomes a CPU hog in a couple of hours (with the TimeoutErrors still happening all the time).

I have a slight suspicion that the thing that is leaking between connections here is the pyOpenSSL "Context" object.  We recently implemented an optimization which shares the Context object among multiple Connection objects that reference the same host.  What version of Twisted are you using, and what version of OpenSSL, pyOpenSSL, and Cryptography?

I'm curious whether reversing that optimization would make any difference to your use-case.
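
For the version question above, a small script along these lines could gather the numbers to paste into a reply or a ticket. This is only a sketch: the function name collect_versions is made up for illustration and is not part of Twisted. It reads the installed distribution versions via importlib.metadata and asks pyOpenSSL which OpenSSL library it is linked against.

```python
from importlib import metadata


def collect_versions():
    """Return a dict of version strings (None where something is missing).

    collect_versions is a hypothetical helper, not a Twisted API.
    """
    versions = {}
    for dist in ("Twisted", "pyOpenSSL", "cryptography"):
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None

    # The OpenSSL library that pyOpenSSL is linked against; this can differ
    # from the one used by Python's own ssl module.
    try:
        from OpenSSL import SSL
        versions["OpenSSL"] = SSL.SSLeay_version(SSL.SSLEAY_VERSION).decode("ascii")
    except ImportError:
        versions["OpenSSL"] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter and virtualenv that the failing service uses, since a different environment may report different libraries.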

> The closest I've got on the internets which describes a similar problem, apart from people complaining on StackOverflow about precisely this happening when they are using Scrapy, is this blog post from almost a decade ago: http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/

This definitely seems like a bug, if it's occurring in multiple places.

> There could be a small chance I'm holding it wrong(tm), but maybe there exists a ticket, just worded differently, which could help me get to the bottom of it.

I don't think that any open tickets describe your precise issue.  So please do open one.  And if possible, can you minimize a proof of concept?  Some example code would go a long way to helping to isolate this.

-glyph