[Twisted-Python] Need some enlightenment on using web client properly, or maybe nudge a bug to get fixed

Scott, Barry barry.scott at forcepoint.com
Thu Jul 11 10:39:48 MDT 2019


On Thursday, 11 July 2019 11:00:33 BST Jarosław Fedewicz wrote:
> So far, I tried to minimize a test case, but it seems like it's really
> picky about what environment it's running in. One of those cases where "it
> works on my machine", I suppose. The versions are as follows:
> 
> cryptography==2.7
> pyOpenSSL==19.0.0
> asn1crypto==0.24.0
> pyasn1==0.4.5
> pyasn1-modules==0.2.5
> Twisted==19.2.1
> 
> The target machine is running Xenial, so openssl 1.0.0g.

That's old... Can you go to 1.0.2s?
I recall that pyOpenSSL may need a newer OpenSSL, though I might be wrong on this.

> My local machine runs Fedora 30, thus openssl 1.1.1c.
> 
> Is there a neat way to list all pyOpenSSL objects in a running Twisted
> program? Or maybe TCPConnection objects, since those might hook to the
> zope.interface machinery?

You can use the gc to help with this sort of debugging.

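import gc

gc.collect()  # free unreachable objects first so only live ones remain
for obj in gc.get_objects():
    print(type(obj).__name__)  # or do something more interesting with obj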
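
As a minimal sketch of that loop applied to the original question (listing the live objects of one class), something like the following might work. The helper name live_instances and the stand-in Connection class are hypothetical, made up for illustration; they are not part of Twisted or pyOpenSSL.

```python
import gc


def live_instances(cls):
    """Return every gc-tracked object whose type is cls or a subclass of it."""
    gc.collect()  # drop unreachable objects first so only live ones are reported
    found = []
    for obj in gc.get_objects():
        # Use type(obj) rather than isinstance() so proxy objects are not
        # dereferenced while we walk the heap.
        if issubclass(type(obj), cls):
            found.append(obj)
    return found


class Connection:
    """Hypothetical stand-in for the class under investigation."""


kept = [Connection() for _ in range(3)]
print(len(live_instances(Connection)))  # prints 3
```

In the real service one would presumably pass OpenSSL.SSL.Connection (or OpenSSL.SSL.Context) instead of the stand-in class, assuming pyOpenSSL is importable in that process.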

You could count the number of each type of obj and look for which ones 
increase over time.
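
A rough sketch of that counting idea, assuming two snapshots taken some time apart are enough to spot the growth. The function names count_types and growth and the Leaky class are hypothetical, chosen only for this example.

```python
import gc
from collections import Counter


def count_types():
    """Count live gc-tracked objects, keyed by fully qualified type name."""
    gc.collect()  # only count objects that are still reachable
    counts = Counter()
    for obj in gc.get_objects():
        t = type(obj)
        counts[f"{t.__module__}.{t.__qualname__}"] += 1
    return counts


def growth(before, after, top=10):
    """Return (type name, increase) pairs for the types that grew the most."""
    delta = Counter(after)
    delta.subtract(before)  # keeps negative values, unlike the - operator
    return [(name, n) for name, n in delta.most_common(top) if n > 0]


class Leaky:
    """Hypothetical stand-in for an object that is being leaked."""


before = count_types()
leaked = [Leaky() for _ in range(100)]  # simulate a leak between snapshots
after = count_types()
for name, n in growth(before, after):
    print(f"{name}: +{n}")
```

In a long-running Twisted service the snapshots could be taken on a timer, for example with twisted.internet.task.LoopingCall, and the differences written to the log.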

Barry



> 
> On Thu, Jul 11, 2019 at 9:20 AM Glyph <glyph at twistedmatrix.com> wrote:
> > Hi Jarosław!
> > 
> > On Jul 1, 2019, at 4:48 PM, Jarosław Fedewicz <jaroslaw.fedewicz at gmail.com> wrote:
> > 
> > I have written a simple service which takes data from network, massages it
> > until it's useful enough, and sends the results out periodically via HTTP
> > to an API.
> > 
> > 
> > A reasonable start :-).
> > 
> > It all works for a while, then I get an error like this approximately 40
> > minutes into the service's uptime:
> > 
> > ResponseNeverReceived: [<twisted.python.failure.Failure
> > OpenSSL.SSL.ZeroReturnError: >]
> > 
> > 
> > Then a couple more like this:
> > 
> > ResponseNeverReceived: [<twisted.python.failure.Failure
> > twisted.internet.error.ConnectionLost: Connection to the other side was
> > lost in a non-clean fashion: Connection lost.>]
> > 
> > 
> > Then it ends with
> > 
> > TimeoutError: User timeout caused connection failure.
> > 
> > 
> > Then every request results in the same TimeoutError. I don't know if using
> > HTTPS is important in this case.
> > 
> > 
> > I'm pretty sure the presence of an OpenSSL.SSL error indeed means that
> > HTTPS is important.
> > 
> > Restarting the whole service, of course, makes the problem go away for a
> > while. The other side is the Slack API, so I rather assume it's not much to
> > blame; it can be demonstrated to work rather reliably, all its criticisms
> > notwithstanding.
> > 
> > 
> > It does seem likely that the clustering of errors you're seeing is a
> > local problem with Twisted.
> > 
> > I cannot yet tell if this bug is a function of uptime, or the number of
> > requests made.
> > 
> > 
> > My personal guess is that it has something to do with the number of the
> > TCP connections; or, specifically, the number of pyOpenSSL 'Connection'
> > objects.
> > 
> > I have tried to work around the problem by discarding the agent object,
> > and using an HTTPConnectionPool with persistent=False, but it didn't help
> > at all. I think it made the problem worse because the framework seems to
> > refer to some objects the Agent creates, and the process becomes a CPU
> > hog in a couple of hours (with the TimeoutErrors still happening all the
> > time).
> > 
> > 
> > I have a slight suspicion that the thing that is leaking between
> > connections here is the pyOpenSSL "Context" object.  We recently
> > implemented an optimization which shares the Context object among multiple
> > Connection objects that reference the same host.  What version of Twisted
> > are you using, and what version of OpenSSL, pyOpenSSL, and Cryptography?
> > 
> > I'm curious whether reversing that optimization would make any
> > difference to your use-case.
> > 
> > The closest I've got on the internets which describes a similar problem,
> > apart from people complaining on StackOverflow about precisely this
> > happening when they are using Scrapy, is this blog post from almost a
> > decade ago:
> > http://www.chris-wong.net/twisted-web-framework-user-timeout-caused-connection-failure/
> > 
> > 
> > This definitely seems like a bug, if it's occurring in multiple places.
> > 
> > There could be a small chance I'm holding it wrong(tm), but maybe there
> > exists a ticket, just worded differently, which could help me get to the
> > bottom of it.
> > 
> > 
> > I don't think that any open tickets describe your precise issue.  So
> > please do open one.  And if possible, can you minimize a proof of concept?
> > Some example code would go a long way toward isolating this.
> > 
> > -glyph
> > _______________________________________________
> > Twisted-Python mailing list
> > Twisted-Python at twistedmatrix.com
> > https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python





