[Twisted-Python] Thread Cancelled

Robert DiFalco robert.difalco at gmail.com
Sun Jan 24 20:40:03 MST 2021


Well, I've dealt with this issue in other languages, but I'm not sure how to
deal with it in Klein/Twisted. This operation is idempotent, so I suppose
what I'd like is for the whole chain of Deferreds to succeed and then simply
fail to write the response to the socket, without interrupting the chain.
What I really need is for my operations to be atomic (a two-phase commit or
some such), but that's too big a job on this legacy code base for now. Right
now I'd be content if I could save the database record and then send the SQS
message to AWS without that SQS send being interrupted when the client closes
the socket. Is there a simple way to achieve that? If I got the request body
I'm good; I don't care if I can't write the response.

One other thing that would be nice is to know why a Deferred was canceled.
If the client closes a connection I might like to ignore the cancel, but I
think what is happening is that Twisted is pretty smart. Either it knows
I'm going to make an I/O-bound write to SQS using Boto and is somehow able
to cancel that operation or, perhaps more likely, the client closed the
connection and that canceled the Deferreds after I called deferToThread but
before a thread was available to run on. Either way, I'd like them all to
run, and just fail to write the final response at the end of the chain.
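
For reference, the end-of-file signal discussed further down the thread
(recv() returning an empty result once the peer has closed) can be seen with
a plain blocking socket pair. This is a minimal standard-library sketch, not
Twisted's own read loop; `read_until_eof` is a hypothetical helper name.

```python
import socket


def read_until_eof(sock):
    """Collect bytes from a blocking socket until the peer closes it."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if data == b"":
            # Zero bytes means end of file: the other end closed the socket.
            break
        chunks.append(data)
    return b"".join(chunks)


# A connected pair of sockets standing in for the client and the server.
client, server = socket.socketpair()
client.sendall(b"request body")
client.close()  # the "client" goes away before any response is written

body = read_until_eof(server)
server.close()
```

The request body is still read in full; the empty read only shows up
afterwards, which fits the idea that the disconnect is noticed while the
server is still working on its response.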

Sorry, too many words.




On Sun, Jan 24, 2021 at 3:22 PM Glyph <glyph at twistedmatrix.com> wrote:

> If you're dealing with lots of clients on the public internet, sometimes
> this is just gonna happen, for a variety of reasons; it's normal.  We would
> welcome better error reporting for this scenario so it doesn't require the
> kind of debugging you just did :-).
>
> -g
>
> On Jan 24, 2021, at 2:58 PM, Robert DiFalco <robert.difalco at gmail.com>
> wrote:
>
> That makes sense, thank you. A timeout seems unlikely but maybe the client
> is closing the connection due to a network issue. This is an extremely rare
> occurrence.
>
> On Sun, Jan 24, 2021 at 2:41 PM Glyph <glyph at twistedmatrix.com> wrote:
>
>> While a socket is open and receiving data, recv() will either give you a
>> non-zero number of bytes if bytes are ready, or an EWOULDBLOCK (AKA EAGAIN)
>> if no bytes are ready.  A result of zero bytes (the empty string) means
>> "end of file" - the other end has closed the socket.
>>
>> So what's happening here is your client is timing out or otherwise
>> canceling its request by closing the socket, and this is the correct,
>> intentional response to that scenario.
>>
>> -g
>>
>> On Jan 24, 2021, at 11:57 AM, Robert DiFalco <robert.difalco at gmail.com>
>> wrote:
>>
>> You're absolutely right, I meant "cancel the Deferred". I don't grok
>> server sockets very well, so maybe someone can help. Apparently, Klein
>> does a .doRead on our server socket (getting the request from the
>> client?). This returns a "why" of "connection done", which closes the
>> connection before we have written our response to the client, and that
>> cancels the deferred SQS write.
>>
>>
>> https://github.com/racker/python-twisted-core/blob/master/twisted/internet/selectreactor.py#L148-L155
>>
>> The method above is "doRead". Which calls this:
>>
>> https://github.com/twisted/twisted/blob/trunk/src/twisted/internet/tcp.py#L239
>>
>> I guess if socket.recv() returns an empty string it simply closes the
>> connection.
>>
>> https://github.com/twisted/twisted/blob/trunk/src/twisted/internet/tcp.py#L249-L250
>>
>> Is that normal? I mean, I guess it must be, but then why is the read
>> getting an empty string and closing the connection? I can't really account
>> for it. Some kind of back pressure due to load?
>>
>> Thanks for any thoughts.
>>
>>
>>
>> On Sun, Jan 24, 2021 at 11:32 AM Colin Dunklau <colin.dunklau at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sun, Jan 24, 2021 at 11:45 AM Robert DiFalco <
>>> robert.difalco at gmail.com> wrote:
>>>
>>>> Hi, I apologize this question is a little vague. I'm looking for
>>>> pointers. I have a klein route that makes an underlying deferToThread call
>>>> with a simple single thread (an IO based sync call I can't change, a boto3
>>>> sqs write). The thread pool is simple, just a couple of threads, nothing
>>>> fancy.
>>>>
>>>> VERY rarely it appears that Klein cancels the thread. What techniques
>>>> can I use to figure out why my thread is being canceled? There's nothing in
>>>> the failure to tell me "who, why, or where" it was canceled. Also, I cannot
>>>> get this down to a reproducible case, but here's the boto3 SQS wrapper.
>>>> This fallback works fine, but it's a band-aid for an error I can't track
>>>> down:
>>>>
>>>> def write(self, payload):
>>>>     """
>>>>     Write message to SQS async from thread pool. If twisted cancels the
>>>>     thread, instead write synchronously.
>>>>     """
>>>>
>>>>     def _retrySynchronously(error):
>>>>         if error.type != CancelledError:
>>>>             return error
>>>>
>>>>         log.warn("Async SQS write cancelled. Calling synchronously.")
>>>>         return defer.succeed(self._writeSyncFallback(payload))
>>>>
>>>>     deferredCall = self._deferToThread(self.sqs.write, payload)
>>>>     deferredCall.addErrback(_retrySynchronously)
>>>>     return deferredCall
>>>>
>>>> def _writeSyncFallback(self, payload):
>>>>     return self.sqs.write(payload)
>>>>
>>>> The _deferToThread call just uses my own thread pool with 2 threads,
>>>> but is otherwise stock.
>>>>
>>>> Is there a level of logging I'm missing or some other thing that would
>>>> tell me why the thread is being canceled? The retry works great and Klein
>>>> does not return an error from the route.
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>> I think we'll need to see more code for this, specifically the caller of
>>> that `write` method, and its callers, etc. Note that the thread itself
>>> isn't being cancelled, the Deferred you get from _deferToThread is... so
>>> you'll most likely need to find out what code interacts with that object to
>>> progress in isolating this.
>>>
>>> In my quick skim of the deferToThread and ThreadPool source, I can't
>>> find any explicit cancellations. While that certainly doesn't rule it out,
>>> it does make me think you're more likely to find the issue by inspecting
>>> the callers involved.
>>> _______________________________________________
>>> Twisted-Python mailing list
>>> Twisted-Python at twistedmatrix.com
>>> https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>>>