[Twisted-Python] INCOMPATIBLE CHANGE: twisted.python.threadpool

Glyph glyph at twistedmatrix.com
Fri Sep 26 01:43:22 MDT 2014


On Sep 25, 2014, at 6:36 PM, weykent <weykent at weasyl.com> wrote:

> On Sep 25, 2014, at 3:31 PM, Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
> 
>> So, does anyone out there have any code which makes use of the aforementioned bad attributes of ThreadPool, whose applications would break if I removed them?
> 
> Yes. Specifically, I am maintaining this AMP responder method:

Wow, cool.

So, okay, while I am a little unhappy that you're using this API in a slightly unfortunate way, that feeling is eclipsed by the joy that The Process Works :-).  Thank you very much for responding so promptly, so specifically, and including the exact code that we need to discuss.

>    @FetchThreadPoolStats.responder
>    def fetchThreadPoolStats(self):
>        pool = self.factory.threadPool
>        return dict(
>            threadsWaiting=len(pool.waiters),
>            threadsWorking=len(pool.working),
>            workerQueueSize=pool.q.qsize(),
>        )
> 
> These statistics are chunked into graphite and displayed as a nice little graph. They’ve been quite useful on some occasions: sometimes all of the threads in a thread pool for a WSGI application got blocked, and the symptoms were that all of the requests coming in were timing out at the proxy in front of twisted. We were able to quickly tell that the problem was the WSGI thread pool because the threadsWorking count was at its limit and the workerQueueSize was skyrocketing.

Analytics, monitoring, logging, and statistics have historically been a blind spot for Twisted and I am really glad you're bringing this up.  I was thinking about bringing it up in my original message, but as my colleagues and friends will tell you, I tend to put too many words into emails, so unless I'm sure I need those words, I will delete them.

The new interface should very definitely have metrics-gathering functionality, and possibly even a logging component.

>> If so, how can I make you feel as bad about yourselves for using it as I feel bad about myself for writing it? ;-)
> 
> I don’t really see this as an abuse of the public interface. If possible, I’d like to see this diagnostic information preserved in the new interface, as I consider it invaluable information as long as we’re using twisted as a WSGI runner.

It's not exactly an "abuse", because the public interface is there, and you have a use-case, and you're satisfying it with what's available.

For the purposes that you are using these attributes, I actually don't have a problem reproducing them at all.  It should be pretty straightforward.  My concern with exposing them overall is that what you've got with "ThreadPool.q" is not a queue that has a qsize() method that you can inspect, it's that it's a Queue that you can put things into, get things out of, and generally muck about with.

Similarly pool.waiters is not simply an object with a __len__ that tells you how active the threads are, it's a list of Thread objects which you could potentially change the names of, turn into daemon threads, join(), and so on.  More importantly it's a list that you could remove Thread objects from and that would affect their interactions with the pool.

How about this: would you mind if ThreadPool were modified to still populate these attributes for monitoring and debugging purposes, but completely stopped honoring any destructive operations on them?  In the case of "pool.q", there is no single Queue responsible for the whole pool, so I would like to make "put", "get", and "join" actually raise exceptions; in the case of "waiters" and "working" I guess I could fish out some actual Thread objects, that just involves a little bit of sad abstraction violation internal to the twisted.threads.Team implementation.

My inclination would be to expand the thread pool to accommodate these usages, but still deprecate these attributes in the next version; but add a better "statistics" method that returned an object with "active", "pending", and "idle" attributes (all integers).

The most important thing that I want to make sure nobody wants is to keep overriding (or calling) startAWorker, stopAWorker, __getstate__, __setstate__.  Supporting callers of threadFactory and currentThread would be easy, but the attributes aren't really there for calling, they're there to be stubbed, and stubbing them won't keep working without tons of extra work.

So would that work for you?

> I will say that it is certainly possible for me to emulate these attributes, but that would require touching implementation details of WSGIResource. I’m not sure which is worse.


Touching private (underscore-prefixed) implementation details is explicitly not supported - if you did it that way, you're on your own :-).  And, as the end-user in this discussion, you win the argument by default.  (You can see how I feel about supporting users here: <https://twitter.com/mjg59/status/514670470260465664>)  Plus, if we're going to work on the thread-pool and have newer, better interfaces, WSGIResource's implementation is likely to change.

Thanks a bunch for helping us improve Twisted,

-glyph
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/twisted-python/attachments/20140926/a22f5a00/attachment-0002.html>


More information about the Twisted-Python mailing list