[Twisted-Python] epoll reactor problems

Thomas Hervé therve at free.fr
Wed Apr 11 05:47:11 MDT 2007


Quoting Alec Matusis <matusis at matusis.com>:

>> That's old (debian stable ? :)). I don't say that'll solve your problem,
>> but you could try with 2.4.4 (warning, not 2.4.3).
>
> It's SuSE stable ;-) Our stuff on that machine is pretty convoluted now, so
> we will probably have a chance to test with 2.4.4 only in a week, when we
> add a brand new server with 2.4.4.

OK. That is just one more thing to try; I don't see any obvious reason why
it would work better on 2.4.4, but...

> I noticed a difference between this from the 99.9% CPU server:
>
> epoll_wait(4, {{EPOLLERR|EPOLLHUP, {u32=423, u64=12304606485815493031}},
> {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLRDNORM|EPOLLRDBAND|EPOLLWRNORM|EPOLLWRBAND|
> EPOLLMSG|EPOLLERR|EPOLLHUP|0x7820, {u32=5529648, u64=5529648}},
> {EPOLLIN|EPOLLPRI|EPOLLRDNORM|EPOLLRDBAND|EPOLLMSG|0x1000, {u32=0,
> u64=22827751178240}}, {0, {u32=0, u64=0}},
> {EPOLLOUT|EPOLLERR|EPOLLONESHOT|EPOLLET|0x3fffa820, {u32=32767,
> u64=18097643565645823}}}, 1432, 68) = 5
>
> and this from a normal server running at 5% CPU:
>
> epoll_wait(4, {{EPOLLIN, {u32=1769, u64=12304606485815494377}}, {0,
> {u32=4294944684, u64=140737488332716}}}, 1728, 17) = 2
>
> What does this mean?

The flags set on your sockets are generally EPOLLIN or EPOLLOUT: data to
read, or room available for writing. I don't know much about the other
flags. EPOLLERR is set if the fd has been closed, for example. EPOLLET is
*highly* suspect, because it should only be there if set in the user code.
The documentation of the other flags is really terse...
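
To make event masks like the ones in the strace output easier to read, here
is a minimal sketch that splits a mask into flag names. The bit values are
the ones from Linux's <sys/epoll.h>, limited to the flags named in the trace;
the name decode_events and the table are only for illustration and are not
part of Twisted.

```python
# Bit values as defined in Linux's <sys/epoll.h> (only the flags that
# appear in the strace output quoted above).
EPOLL_FLAGS = {
    "EPOLLIN": 0x001,
    "EPOLLPRI": 0x002,
    "EPOLLOUT": 0x004,
    "EPOLLERR": 0x008,
    "EPOLLHUP": 0x010,
    "EPOLLRDNORM": 0x040,
    "EPOLLRDBAND": 0x080,
    "EPOLLWRNORM": 0x100,
    "EPOLLWRBAND": 0x200,
    "EPOLLMSG": 0x400,
    "EPOLLONESHOT": 1 << 30,
    "EPOLLET": 1 << 31,
}


def decode_events(mask):
    """Return the names of the flags set in mask (hypothetical helper).

    Bits that are not in the table are reported as one trailing hex
    string, the same way strace prints its leftovers.
    """
    names = []
    leftover = mask
    for name, bit in EPOLL_FLAGS.items():
        if mask & bit:
            names.append(name)
            leftover &= ~bit
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print("|".join(decode_events(0x008 | 0x010)))
```

A healthy event normally decodes to one or two names; a mask that lights up
nearly every flag plus unknown bits, as in the 99.9% CPU trace, is a hint
that something is wrong with the data itself.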


>> What's the global state of the process? Memory, number of opened fd ?
>
> We immediately reverted to poll, so I do not have it in front of me. The RSS
> size was 45MB, and the number of open fd I do not know: it should have been
> about 1500, but I did not check.

Hum... it may come from running out of file descriptors, so you'd better check
your settings for this.
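
One way to check this from inside the running process is to compare the
number of open descriptors with the limit. This is only a sketch: it assumes
Linux (it reads /proc/self/fd), and fd_usage is a hypothetical helper name.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    open_fds is None when /proc/self/fd is not available (non-Linux).
    The count includes the descriptor that listing the directory itself
    opens for a moment, so it can be one higher than the real number.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print("open fds: %s, soft limit: %s, hard limit: %s"
          % (open_fds, soft, hard))
```

With about 1500 connections expected, a soft limit of 1024 would already be
too low, so the interesting part is how close open_fds gets to soft.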

> I can do another test run with epoll in about 20hrs, since I do not want to
> upset users too much.

Of course :).

> If you have some specific data I should get from the
> test run, please let me know now.

Any information would be useful. The most useful would be to know when it
begins to act strangely, and whether something happened at that moment.
Otherwise: number of fds, memory, netstat output, strace output...
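
To pin down the moment when it starts to misbehave, logging a timestamped
CPU sample at regular intervals may help. A rough sketch using only the
standard library; take_sample, cpu_fraction and format_sample are
hypothetical names, and how the sampling gets scheduled (for instance from a
timer in the reactor) is left open.

```python
import os
import time


def take_sample():
    """Return (wall_clock_seconds, cpu_seconds_used_by_this_process)."""
    t = os.times()
    return (time.time(), t.user + t.system)


def cpu_fraction(prev, cur):
    """Fraction of one CPU used between two samples (0.0 if no time passed)."""
    elapsed = cur[0] - prev[0]
    if elapsed <= 0:
        return 0.0
    return (cur[1] - prev[1]) / elapsed


def format_sample(prev, cur):
    """Build one log line: local timestamp plus CPU usage since prev."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(cur[0]))
    return "%s cpu=%.1f%%" % (stamp, 100.0 * cpu_fraction(prev, cur))


if __name__ == "__main__":
    previous = take_sample()
    time.sleep(1)
    print(format_sample(previous, take_sample()))
```

The first line where cpu jumps towards 100% gives a time to look up in the
application logs and in the strace output.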

-- 
Thomas
