Opened 12 years ago

Closed 8 years ago

#791 defect closed invalid (invalid)

reapAllProcesses occassionally, as apparently signal delivery is unreliable

Reported by: itamarst Owned by:
Priority: high Milestone:
Component: core Keywords:
Cc: itamarst, jknight, spiv, Jean-Paul Calderone, Glyph, Jonathan Lange Branch: branches/sigchld-791-3
branch-diff, diff-cov, branch-cov, buildbot
Author: glyph


Change History (17)

comment:1 Changed 12 years ago by jknight

Has this actually been verified to be true?

comment:2 Changed 10 years ago by Glyph

Component: conch

It doesn't matter; see comments on #1997. We need to do this anyway.

comment:3 Changed 10 years ago by Glyph

Component: conchcore

comment:4 Changed 10 years ago by jknight

Implemented fix in source:branches/sigchld-791/

Technique: a C module which installs a signal handler for SIGCHLD, with SA_RESTART, and a C function to write to a given fd. The reactor listens on that fd, and calls reapAllProcesses when there's activity.

However, if the C module is not present, it instead starts a LoopingCall running whenever there are processes waiting to be reaped, and doesn't install a SIGCHLD handler at all.

comment:5 Changed 10 years ago by jknight

Cc: spiv Jean-Paul Calderone Glyph Jonathan Lange added

Also related to and should fix #1430 #1997

comment:6 Changed 10 years ago by jknight

And also #733

comment:7 Changed 10 years ago by therve

Keywords: review added
Owner: itamarst deleted

It seems it would need review ?

comment:8 Changed 10 years ago by Jean-Paul Calderone

Keywords: review removed

It doesn't have sufficient unit tests.

comment:9 Changed 10 years ago by Tristan Seligmann

Owner: set to jknight

comment:10 Changed 10 years ago by itamarst

The LoopingCall should also be used if someone does, if it's not already. And probably should stop running if there's no Process instances and restart if necessary (so trial is happy), again, if it's not already got that behavior in the branch.

comment:11 Changed 9 years ago by Jean-Paul Calderone

In addition to #733 and #1997, see #2535.

comment:12 Changed 9 years ago by therve

author: therve
Branch: branches/sigchld-791-2

(In [23160]) Branching to 'sigchld-791-2'

comment:13 Changed 9 years ago by therve

I had the opportunity to look at this a bit more. I think the looping call solution is wrong. Do we know exactly what platform doesn't have SA_RESTART? I think I'd prefer the current behavior with SIGCHLD than a polling system that browses every processes launched 10 times per second. This conflicts pretty badly with #2967.

comment:14 Changed 9 years ago by Glyph

Owner: changed from jknight to Glyph
Status: newassigned

comment:15 Changed 9 years ago by Glyph

author: therveglyph
Branch: branches/sigchld-791-2branches/sigchld-791-3

(In [23726]) Branching to 'sigchld-791-3'

comment:16 Changed 8 years ago by Jean-Paul Calderone

Resolution: invalid
Status: assignedclosed

SIGCHLD delivery is not unreliable. What is meant by "unreliable" when people talk about signal delivery is that multiple instances of a particular signal may be coalesced into one call to a signal handler in a process. So polling isn't necessary to deal with "unreliable signal delivery".

However, some of the code in this branch is still probably useful for resolving some Twisted process issues. For example, sigaction SA_RESTART prevents us from breaking os.popen. So some of this code should be used to resolve some tickets.

I'm closing this ticket because the problem it describes does not exist and the solution it suggests is therefore unnecessary. Since there is useful code in the branch, I will re-associate it with the ticket it's most directed at fixing, #733 (the branch conflicts with trunk, so it needed to be merged forward anyway).

comment:17 Changed 6 years ago by <automation>

Owner: Glyph deleted
Note: See TracTickets for help on using tickets.