Opened 10 years ago

Closed 12 months ago

#823 enhancement closed fixed (fixed)

twistd should wait for the application to start when daemonizing

Reported by: spiv Owned by: therve
Priority: normal Milestone:
Component: core Keywords: twistd
Cc: exarkun, spiv, therve, zooko@… Branch: branches/twistd-wait-823-5
(diff, github, buildbot, log)
Author: exarkun, therve Launchpad Bug:

Description (last modified by therve)

Currently, twistd can start successfully (exit with code 0) even if the application fails when it daemonizes. It'd be good if it waited for the application to start, returning the appropriate exit code.

Change History (39)

comment:1 Changed 10 years ago by spiv

It would be nice if twistd had features like:
   -w --wait   twistd will wait until all services in the app configuration have
               started before exiting.  --nodaemon conflicts with this option.
   -k --kill <pidfile>  stop a twisted app specified by <pidfile>

--wait would be useful for scripts (e.g. functional tests) that want to spawn a
twisted app, wait for it to be listening and ready, and then proceed.  --kill
would be convenient for tearing down those tests, and probably in other places.

--wait could be implemented by opening a pipe to the child before forking, and
getting the child application to write to the pipe (or close it?) to signal when
it has started all its services.

--kill could be implemented on POSIX rather stupidly by doing:
    while True:
        try:
            os.kill(pid, 0)
        except OSError:
            break

--wait is really the one I want, though.

comment:2 Changed 7 years ago by exarkun

  • Component set to core
  • Keywords core removed

Shouldn't --wait just be the default? I don't ever want to not know when the twistd command I ran didn't actually do what I wanted it to do.

comment:3 Changed 7 years ago by therve

  • Cc therve added

comment:4 Changed 6 years ago by therve

  • Owner changed from spiv to therve

comment:5 Changed 6 years ago by therve

  • author set to therve
  • Branch set to branches/twistd-wait-823

(In [22480]) Branching to 'twistd-wait-823'

comment:6 Changed 6 years ago by therve

(In [22481]) Throw this hack somewhere: it waits for the start of a daemonized twistd process,
and report its status.

Refs #823

comment:7 Changed 6 years ago by therve

  • Branch changed from branches/twistd-wait-823 to branches/twistd-wait-823-2

(In [23470]) Branching to 'twistd-wait-823-2'

comment:8 Changed 6 years ago by therve

(In [23471]) Reimplement the wait slightly differently

Refs #823

comment:9 Changed 4 years ago by zooko

http://tahoe-lafs.org/trac/tahoe-lafs/ticket/71 is a ticket for the Tahoe-LAFS project which I think would be fixed by using the --wait option if twistd supports that option.

comment:10 Changed 4 years ago by zooko

  • Cc zooko@… added

comment:11 Changed 4 years ago by exarkun

  • Keywords twistd added

comment:12 Changed 4 years ago by zooko

  • Cc zooko@… removed
  • Summary changed from Make twistd a little more scripting friendly to add --wait and --kill to twistd (Make twistd a little more scripting friendly)

comment:13 Changed 4 years ago by zooko

  • Cc zooko@… added

comment:14 Changed 4 years ago by zooko

There is discussion in the Tahoe-LAFS project about whether to adapt therve's idea into Tahoe-LAFS source code base and/or to contribute such patches to this ticket: http://tahoe-lafs.org/trac/tahoe-lafs/ticket/71#comment:24

comment:15 Changed 4 years ago by zooko

  • Keywords tests-needed added

jml suggested on IRC that this ticket might be ready for review. I looked at the patch and it is missing tests. The patch adds the functionality of having the parent wait until the child process has started and has signalled success or failure to the parent through a pipe. To test that added functionality, we might extend test_daemonize().

The tests could check whether the parent correctly exits with 0 when the child's startApplication() returns successfully, and whether the parent correctly exits with non-zero when the child's startApplication() raises an exception. I think that would be sufficient testing for this patch.

In order to do this the new test_daemonize() would have to not patch out the daemonization (as currently done in UnixApplicationRunnerSetupEnvironmentTests.setUp(). Instead it could probably patch out os._exit().

I think that would be sufficient testing for this patch.

comment:16 Changed 4 years ago by exarkun

  • Author changed from therve to exarkun, therve
  • Branch changed from branches/twistd-wait-823-2 to branches/twistd-wait-823-3

(In [29952]) Branching to 'twistd-wait-823-3'

comment:17 Changed 3 years ago by <automation>

  • Owner therve deleted

comment:18 Changed 17 months ago by therve

  • Branch changed from branches/twistd-wait-823-3 to branches/twistd-wait-823-4

(In [37525]) Branching to 'twistd-wait-823-4'

comment:19 Changed 17 months ago by therve

  • Owner set to therve

comment:20 Changed 17 months ago by therve

  • Keywords review added; tests-needed removed
  • Owner therve deleted

Let's do this!

comment:21 Changed 17 months ago by jerub

  • Keywords review removed
  • Owner set to therve

Some small comments:

The soup nazi seinfeld sketch is no longer funny. Please update this message to be something like "Daemonization tests are only run on unix platforms when twisted.scripts._twistd_unix is available"

if _twistd_unix is None:
    DaemonizeTests.skip = "No daemonize for you"

The alignment in protocols/test/test_basic.py is wrong for these lines:

                          [b'produce', b'hello world', b'unproduce',
                            b'goodbye'])

Please fix.

I really don't like flags with names that start with 'no', and the not optionsnowait? parameter illustrates why. Can you think of a better name for this option so that we don't have to employ double negatives?

comment:22 follow-up: Changed 17 months ago by jerub

Okay, I did a poor review: the code I looked at was a bad diff.

Test run is available here.

comment:23 follow-up: Changed 16 months ago by therve

  • Keywords review added
  • Owner therve deleted

OK, back to review. Unfortunately I haven't found a better name than no-wait, still opened to suggestions though.

comment:24 in reply to: ↑ 22 Changed 15 months ago by glyph

Replying to jerub:

Test run is available here.

FYI: "here" isn't a link.

comment:25 in reply to: ↑ 23 Changed 15 months ago by glyph

Replying to therve:

OK, back to review. Unfortunately I haven't found a better name than no-wait, still opened to suggestions though.

--rush? --hurry? :-)

comment:27 Changed 14 months ago by tom.prince

If I were to use this in an application, I'd ideally like

  1. any logs generated while starting up.
  2. an application hook to indicate when startup is complete. Since .startService isn't asynchronous, often an application won't be ready by the time it returns.

I've got some hackish code for buildbot that does it here.

comment:28 follow-up: Changed 14 months ago by radix

  • Keywords review removed

Questions:

  1. Can you explain why --no-wait is necessary? I'd rather not have it unless there's a really compelling reason.
  2. Why limit the message to 100 bytes?

Code changes:

  1. You removed the 'os' parameter from daemonize, but the docstring still documents it.
  2. There are some conflicts with trunk

Discussion:

  1. I'm not super into monkeypatch-the-import; it's pretty sensitive to changes. But I guess we already do it elsewhere.
  2. the move of the daemonize method was hard to review.
  3. I notice that this branch doesn't implement --kill, but it's clear to me that --kill should be done in a separate ticket/branch, so that's cool.

Everything else about the change seems pretty reasonable, but I'm a little worried that there may be weird edge cases that this would break for. Oh well!

To address tom's point:

  1. I think logs would be pretty cool, but a notification that the user should look at the log file is pretty good for now.

comment:29 Changed 14 months ago by therve

  • Owner set to therve

comment:30 Changed 14 months ago by therve

  • Description modified (diff)
  • Summary changed from add --wait and --kill to twistd (Make twistd a little more scripting friendly) to twistd should wait for the application to start when daemonizing

Tom: your second point is interesting, but really unrelated to anything happening in that ticket. It's already 9 years old, let's not add random feature requests.

comment:31 Changed 14 months ago by therve

  • Branch changed from branches/twistd-wait-823-4 to branches/twistd-wait-823-5

(In [38885]) Branching to 'twistd-wait-823-5'

comment:32 in reply to: ↑ 28 ; follow-up: Changed 14 months ago by therve

  • Keywords review added
  • Owner therve deleted

Replying to radix:

Questions:

  1. Can you explain why --no-wait is necessary? I'd rather not have it unless there's a really compelling reason.

The idea was that people may want to fall back to previous behavior. Wait is the new default behavior because it's the one that makes most sense, but maybe people care about firing without caring?

  1. Why limit the message to 100 bytes?

We need a limit, I picked an arbitrary number.

Code changes:

  1. You removed the 'os' parameter from daemonize, but the docstring still documents it.
  2. There are some conflicts with trunk

Fixed.

Discussion:

  1. I'm not super into monkeypatch-the-import; it's pretty sensitive to changes. But I guess we already do it elsewhere.

Yeah it's not super great.

  1. the move of the daemonize method was hard to review.

Sorry, http://paste.pound-python.org/show/34194/ FWIW.

  1. I notice that this branch doesn't implement --kill, but it's clear to me that --kill should be done in a separate ticket/branch, so that's cool.

Right we can open a bug for that.

Everything else about the change seems pretty reasonable, but I'm a little worried that there may be weird edge cases that this would break for. Oh well!

Maybe! As seen on the previous diff, the actual changes are fairly minimal though.

Thanks!

comment:33 in reply to: ↑ 32 ; follow-up: Changed 14 months ago by glyph

Replying to therve:

Replying to radix:

Questions:

  1. Can you explain why --no-wait is necessary? I'd rather not have it unless there's a really compelling reason.

The idea was that people may want to fall back to previous behavior. Wait is the new default behavior because it's the one that makes most sense, but maybe people care about firing without caring?

Can't the user just do this with "nohup twistd ... &" on UNIXish platforms (and by switching to a UNIXish platform if they're on Windows? :)).

  1. I notice that this branch doesn't implement --kill, but it's clear to me that --kill should be done in a separate ticket/branch, so that's cool.

Right we can open a bug for that.

Please do. And link to it ;).

Everything else about the change seems pretty reasonable, but I'm a little worried that there may be weird edge cases that this would break for. Oh well!

Maybe! As seen on the previous diff, the actual changes are fairly minimal though.

That's what the pre-release period is for, right? :)

comment:34 in reply to: ↑ 33 ; follow-up: Changed 13 months ago by therve

Replying to glyph:

Replying to therve:

Replying to radix:

Questions:

  1. Can you explain why --no-wait is necessary? I'd rather not have it unless there's a really compelling reason.

The idea was that people may want to fall back to previous behavior. Wait is the new default behavior because it's the one that makes most sense, but maybe people care about firing without caring?

Can't the user just do this with "nohup twistd ... &" on UNIXish platforms (and by switching to a UNIXish platform if they're on Windows? :)).

They sure can. Windows is irrelevant here because it's a daemonization feature. Should I remove the argument then?

  1. I notice that this branch doesn't implement --kill, but it's clear to me that --kill should be done in a separate ticket/branch, so that's cool.

Right we can open a bug for that.

Please do. And link to it ;).

#6593.

comment:35 in reply to: ↑ 34 Changed 13 months ago by glyph

Replying to therve:

They sure can. Windows is irrelevant here because it's a daemonization feature. Should I remove the argument then?

I honestly can't think of a good reason to keep it, and all things being equal, we should not have more stuff. Please remove.

#6593.

Thanks!

comment:36 Changed 13 months ago by therve

  • Keywords review removed
  • Owner set to therve

comment:37 Changed 13 months ago by therve

  • Keywords review added
  • Owner therve deleted

Back for review! I removed the flag, and also added some untilConcludes calls for safety.

comment:38 Changed 12 months ago by rwall

  • Keywords review removed
  • Owner set to therve

Code Review

Thanks Therve, this looks like a great improvement.

Notes:

  • Merges cleanly
  • trial twisted.test passes on Fedora 19
  • Attempted to run twistd dns on privilieged port which resulted in the expected error. (although I wonder if the stale pid message should be in the log file rather than stderr?)
    [richard@zorin twistd-wait-823-5]$ ./bin/twistd dns
    Removing stale pidfile /home/richard/projects/Twisted/branches/twistd-wait-823-5/twistd.pid
    An error has occurred: 'Couldn't listen on any:53: [Errno 13] Permission denied.'
    Please look at log file for more information.
    

Some points:

  1. Build failures: http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/twistd-wait-823-5
    1. winxp32-py2.7: https://buildbot.twistedmatrix.com/builders/winxp32-py2.7/builds/2558/steps/select/logs/problems Could this be caused by your changes to test_process? Or is it just a routine windows failure?
      [ERROR]
      Traceback (most recent call last):
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\internet\base.py", line 824, in runUntilCurrent
          call.func(*call.args, **call.kw)
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\internet\_pollingfile.py", line 79, in _pollEvent
          workUnits += resource.checkWork()
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\internet\_pollingfile.py", line 136, in checkWork
          self.cleanup()
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\internet\_pollingfile.py", line 141, in cleanup
          self.lostCallback()
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\internet\_dumbwin32proc.py", line 336, in outConnectionLost
          self.proto.childConnectionLost(1)
        File "C:\buildslave\twisted\winxp32-py2.7\Twisted\twisted\test\test_process.py", line 204, in childConnectionLost
          "Data was %r instead of 'abcd'" % (self.data,))
      exceptions.RuntimeError: Data was '' instead of 'abcd'
      
      twisted.test.test_process.ProcessTestCase.testManyProcesses
      
    2. twistedchecker: https://buildbot.twistedmatrix.com/builders/twistedchecker/builds/1043/steps/run-twistedchecker/logs/new%20twistedchecker%20errors
      ************* Module twisted.scripts._twistd_unix
      W9604:308,12:UnixApplicationRunner.daemonize: Please use 'except Exception as e:',rather than 'except Exception, e:'
      ************* Module twisted.test.test_process
      W9202:1382,4:MockOS.read: Missing epytext markup @param for argument "fd"
      W9202:1382,4:MockOS.read: Missing epytext markup @param for argument "size"
      W9204:1382,4:MockOS.read: Missing epytext markup @return for return value
      W9202:1516,4:MockOS.chdir: Missing epytext markup @param for argument "path"
      W9202:1582,4:MockOS.unlink: Missing epytext markup @param for argument "filename"
      W9202:1589,4:MockOS.umask: Missing epytext markup @param for argument "mask"
      W9204:1596,4:MockOS.getpid: Missing epytext markup @return for return value
      C0301:1760,0: Line too long (86/79)
      
    3. pyflakes: https://buildbot.twistedmatrix.com/builders/pyflakes/builds/754/steps/pyflakes/logs/new%20pyflakes%20errors
      twisted/test/test_twistd.py:19: redefinition of unused 'grp' from line 17
      twisted/test/test_twistd.py:19: redefinition of unused 'pwd' from line 16
      twisted/test/test_twistd.py:24: redefinition of unused 'pickle' from line 22
      twisted/test/test_twistd.py:46: redefinition of unused '_twistd_unix' from line 44
      twisted/test/test_twistd.py:55: redefinition of unused 'syslog' from line 53
      twisted/test/test_twistd.py:61: redefinition of unused 'profile' from line 59
      twisted/test/test_twistd.py:70: redefinition of unused 'hotshot' from line 65
      twisted/test/test_twistd.py:76: redefinition of unused 'cProfile' from line 74
      
    4. coverage: Some parts of daemonize aren't covered. They never were, but you may want to improve things.
  1. source:branches/twistd-wait-823-5/twisted/scripts/_twistd_unix.py
    1. "code = 0": That assigment looks redundant
      298              if os.fork():  # launch child and...
              299                  code = 0
              300                  code = self._waitForStart(r)
      
  1. "Please look at log file for more information": It would be nice to print the log file path here (self.configlogfile?) but only if it's easy.
  1. I wondered why you used "sys.stderr" rather than "sys.stderr"?
  1. The MockOS tests are clever and fast, but I'd have a bit more confidence if there were some functional tests to launch a twistd process and check its exit status. Maybe that's something for another ticket.

This branch has already gone through 4 reviews and the feature seems
to work well, so I'm happy for you to merge after addressing or
answering the points above and checking for some clean build results.

-RichardW.

comment:39 Changed 12 months ago by therve

  • Resolution set to fixed
  • Status changed from new to closed

(In [39314]) Merge twistd-wait-823-5

Author: therve
Reviewers: rwall, radix, tom.prince, jerub
Fixes: #823

twistd now waits for the application to start successfully before exiting after
daemonization.

Note: See TracTickets for help on using tickets.