Opened 9 years ago

Closed 5 years ago

Last modified 5 years ago

#972 enhancement closed fixed (fixed)

Add linux inotify support to twisted core

Reported by: exarkun Owned by:
Priority: highest Milestone:
Component: core Keywords:
Cc: exarkun, spiv, hagbard, dialtone, dripton, matthewg, thijs, itamarst, moonfallen, free.ekanayaka, amtota, ther Branch: branches/inotify-support-972-3
(diff, github, buildbot, log)
Author: exarkun, dialtone Launchpad Bug:

Change History (39)

comment:1 Changed 9 years ago by exarkun

Initial code is checked in at sandbox/inotify.py

comment:2 Changed 9 years ago by hagbard

here's the fix to make it work with python < 2.3

Index: inotify.py
===================================================================
--- inotify.py	(revision 13515)
+++ inotify.py	(working copy)
@@ -130,7 +130,7 @@
         request = array.array('c', struct.pack(structWatchRequest,
            pathBuffer.buffer_info()[0], mask))
 
-        handle = fcntl.ioctl(self.fd, INOTIFY_WATCH, request, 1)
+        handle = fcntl.ioctl(self.fd, INOTIFY_WATCH, request.buffer_info()[0])
 
         port = iNotifyPort(path, handle, iNotifyFactory, self)
 
@@ -142,7 +142,7 @@
             return
         del self.ports[handle]
         request = array.array('c', struct.pack(structWatchIgnore, handle))
-        return fcntl.ioctl(self.fd, INOTIFY_IGNORE, request, 1)
+        return fcntl.ioctl(self.fd, INOTIFY_IGNORE, request.buffer_info()[0])
 
     def connectionLost(self, reason):
         for handle in self.ports.keys():

comment:3 Changed 8 years ago by spiv

  • Component set to conch
  • Summary changed from Linux inotify support in twisted core to Add linux inotify support to twisted core

comment:4 Changed 8 years ago by exarkun

  • Component changed from conch to core

comment:5 Changed 6 years ago by dripton

  • Cc dialtone added

I tried /sandbox/dialtone/inotify.py and it works well for me, in a real application. The API is well documented and makes sense. I don't love having to pass a list of (callback, errback) tuples to the callbacks argument of the watch method, but it works, and I can't think of an API that's just as simple and allows the same multi-call functionality.

comment:6 Changed 6 years ago by dripton

  • Cc dripton added

comment:7 Changed 6 years ago by matthewg

  • Cc matthewg added

comment:8 Changed 6 years ago by dialtone

  • Author set to dialtone
  • Branch set to branches/inotify-support-972

(In [26155]) Branching to 'inotify-support-972'

comment:9 Changed 6 years ago by dialtone

  • Keywords review added
  • Owner changed from hagbard to dialtone
  • Priority changed from normal to highest
  • Status changed from assigned to new

The branch is ready for review by anyone.

comment:10 Changed 6 years ago by thijs

  • Cc thijs added
  • Keywords review removed
  1. Replace the copyright headers with the default Twisted copyright headers
  2. Also add a module docstring at the top describing the functionality and include a @since: 9.0 epydoc marker

comment:11 Changed 6 years ago by dialtone

  • Keywords review added

Both the points in thijs review were addressed in r26287 in the branch.

comment:12 Changed 6 years ago by dialtone

  • Branch changed from branches/inotify-support-972 to branches/inotify-support-972-2

(In [26288]) Branching to 'inotify-support-972-2'

comment:13 Changed 6 years ago by jerub

  • Keywords review removed

I have reviewed this code, here are my comments.

There is no unit test for when twisted.internet.inotify raises a SystemError. Either by ctypes not being available (this happens on import) or when INotify is created.

SystemError is very rarely used in twisted, so much so that it only occurs in cfsupport, inotify and test_process. Even if it is the correct exception to use, I don't like it being raised on import.

The constants defined at the top of the module for IN_ACCESS, etc, have extra spaces. This is contray to what PEP-8 says about whitespace in expressions, I would prefer to see thoses spaces collapsed to 1.

_FLAGS_TO_HUMAN is a dict, but is never treated like a dict, only a list of tuples. I suggest it is transformed to a list of tuples and used as one - as dicts don't have a stable order, and it would be nice if flagsToHuman returned elements in the correct order.

flagsToHuman should probably have a nicer name, as it's an externally exported function that will be used by developers.

There is no test for the IOError raised in INotify._addWatch.

INotify.notify is a dummy print statement and doesn't have a test, I don't think it should be included, or should be included in documentation.

for function in "inotify_add_watch inotify_init inotify_rm_watch".split(): should be a list or tuple containing the strings, instead of a string that we split().

_rmWatch(self, wd) has a line del iwp which is in effect a noop, it just removes a name from the locals of a method that is just about to have its locals removed.

INotify.release checks hasattr(self, '_fd'). It is impossible to get to the bottom of __init__ without this attribute being on the object, so the check isn't required.

        while True:
            if len(self._buffer) < 16:
                break

should perhaps be expressed as:

        while len(self._buffer) >= 16:

INotify._doRead handles a case where it gets a notification of something it's not caring about by doing

            try:
                iwp = self._watchpoints[wd]
            except:
                continue # can this happen?

I suggest that the exact exception where a key is not in a dictionary is caught, instead of a bare except:.

I really don't like defining _addChildren in _doRead, it makes the entire method very complicated, if you can refactor it so that _addChildren is a method of INotify, I would much prefer that.

Please make these changes and resubmit for review.

comment:14 Changed 6 years ago by dialtone

  • Keywords review added
  • The exception raised on import is now an ImportError.
  • addressed all the style issues, bare excepts, inner functions, redundant dels and other problems
  • added tests for IOError and SystemErrors raised during init and during _addWatch
  • flagToHuman() renamed humanReadableMask()
  • got rid of notify dummy method, added to module docstring.

comment:15 Changed 6 years ago by dialtone

  • Owner changed from dialtone to jerub

comment:16 Changed 6 years ago by exarkun

  • Owner changed from jerub to exarkun
  • Status changed from new to assigned

comment:17 Changed 6 years ago by exarkun

  • Keywords review removed
  • Owner changed from exarkun to dialtone
  • Status changed from assigned to new
  1. twisted/internet/inotify.py
    1. Copyright date probably just needs to include 2009, or maybe 2008 and 2009; how old is this code? :)
    2. Prevailing convention in Twisted is to let ImportError from a missing dependency pass through unmodified. Please get rid of the try/except around the ctypes imports.
    3. Each method should be separated from the last by two blank lines. Each class by three blank lines.
    4. SystemError is documented to have a meaning which is incompatible with the one to which INotify._inotify__init__ puts it. I suggest a RuntimeError instead.
    5. The reactor ''must'' be a parameter to INotify.__init__. All new Twisted code should discourage use of the global reactor in favor of a parameterized reactor. This eases testing as well as allows for the eventual possibility of not having a single global reactor. Note that FileDescriptor.__init__ sets the self.reactor to whatever is passed to it, so you can use that attribute elsewhere rather than re-importing the global reactor or adding another attribute to INotify.
    6. I think it would be preferable if reactor.addReader were called by the code which instantiates an INotify object, rather than in INotify.__init__. This is a bit hairy, though:
      1. POSIX stdio support is like INotify is now: it does ''two'' addReader calls in __init__; the caller can't easily do these, since the interface doesn't allow access to both necessary file descriptors.
      2. Windows stdio support kind of is too, except it uses timers instead of addReader
      3. TCP, SSL, UNIX ports don't do addReader in __init__, instead they do it in startListening.
      4. TCP, SSL, UNIX servers '''do''' call addReader in __init__. TCP, SSL, UNIX clients ''kind of'' call addReader in __init__ (they sometimes do it asynchronously though).
      So actually it's a big mess. I suppose you can leave INotify how it is, since there's no clear convention either way (or if anything, the convention is to call it in __init__). However, if you want, you can also change it. I think it's cleaner to separate initialization from state mutation.
    7. _buffer, _watchpoints, and _watchpaths should be documented in the class docstring rather than with brief comments above their initialization
    8. libc could be loaded just once and shared between all instances of INotify; probably not a huge savings, but I bet calling find_library less often might actually be noticable (or at least measurable ;)
    9. INotify.release would probably be better named INotify.connectionLost; the inherited loseConnection should take care to call it at the right times. This will also make reactor.stop() do the right thing (probably not a big deal, since usually you just exit after reactor.stop(), but still kind of nice).
  2. twisted/internet/test/test_inotify.py
    1. Copyright date, again.
    2. Module and class docstrings are missing.
    3. two blank lines between methods
    4. Lots of tests use all-caps variables; these should just be named normally.
    5. assertTrue is preferred over assert_
    6. test_watchPoint and test_humanReadableMask docstrings need a lot of help. :) In general, a lot of the test method docstrings aren't written in the currently preferred style of "X does Y" or "When X, Y." etc. "Test that ..." is redundant in a test method docstring. The more detail the better, this stuff is how we'll manage to maintain the code down the road. :)
I didn't quite finish this review, but it's pretty late now so I'll have to stop. It seems like all the issues I've noticed so far are fairly easily surmounted, and for some reason the code feels pretty good, so congrats on that. :) I'll probably have some further comments to make (in particular, I want to think about how to make it easy to support Linux and Windows with minimal application differences) in a future review, but hopefully nothing drastic.

comment:18 Changed 6 years ago by dialtone

  • Keywords review added
  • Owner changed from dialtone to exarkun

Addressed all exarkun points in r26295. I'll be happy to receive further reviews about future compatibilities or other problems with the code.

comment:19 Changed 5 years ago by exarkun

  • Keywords review removed
  • Owner changed from exarkun to dialtone
  1. Some lines in each new file have trailing whitespace, these should be removed
  2. FilePath is imported in inotify.py but not used
  3. inotify.py uses \-style line continuations to define IN_WATCH_MASK. It should use () instead.
  4. The mess at the beginning of test_inotify.py to determine if inotify is supported on the current platform suggests to me that there should be an API to help determine this. Applications may want to fail gracefully in the absence of inotify support, and they shouldn't have to do all that.
  5. The last two lines of INotify.connectionLost are not covered by the tests. I suspect that code should also be more careful about which exceptions it silently discards. Which OSError does it expect? If it gets a different kind, it should probably at least log it (but not raising it out of connectionLost is good).
  6. It would be nice to expose the errno value when libc calls fail. Is this possible with ctypes? If not, we should file a ticket upstream requesting this feature. Related, rather than raising RuntimeError in one place where an inotify call fails, IOError in another, and ignoring the result in a third (_rmWatch!!), how about a new exception type for inotify errors and raising that everywhere?
  7. The # can this happen comment in _doRead seems superfluous - apparently it can happen, as the test suite manages to exercise that codepath.
  8. The if path.isdir(): check in _doRead always succeeds in the test suite. That is, deleting it and leaving the code it guards leaves the test suite passing completely. Can you add a test which relies on it failing to do something different?
  9. The module should have an __all__ definition.
  10. inotify_add_watch seems to return a wd of 1 the first time it is called, but the documentation doesn't seem to guarantee that it won't start at 0 instead. If it returns 0, INotify.ignore will break for that path.
  11. A bit more documentation would be nice. Either in the form of an example or a howto or both. A howto can be developed separately (please file a ticket) if you prefer. It would be nice to resolve this soon, though.

I've changed my mind about worrying about cross platform compatibility. If someone implements something for another platform in the future, then we can worry about adding a compatibility layer on top of the two low-level APIs then.

comment:20 Changed 5 years ago by itamar

  • Cc itamarst added

Apparently someone at ITA is interested in using this when it's done.

comment:21 Changed 5 years ago by dialtone

  • Keywords review added

So, I think I've addresses essentially all of exarkun's observations on this ticket. Two of them require some explanation (that is also present in the source).

  • Observation 8: That check was actually redundant since the mask for getting in that if already checks for a directory, no need to introduce extra code or extra test, simply removing the line is good enough.
  • Observation 11: I'm not sure how much more documentation could be written honestly... INotify has roughly 3 or 4 API calls and they are explained in the module docstring with the short example. I would love external documentation instead and I would file a ticket for that once this one is approved for merge.

comment:22 Changed 5 years ago by exarkun

  1. There's no test coverage for the if callback is not None: line in _Watch._notify. ie, the tests never invoke the condition where the test evaluates to false.
  2. INotify._addWatch and INotify._rmWatch docstrings need some help. Describe what the abstraction does. :) At a glance, they seem to take care of calling the libc function with ctypes, checking for an error condition, and updating _watchpoints and _watchpaths to reflect the new state on the inotify file descriptor. Additionally, _addWatch can raise an exception, this should be documented too.
  3. INotify.connectionLost should pass a 2nd positional argument to log.err - this argument should be a short string describing the source of the error. In this case, something like Closing inotify file descriptor failed. It will be included ahead of the logged exception.
  4. INotify.watch documents the type of its mask parameter as... hex. int would be more correct, I think?
  5. In the tests, assertTrue is used in various places where assertEquals is probably intended:
    1. test_watchPoint
    2. test_humanReadableMask (twice)
    3. test_complexSubdirectoryAutoAdd
  6. test_simpleDeleteDirectory has a print in it.

I'm not sure, but I might have some comments about the test code as well. It's sleep time now, however.

comment:23 Changed 5 years ago by dialtone

Addressed also all of the new comments on this ticket branch.

On point 4 however I kept both hex and int because the masks, as they have been set up in the module, are hex, of course however hex is as good as int, although it's usually easier to picture them has hex numbers.

comment:24 Changed 5 years ago by exarkun

  • Keywords review removed
  1. Some problems on buildbot, related to the next point.
  2. twisted/python/runtime.py - I am intrigued by this change.
    1. It provides the simple API I was looking for.
    2. But it includes an import going the wrong way - twisted.python may not depend on twisted.internet. Having the import nested in a function vaguely alleviates this, but not sufficiently, I think.
    3. Additionally, calling it will install the default reactor if one is not yet installed.
    4. Yet it seems the API can't be in twisted/internet/inotify.py, since the import errors that module might raise are half the motivation for the API.
    5. Maybe... at module-scope in inotify.py, _inotify__init__ can be invoked somehow without importing the reactor (maybe it needs to be a class method or something). Then the way to deal with possibly missing inotify support would be to just handle ImportError or INotifyError from importing twisted.internet.inotify. Perhaps even better would be to turn INotifyError into ImportError (preserving the message to remain informative)? Then callers only need to handle ImportError. I'm mostly thinking out loud here. Another possibility is that we don't really need to do all this work at all. For example, I just noticed that Twisted's epoll support isn't nearly this easy to deal with (see the end of twisted/test/test_epoll.py).
  3. twisted/internet/inotify.py
    1. I suspect the notify fd should be set close on exec.
    2. There is no hex type in Python. 0x10 is 16. I know you know this, so I'm confused why you want to keep C{hex} in the @type for the mask parameter of watch. :)
    3. The note in _addChildren about being resilient against duplicate events probably merits a more prominent location. I suggest moving it the docs for the autoAdd parameter of watch.
    4. watch returns the _Watch it creates... sometimes. The tests don't seem to cover this, and it isn't documented, so I'd suggest just never returning it.
  4. twisted/internet/test/test_inotify.py
    1. The general form of these tests is a bit annoying. Do something that triggers inotifications, define a nested function in the callback, schedule the nested function to be run after a delay, handle exceptions in the nested function by errbacking a Deferred. Seems error prone and very boilerplate-heavy. Probably some refactoring could shorten the tests a bunch, but that's not really my major concern here. What I'm most concerned with is the delayed calls which fire after 0.1 seconds. What's up with these? It's not just that they slow down the test suite, but that I'm pretty sure they are involved in some race condition which is going to go the wrong way sometimes. Can we get rid of that race condition?
    2. Only the tests for _Watch.addCallback actually invoke that method. And aside from the sometimes-returned _Watch from watch I commented on above, there seems to be no way to get a _Watch. So is _Watch.addCallback really needed?

I'm not sure I'm totally convinced autoAdd and recursive belong here, but we'll give it a try. :) I hope this is the last of my feedback, and the next review will just be ok, merge. :)

comment:25 Changed 5 years ago by moonfallen

  1. Some problems on buildbot, related to the next point.
    • Still needs to be done? I will need some tips on running buildbots with a specific branch. All I saw at that link was red Windows, presumably tests need to be skipped on Windows.
  1. twisted/python/runtime.py - I am intrigued by this change.
    1. It provides the simple API I was looking for.
      • OK
    2. But it includes an import going the wrong way - twisted.python may not depend on twisted.internet. Having the import nested in a function vaguely alleviates this, but not sufficiently, I think.
    3. Additionally, calling it will install the default reactor if one is not yet installed.
    4. Yet it seems the API can't be in twisted/internet/inotify.py, since the import errors that module might raise are half the motivation for the API.
    5. Maybe... at module-scope in inotify.py, _inotifyinit can be invoked somehow without importing the reactor (maybe it needs to be a class method or something). Then the way to deal with possibly missing inotify support would be to just handle ImportError? or INotifyError from importing twisted.internet.inotify. Perhaps even better would be to turn INotifyError into ImportError (preserving the message to remain informative)? Then callers only need to handle ImportError. I'm mostly thinking out loud here. Another possibility is that we don't really need to do all this work at all. For example, I just noticed that Twisted's epoll support isn't nearly this easy to deal with (see the end of twisted/test/test_epoll.py).
  1. twisted/internet/inotify.py
    1. I suspect the notify fd should be set close on exec.
    2. There is no hex type in Python. 0x10 is 16. I know you know this, so I'm confused why you want to keep C{hex} in the @type for the mask parameter of watch. :)
    3. The note in _addChildren about being resilient against duplicate events probably merits a more prominent location. I suggest moving it the docs for the autoAdd parameter of watch.
      • The note seems to have been removed from there, although I don't see it anywhere else.
    4. watch returns the _Watch it creates... sometimes. The tests don't seem to cover this, and it isn't documented, so I'd suggest just never returning it.
      • Still needs to be done.
  1. twisted/internet/test/test_inotify.py
    1. The general form of these tests is a bit annoying. Do something that triggers inotifications, define a nested function in the callback, schedule the nested function to be run after a delay, handle exceptions in the nested function by errbacking a Deferred. Seems error prone and very boilerplate-heavy. Probably some refactoring could shorten the tests a bunch, but that's not really my major concern here. What I'm most concerned with is the delayed calls which fire after 0.1 seconds. What's up with these? It's not just that they slow down the test suite, but that I'm pretty sure they are involved in some race condition which is going to go the wrong way sometimes. Can we get rid of that race condition?
      • re: race condition removal, Still needs to be done.
    2. Only the tests for _Watch.addCallback actually invoke that method. And aside from the sometimes-returned _Watch from watch I commented on above, there seems to be no way to get a _Watch. So is _Watch.addCallback really needed?

TODO

To summarize, this still seems to be needed, from my reading.

  1. Make buildbots skip this on windows. (Try to run a windows buildbot on this branch, or just run windows tests on this branch I guess.)
  1. Probably don't return _Watch instance from .watch() since it has no tests or apparent use. (Evaluate apparent use.)
  1. twisted/internet/test/test_inotify.py: fix apparent races arising from use of .callLater

comment:26 follow-up: Changed 5 years ago by moonfallen

  • Cc moonfallen added
  • Milestone set to Twisted-10.0

comment:27 Changed 5 years ago by free.ekanayaka

  • Cc free.ekanayaka added

comment:28 Changed 5 years ago by free.ekanayaka

The branch looks much better after the recent modifications from exarkun.

@exarkun: when you think this is ready to be reviewed again I can help with that.

(This would be my first Twisted review, mainly motivated by the fact I'd need inotify support myself. But it can be a start for others :)

comment:29 Changed 5 years ago by amtota

  • Cc amtota added

comment:30 in reply to: ↑ 26 Changed 5 years ago by therve

So, I did a couple of cleanups. It seems to pass on Windows now. I removed the values passed to callLater. I looked at modifying watch to not return a value, but it's needed by the code itself. I think it's fine that it's not documented, because users shouldn't use it. Maybe there is a slightly better solution, but I didn't find it yet.

comment:31 Changed 5 years ago by therve

  • Cc ther added
  • Keywords review added

comment:32 Changed 5 years ago by therve

  • Owner dialtone deleted

comment:33 Changed 5 years ago by exarkun

  • Owner set to exarkun
  • Status changed from new to assigned

comment:34 Changed 5 years ago by exarkun

  • Description modified (diff)
  • Keywords review removed
  • Milestone Twisted-10.0 deleted
  • Owner changed from exarkun to therve
  • Status changed from assigned to new
  1. supportsINotify should have a @since tag
  2. All the functions in twisted.python._inotify should have docstrings
  3. twisted.internet.inotify has @since but I guess we've missed 10.0 for this.
  4. There should be a ticket for the errno issue in twisted.python._inotify.remove, referenced from the comment in that function.

If buildbot is green, please merge.

comment:35 Changed 5 years ago by exarkun

  • Author changed from dialtone to exarkun, dialtone
  • Branch changed from branches/inotify-support-972-2 to branches/inotify-support-972-3

(In [28488]) Branching to 'inotify-support-972-3'

comment:36 Changed 5 years ago by therve

  • Resolution set to fixed
  • Status changed from new to closed

(In [28495]) Merge inotify-support-972-3

Authors: dialtone, exarkun, moonfallen, therve
Reviewers: jerub, exarkun
Fixes #972

Add linux inotify support, allowing monitoring of file system events.

comment:37 Changed 5 years ago by glyph

  • Description modified (diff)

I couldn't find a ticket for cross-platform filesystem change notifications. Is there one? If not, somebody should file it. In case somebody does feel like filing it, here are two relevant resources:

comment:38 Changed 5 years ago by exarkun

Oh. That's where I wrote that first review comment. I thought Firefox ate it.

comment:39 Changed 4 years ago by <automation>

  • Owner therve deleted
Note: See TracTickets for help on using tickets.