Ticket #727 (new enhancement )

Opened 4 years ago

Last modified 7 months ago

Compression for rotated log files

Reported by: itamarst Assigned to: therve
Type: enhancement Priority: normal
Milestone: Component: core
Keywords: Cc: itamarst, djk121, therve, exarkun
Branch: branches/compressed-logfiles-727 Author: therve
Launchpad Bug:

Attachments

727_logfile.diff (3.8 kB) - added by therve 2 years ago.

Change History

  2006-08-29 13:29:20+00:00 changed by therve

  • attachment 727_logfile.diff added

  2006-08-29 13:31:20+00:00 changed by therve

  • cc changed from itamarst, djk121 to itamarst, djk121, therve

I tried a simple implementation of this problem. Do not use this, it's just a sample of what can be done. I guess I have to use producer/consumer to create the compressed file.

TODO:

  • use consumer/producer
  • compress DailyLogFile

  2007-12-12 14:10:38+00:00 changed by therve

  • branch set to branches/compressed-logfiles-727
  • author set to therve

(In [22073]) Branching to 'compressed-logfiles-727'

  2007-12-12 14:20:48+00:00 changed by therve

(In [22074]) Add compressed version of LogFile? and DailyLogFile?, for gzip and bz2.

Refs #727

  2007-12-12 14:23:22+00:00 changed by therve

  • owner changed from itamarst to therve

  2007-12-14 09:31:34+00:00 changed by therve

(In [22084]) Try to update both log file and compressed version.

Unfortunately, it doesn't really work for bz2.

Refs #727

  2007-12-14 18:42:17+00:00 changed by therve

(In [22086]) Partial revert of r22084: I'll try something else.

Refs #727

  2007-12-15 18:47:09+00:00 changed by therve

(In [22087]) Use callInThread to call the compression.

Refs #727

  2007-12-15 19:01:34+00:00 changed by therve

(In [22088]) Make some cleanups

Refs #727

  2007-12-15 20:16:21+00:00 changed by therve

  • keywords set to review
  • owner deleted
  • priority changed from normal to highest

This is ready to review.

follow-up: ↓ 12   2007-12-28 22:09:04+00:00 changed by exarkun

  • keywords deleted
  • owner set to therve
  • twisted/python/logfile.py
    • I'm not sure why LogReader ever existed in the first place. Maybe it had something to do with buffering or something, but xreadlines was around a long time ago. It'd be nice to get rid of this and just say getLog returns a file-like object. Maybe that's not really related to this branch (although this does add two new LogReader? subclasses). Anyhow, I'm not even really sure that getLog was ever actually useful for anyone or anything.
    • CompressorHelper seems too clever. A mixin would work just as well here, wouldn't it? Also, mixing in object to classic hierarchies is scary (it causes problems if you ever decide to make the top new-style, which I hope we will decide to do someday).
    • twisted.python can't import stuff from twisted.internet - twisted.internet.reactor import in callInThread is bad. Maybe something with a callInThread method could be required as an __init__ argument? This should make the tests simpler too, since they won't have to subclass/override.
    • The _compress method is a bit scary.
      • it has a lot of code inside a try/except.
      • it imports twisted.python.log
      • there's no behavior for handling errors, just logging it. Not that I have any suggestions for what to do if there's an error. Note that if a callable passed to callInThread raises an unhandled exception, it'll get logged anyway. As an aside, new calls to log.err() should probably all be required to specify something for the _why parameter so they're easier to debug.
    • overall, new docstrings could be a bit more precise in documenting their parameters/return values
    • Some Python installations may not have gzip or bz2. Kinda stupid, but I guess we should think about it.

  2007-12-29 13:54:21+00:00 changed by therve

(In [22186]) Manage missing gzip/bz2.

Refs #727

in reply to: ↑ 10   2007-12-29 14:03:58+00:00 changed by therve

  • cc changed from itamarst, djk121, therve to itamarst, djk121, therve, exarkun

Replying to exarkun:

I'm not sure why LogReader ever existed in the first place. Maybe it had something to do with buffering or something, but xreadlines was around a long time ago. It'd be nice to get rid of this and just say getLog returns a file-like object. Maybe that's not really related to this branch (although this does add two new LogReader? subclasses). Anyhow, I'm not even really sure that getLog was ever actually useful for anyone or anything.

It's useful at least for one thing: tests. I don't think the behavior is really hard to support, so I would prefer to keep it.

CompressorHelper seems too clever. A mixin would work just as well here, wouldn't it? Also, mixing in object to classic hierarchies is scary (it causes problems if you ever decide to make the top new-style, which I hope we will decide to do someday).

I don't think I can use a mixin because the baseClass is different (LogFile and DailyLogFile). At least, I don't know how to do it. The object mixin should just be removed when we'll make the top new-style.

twisted.python can't import stuff from twisted.internet - twisted.internet.reactor import in callInThread is bad. Maybe something with a callInThread method could be required as an __init__ argument? This should make the tests simpler too, since they won't have to subclass/override.

OK, but I'll have to import the reactor at some point anyway, no ?

The _compress method is a bit scary. * it has a lot of code inside a try/except. * it imports twisted.python.log * there's no behavior for handling errors, just logging it. Not that I have any suggestions for what to do if there's an error. Note that if a callable passed to callInThread raises an unhandled exception, it'll get logged anyway. As an aside, new calls to log.err() should probably all be required to specify something for the _why parameter so they're easier to debug.

What do you suggest? Removing the log.err completely? This solution seems to solve all the above points :).

overall, new docstrings could be a bit more precise in documenting their parameters/return values

Right, I'll try to improve that.

Some Python installations may not have gzip or bz2. Kinda stupid, but I guess we should think about it.

Done, I think.

  2007-12-31 10:47:44+00:00 changed by therve

(In [22234]) Add some docstrings.

Refs #727

  2008-01-11 20:45:55+00:00 changed by therve

(In [22312]) Remove callInThread method, pass it as argument.

Refs #727

  2008-01-11 20:48:16+00:00 changed by therve

  • keywords set to review
  • owner deleted

This is ready to review. Looking at it now, the tests are maybe a bit complicated, so I could refactor them if necessary. But that's for what reviews exist, after all :).

  2008-04-20 17:13:20+00:00 changed by glyph

  • keywords deleted
  • owner set to therve

The biggest overall comment:

  • In order to actually be useful, this needs to be both selectable by a twistd command-line option and accessible to applications; really, this implies #2571.

Other, more minor stuff:

  • This does stuff in threads. Why threads? Subprocesses are great, right? Maybe it's because we don't have a good process pool. Sounds like another dependency.
  • This adds more code that uses os.path rather than FilePath
  • Many of the general refactorings in this branch should be separated out from this feature.
  • I think there's a problem with some potential new error conditions being introduced in the compress step but I'm not sure. Threads plus file I/O plus random C APIs definitely makes me uncomfortable though :) The potential race with the os.access LBYL seems like it might be worse if you are spending more time "doing stuff" during compression. What happens when you run out of disk during compression?

  2008-04-20 18:06:53+00:00 changed by therve

  • priority changed from highest to normal

  2008-04-20 18:06:54+00:00 changed by therve

  • priority changed from highest to normal
Note: See TracTickets for help on using tickets.