Opened 11 years ago

Closed 5 years ago

Last modified 5 years ago

#307 enhancement closed fixed (fixed)

(don't) switch to logging.py / add support for logging.py

Reported by: glyph Owned by: vsajip
Priority: low Milestone:
Component: core Keywords: documentation
Cc: glyph, exarkun, spiv, acapnotic, itamarst, etrepum, moshez Branch:
Author: Launchpad Bug:

Description


Change History (12)

comment:1 Changed 11 years ago by glyph

I don't think we should do anything about this, but this issue should serve to
archive the case for and against, and perhaps the discussion will come to some
resolution as to how this should be architected.

(Sorry for the long initial nosy list, I tried to remember off the top of my
head who had been interested in the issue in the past.  Please remove yourself
if it's not relevant.)

The system in question is the new "logging" module in 2.3, described at

http://www.red-dove.com/python_logging.html

and at

http://www.python.org/doc/2.3.1/lib/module-logging.html

comment:2 follow-up: Changed 11 years ago by glyph

Here are my objections:

1. my most major one: the .ini configuration language, of which there is a big,
juicy example on the red-dove page, is totally gross.  Not only is it a classic
example of inappropriate use of a half-assed sub-language for configuration, the
syntax is confusing enough to rival apache.conf - the comments in the example
file are long, and mostly of the form, "This isn't X, as you think it might be,
but Y, which is similar but different in the following ways..."

2. Blocking.  One feature that is often touted in favor of the use of the
logging module is the multiple backends it supports, most of which are
networked.  These are no good because they all block; we'd need to rewrite them
using Twisted anyway, and we might as well implement to our own (simpler) API,
that can cope with errors and use Deferreds and such.  There are also a few uses
of threading which would be outside the Twisted threadpool system.  (Our own
threads are bad enough.  I don't want to debug thread interactions with the
logging system.)

3. No support for structured objects.  This leads to 2 problems:

3a. No support for system metrics.  Because you can't log structured data,
there's no way to install an observer which watches all events of a particular
type and aggregates numerical data associated with them as the server is running.

3b. Hard to avoid string formatting work when unnecessary.  Ideally, systems
which want to log a lot in debug mode and log only a little in production mode
ought to be able to call the logging system with a format string and some
description data, and then let it decide whether to do work or not.  However,
with the "logging" module, you need to do most of the string formatting work up
front, passing a complete message to the system, rather than a message format
and some arguments to populate it with in the event that it should actually be
logged.

(Some people have complained to me that it actually *does* support structured
data, considering that you can subclass LogRecord, but that does not look well
supported and LogRecord itself is ugly.  8 attributes for every log message? 
name, lvl, pathname, lineno, msg, args, exc_info ...?  That looks dangerously
introspective to me.  Also, LogRecord could easily have been a dict rather than
an instance; logging tends to be HIGHLY performance critical so little details
like that bother me.)

4. Hard-coded log levels.  This is a more personal objection, but I've never
seen a sensible way to distinguish between messages which should be "INFO" and
which should be "DEBUG".  It seems to boil down to personal style differences,
which means that in the same application you will get different levels of
verbosity from the logging system depending on which programmer was coding. 
Also, I dislike the workflow that a "DEBUG" loglevel encourages.  Information
that you need while initially debugging an application should be *removed*;
leaving it in is like (although obviously not as bad as) releasing your code
with 'pdb.set_trace()' calls in it.

Our current logging system is simple and elegant.  It needs to be better
documented, and perhaps broken down into a hierarchy to avoid spurious
notifications (for slightly more efficiency).

One good thing that the logging module has going for it that we don't is the
convention of using
  log = logging.getLogger("MyModule")
rather than just
  from twisted.python import log

Please feel free to post messages to this issue for further discussion.

comment:3 Changed 8 years ago by exarkun

  • Component set to conch
  • Resolution set to fixed
  • Status changed from new to closed

I moved these comments to TwistedLogging so that I could resolve this ticket.

comment:4 Changed 8 years ago by exarkun

  • Component changed from conch to core
  • Keywords documentation added

comment:5 in reply to: ↑ 2 Changed 5 years ago by vsajip

  • Resolution fixed deleted
  • Status changed from closed to reopened

Replying to glyph:

Here are my objections:

  1. my most major one: the .ini configuration language, of which there is a big,

juicy example on the red-dove page, is totally gross. Not only is it a classic
example of inappropriate use of a half-assed sub-language for configuration, the
syntax is confusing enough to rival apache.conf - the comments in the example
file are long, and mostly of the form, "This isn't X, as you think it might be,
but Y, which is similar but different in the following ways..."

You don't have to use the configuration system at all - if you don't import logging.config, none of it gets loaded. You can configure the logging system programmatically, and if you want config files but can't stomach ConfigParser or the particular schema that logging.config uses, you can use your own. The logging.config functionality is part of "batteries included" and so provides a reasonable base set of functionality which is more suited for novice/casual users than sophisticated users like the Twisted developers.

  1. Blocking. One feature that is often touted in favor of the use of the

logging module is the multiple backends it supports, most of which are
networked. These are no good because they all block; we'd need to rewrite them
using Twisted anyway, and we might as well implement to our own (simpler) API,

But you haven't got any alternatives from what I could see from the Twisted documentation - the basic logObserver described writes to a file. More importantly, I don't see how you get from "these handlers don't work for us because they block, we'd have to write some new ones" to "let's not use Python logging at all (because some of the handlers included block)". Plus you do supply a Python logging observer anyway, though do warn users against using blocking handlers. The users which use the PythonLoggingObserver functionality - which handlers do they use?

that can cope with errors and use Deferreds and such. There are also a few uses
of threading which would be outside the Twisted threadpool system. (Our own
threads are bad enough. I don't want to debug thread interactions with the
logging system.)

What uses of threading? Logging doesn't spawn any threads. It does use threading locks so that people who use threads don't have to worry about threading issues. Each handler instance creates a lock in __init__() (via createLock(), which you can override) and each event "emit" operation is bracketed by an acquire() and release() method of the handler (which delegates to the lock). If you subclass and make these no-ops, then there are no threading interactions of any kind, as long as the handler's emit() doesn't have any mutable state other than on the stack.

  1. No support for structured objects. This leads to 2 problems:

3a. No support for system metrics. Because you can't log structured data,
there's no way to install an observer which watches all events of a particular
type and aggregates numerical data associated with them as the server is running.

3b. Hard to avoid string formatting work when unnecessary. Ideally, systems
which want to log a lot in debug mode and log only a little in production mode
ought to be able to call the logging system with a format string and some
description data, and then let it decide whether to do work or not. However,
with the "logging" module, you need to do most of the string formatting work up
front, passing a complete message to the system, rather than a message format
and some arguments to populate it with in the event that it should actually be
logged.

(Some people have complained to me that it actually *does* support structured
data, considering that you can subclass LogRecord, but that does not look well
supported and LogRecord itself is ugly. 8 attributes for every log message?
name, lvl, pathname, lineno, msg, args, exc_info ...? That looks dangerously
introspective to me. Also, LogRecord could easily have been a dict rather than
an instance; logging tends to be HIGHLY performance critical so little details
like that bother me.)

Ordinarily, in at least some of their scripts, people find it useful to get things like filename, lineno etc. and this requires some introspection. Just because you don't want to use it, doesn't mean it shouldn't be there for everyone else who does want it. And if you don't want it, fine - there are mechanisms for avoiding all that expensive introspection. If you set logging._srcFile to None, logging doesn't collect caller info (filename, lineno, funcname, module etc.). If you set logging.logThreads to 0, it won't bother to get the current thread identity. If you set logging.logProcesses to 0, it won't collect process ID information.

As far as performance goes, you presumably know what <a href="http://en.wikipedia.org/wiki/Program_optimization#When_to_optimize">Knuth said</a>. Any evidence that LogRecord not being a dict is causing a (quantitatively) perceptible problem? If so, I'll certainly look into it.

If you think subclassing LogRecord is not well supported, please provide some specific evidence for this. What do you need that it doesn't do?

Plus, there is another way in which you can use logging. Instead of passing message strings and perhaps-expensive-to-compute parameters, just pass <strong>any object</strong> which will respond to __str__() and return a suitable string when it's time to do so. And if you use a suitable handler and formatter, the thing need never be converted to text at all (for example, SocketHandler just pickles and sends the LogRecord).

This would appear to cater for a generic "structured data" requirement.

  1. Hard-coded log levels. This is a more personal objection, but I've never

seen a sensible way to distinguish between messages which should be "INFO" and
which should be "DEBUG". It seems to boil down to personal style differences,
which means that in the same application you will get different levels of
verbosity from the logging system depending on which programmer was coding.
Also, I dislike the workflow that a "DEBUG" loglevel encourages. Information
that you need while initially debugging an application should be *removed*;
leaving it in is like (although obviously not as bad as) releasing your code
with 'pdb.set_trace()' calls in it.

Okay, so just use INFO and ERROR levels to map to your msg() and err(). Some people want more levels, and experience has shown that the 5 levels provided strike a reasonable balance. You don't really have to use the other levels, and they won't get in your way. And in what sense are they hardcoded? It seems to me that it's just as valid to say that msg() and err() effectively hardcode two levels. And apart from the debug()}}, {{{info(), warning() etc. convenience methods, there's also a generic log() method which takes a level argument.

Our current logging system is simple and elegant. It needs to be better
documented, and perhaps broken down into a hierarchy to avoid spurious
notifications (for slightly more efficiency).

So that's Not Invented Here, then? Are you saying that the design of the logging package is not simple and elegant? Please give particulars, remembering that design is not a black and white process. You're saying that Twisted has no warts? Or if it does, that's a good reason for no-one to use it?

Python logging fits in 3 source files at under 1500 SLOC and does a hell of a lot, even though I do say so myself. Compare log4j: for approximately the same functionality, or perhaps slightly more, it weighs in at around 168 source files and around 16K SLOC. Almost all of the functionality I mentioned above has been in the package from the first release (Python 2.3).

I feel I have given a reasonable rebuttal to all your points, but I'm happy to continue the discussion. There's <a href="http://wiki.python.org/moin/LoggingPackage">a page</a> on the Python Wiki which is therefore discussions/complaints about the logging package, please feel free to visit there, look at other criticisms and the points I make about them, and update that page with additional points if you deem it necessary.

Reopening this ticket.

comment:6 Changed 5 years ago by zooko

You folks might also be interested in foolscap logging -- http://foolscap.lothar.com/docs/logging.html -- it is asynchronous (twisted based) including a remote protocol, it is structured object based, and it has a slew of interesting features such as "triggered event logging", where when something really bad happens then it writes down all of the mundane things that had been happening a second before, but when nothing bad happens then it doesn't waste everyone's bandwidth writing down every mundate thing.

comment:7 Changed 5 years ago by glyph

  • Owner changed from glyph to vsajip
  • Status changed from reopened to new

comment:8 Changed 5 years ago by exarkun

So, what resolution can we hope for on this ticket? I'm specifically motivated to ask this because vsajip is now the owner, and I didn't get the impression from his comment that he was interested in actually writing any code for Twisted. Combining that with the fact that he seems to have responded to the points against logging.py earlier on the ticket, I'm left unsure what he will do to move this forward.

As vsajip pointed out, since this ticket was closed, we did add support for passing messages from twisted.python.log into the logging system. That's some kind of "support" for logging.py. What more support is implied by this ticket?

And what would "switching" entail? We need to preserve the twisted.python.log API for backwards compatibility. Should it just back-end directly onto logging? What benefits does this confer? It sounds like the two systems are roughly at feature parity. Perhaps the benefit is less code to maintain?

Let's try to define some specific goals which can be accomplished so that this ticket doesn't just sit open forever.

comment:9 Changed 5 years ago by glyph

My understanding is that a resolution of this ticket would be to deprecate twisted.python.log and encourage people to use the logging module instead, replacing all internal usage in Twisted. The chief motivating argument that I can see right now would be to reduce the amount of code we're maintaining, but logging sounds like it might have more features than twisted.python.log as well.

That said, I'm not completely sold that this would be a good idea. I'm happy to hear that the logging module isn't as bad as I thought, but I still aesthetically prefer Twisted's own thing. I think there are twisted-y, networky things that we might do one day that would be easier if we had our own logging infrastructure. However, I've been thinking that for several years and it's never actually happened; the closest thing is the foolscap stuff that zooko mentioned, which is yet another logging system unrelated to these two. So I'm willing to consider patches which do this, assuming logging is in fact a feature-complete replacement.

I reassigned this to vsajip because he seems to have a vested interest in it. I got the impression that he was potentially interested in writing some logging code for Twisted, which is why I pointed him at this ticket. If he doesn't make something happen on this front, I'm certainly not going to :-).

So, to be explicit: vsajip, if you do not eventually intend to write both a detailed description of why it would be worthwhile to do the work to end-of-life our own logging system, as well write the not-insubstantial code for Twisted to help this ticket get resolved, please close it again. As far as I know, nobody on the Twisted core team is motivated to work on it. (Of course if you do write some code I would not be opposed to reviewing it and providing feedback.)

comment:10 Changed 5 years ago by vsajip

  • Resolution set to fixed
  • Status changed from new to closed

Glyph,

I reopened this ticket because you said (here):

It's quite likely that those comments are badly out of date by now. If you'd like to re-open that ticket, it's always possible that we could drop our logging system entirely in favor of Python standard logging :).

I didn't get the sense from that paragraph (or anywhere else in your comment) that you were expecting me to do any work other than commenting on it. I only flagged it up because in another, pre-Tornado post you made about logging in general, you gave a long wish-list of things you thought a good logging system should have, and then said

I'm not aware of any logging system that can already do these things.  We'd have to write a new one.

So I was trying to set the record straight and update the state of your awareness ;-), which I hope I have done by commenting here on this ticket.

I quite understand the exigencies of backward compatibility, and I am in no way suggesting that I want to see twisted.log replaced root and branch by Python's built-in logging. That would appear to go against "pragmatism beats purity" and involve a lot of work for no particular (practical) benefit. As exarkun has pointed out, your inclusion of the PythonLoggingObserver provides a bridge to Python logging - and I don't see that any more is necessary unless your users are asking for it.

I just want to be sure that if you don't want to use Python logging, it's not because it's somehow deficient in the main - rather, it's because you don't need the features it offers, don't want backward compatibility headaches and feel that it doesn't mesh with your aesthetic sensibilities. All of which is fine by me. I'm just engaging with you to make sure there's nothing I can readily do to improve Python logging for your needs - just the type of engagement you wish had happened between the Twisted team and the FriendFeed guys when Tornado was just a gleam in their eye.

Closing this ticket again :-)

comment:11 follow-up: Changed 5 years ago by glyph

vsajip,

I guess it's a good thing that I made it explicit, then :).

I appreciate the new information. If you have more thoughts about Twisted and stdlib logging that you feel are relevant to the implementation of Twisted, you may feel free to add further comments to this ticket without reopening it :).

We do have certain requirements for backwards compatibility, and twisted.python.log is an old module, but it is by no means impossible to replace. I just want you to know that we do have a compatibility policy and that it is possible to change just about anything. All else being equal, "maintaining less code" is a good reason to remove things. Despite the headaches caused by the additional dependency, I'm glad every day that we dropped our crummy home-built component system and switched to Zope Interface.

Do you go to PyCon? I feel like we should get together and talk a bit more about what logging offers that Twisted might benefit from, regardless of this ticket. (Specifically, Twisted capturing logs from applications and libraries that log using logging: integration in the other direction from what we've got now.)

comment:12 in reply to: ↑ 11 Changed 5 years ago by vsajip

Replying to glyph:

I appreciate the new information. If you have more thoughts about Twisted and stdlib logging that you feel are relevant to the implementation of Twisted, you may feel free to add further comments to this ticket without reopening it :).

Sure thing.

We do have certain requirements for backwards compatibility, and twisted.python.log is an old module, but it is by no means impossible to replace. I just want you to know that we do have a compatibility policy and that it is possible to change just about anything. All else being equal, "maintaining less code" is a good reason to remove things. Despite the headaches caused by the additional dependency, I'm glad every day that we dropped our crummy home-built component system and switched to Zope Interface.

That's good to know. But in these matters, it's all a question of priorities, right? There's no need to do anything about it until (if ever) one of you feels that twisted.log maintenance is too much work.

Do you go to PyCon? I feel like we should get together and talk a bit more about what logging offers that Twisted might benefit from, regardless of this ticket. (Specifically, Twisted capturing logs from applications and libraries that log using logging: integration in the other direction from what we've got now.)

I'm based in Europe and so far have been unable to justify the cost and time for visiting PyCon, though I'd surely love to. I haven't even managed one of the European Python conferences yet :-(

I agree it would be good to talk. If you're ever in London, do let me know and we can meet up for a drink and a chat.

Best wishes,

Vinay Sajip

Note: See TracTickets for help on using tickets.