[Twisted-Python] the good, the log, and the ugly

Glyph glyph at twistedmatrix.com
Sun Mar 25 22:11:52 EDT 2012

Right now, we have a pretty bad logging idiom.  All over Twisted, you'll find stuff like this:

log.msg("twistd %s (%s %s) starting up." % (copyright.version,
log.msg("Loading %s..." % filename)
log.msg("Received unhandled keyID: %r" % (keyID,))

In other words, log messages that take structured information, but then smash it into a not-particularly-distinctive string and spit it out into a line-delimited file.

The Twisted log system has, from its very beginning, supported a better style of formatting, which is:

log.msg(format="twistd %(version)s (%(executable)s %(short)s) starting up.",
        version=copyright.version, executable=sys.executable,
log.msg(format="Loading %(filename)s...", filename=filename)
log.msg(format="Received unhandled keyID: %(keyID)r", keyID=keyID)

There are lots of other features which this style of logging could support, as well, with relatively minor tweaks, such as trivial enhancements like:

def logger(module, audience, importance):
    def logIt(**kw):
        log.msg(module=module, audience=audience, importance=importance, **kw)
    return logIt
info = logger(__name__, ADMIN, NORMAL)
bugreport = logger(__name__, DEVEL, HIGH)
debug = logger(__name__, DEVEL, LOW)
info(format="twistd %(version)s (%(executable)s %(short)s) starting up.",
     version=copyright.version, executable=sys.executable,

This would then allow us to build systems which internally filter messages, avoid expensive string formatting and copying when it's not necessary, tune logging to different levels for different parts of the package hierarchy, and have intelligent real-time log analysis that inspects the objects being logged without having to apply regexes or other static-string-inspection methods to figure out which messages are interesting.  Calendar Server implements some of these features for its logging system, and they've been very useful there.  Various Divmod projects have implemented run-time statistics tracking via that Twisted logging system, and that worked pretty well too.  It would be good to start taking advantage of the good parts of Twisted's logging system within Twisted itself.

Doing this in a consistent way would also open up the possibility of localizing our log messages, which is currently not really plausible.  And also, of logging in a format which lost less information; for example, instead of actually doing textToEventDict, simply dumping a simplified form of the event dict itself to some structured format that could be reliably parsed – something like JSON – for easier parsing and analysis later.

However, in order to be able to take advantage of these features when they're added, we need to start avoiding premature string formatting everywhere that we can today.  If you're writing log messages within twisted (or within a twisted application or library) please adopt the format=, x=, y= style of logging messages so that log observers can actually see into the objects being logged and do interesting things with them.

We should also file some tickets about improving the documentation to indicate this style, for writing a new interface to logging which mandates this style, and for adding layers on top of that to provide more useful functionality like the aforementioned per-module or per-audience filtering.  I have long despaired of filing such tickets because they each seem like boil-the-ocean tasks, changing every single call-site of log.msg (there are more than 500 as of today, just in Twisted itself, and thousands more in other applications I'm sure).

I'm going to try to get started on record my desired behavior for logging as tickets soon though, so Twisted can at least have some really good logging tools eventually.  


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://twistedmatrix.com/pipermail/twisted-python/attachments/20120325/61bc8a48/attachment-0001.htm 

More information about the Twisted-Python mailing list