[Twisted-Python] logging

Christopher Armstrong radix at twistedmatrix.com
Fri Jun 14 23:48:42 MDT 2013


On Fri, Jun 14, 2013 at 6:40 PM, Glyph <glyph at twistedmatrix.com> wrote:
>
> On Jun 14, 2013, at 3:43 PM, Glyph <glyph at twistedmatrix.com> wrote:
>
> I think that we can satisfy all of these requirements by expanding the
> notion of an "identifier" a bit.
>
>
> In our last episode, I mentioned that an identifier should have several
> properties:
>
> human readable (and potentially non-unique) for ease of writing messages and
> reading them to get a general sense of the behavior of a system
> globally unique (and potentially hard to read) for ease of filtering and
> monitoring
> long-term (and potentially unique to humans) for ease of writing tools to
> deal with particular messages, whose exact details may change over time
>
>
> I think we can achieve all of these things by making there be multiple ways
> to identify a log message.  Here's how that could work:
>
> When you are first writing a log message, just do this, as you usually
> would; toss off some crummy english message:
>
> log.info("hey check it out I just {verb}ed a {noun}", verb=request.method,
> noun=request.uri)
>
>
> Asking the author to write a log message with more detail and care than this
> is basically just a recipe for disappointment :).
>
> Later, someone might want to instrument these messages.  We've all agreed
> that re.findall("hey check it out.*") is a crummy way to do this.  The
> obvious thing would be something like this:
>
> tapEventsLike(myObserver, publisher, log_format="hey check it out I just
> {verb}ed a {noun}")
>
>
> That's fine to start with, of course, but it's not very resilient to changes
> in the code; the developer might later change this line to read:
>
> log.info("hey check it out I just {verb}ed a {noun} and got a {result}",
> verb=request.method, noun=request.uri, result=request.response.status)
>
>
> Now the log tap is broken, and there's no reasonable way to fix it; the
> format string doesn't have a place that it could go.
>
> But, the 'log' in the above example is a logger object associated with a
> class or module, and so has 3 persistent pieces of information it can use to
> build a long-term identifier that the event tap can use.  We could reflect
> that in an API like this:
>
> from example.module import log
> tapEventsLike(myObserver, publisher,
>               log_id=log.idForOriginalFormat("hey check it out I just
> {verb}ed a {noun}"))
>
>
> which could be computed offline, and then specified like this, to be even
> more resilient to API changes:
>
> tapEventsLike(myObserver, publisher,
> log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f')
>
>
> That string, of course, is computed like this:
>
>>>> import uuid
>>>> from urllib import urlencode
>>>> uri = "http://twistedmatrix.com/ns/log/?" + urlencode(dict(format="hey
>>>> check it out I just {verb}ed a {noun}", module="example.module"))
>>>> uuid.uuid5(uuid.NAMESPACE_URL, uri)
> UUID('604e45b5-2d41-5ba5-9ae1-7205163c3e3f')
>
>
> (Although if we decide we like this system, we could of course go with a
> helpful command-line tool to generate these for you.)
>
> As you evolve your codebase to take logging more seriously, you could change
> the original 'log.info' call to instead look like this:
>
> verbANoun = log.declare(
>
>     "hey check it out I just {verb}ed a {noun} and got a {result}",
>
>     log_id='604e45b5-2d41-5ba5-9ae1-7205163c3e3f',
>
> )
>
> # ...
>
> verbANoun(verb=request.method, noun=request.uri,
> result=request.response.status)
>
>
> and then monitoring can be accomplished via public API:
>
> from example.module import verbANoun
> verbANoun.tap(myObserver, theLogPublisher)
>
>
> So, after some evolution, our message has 3 identifiers:
>
> the original format string, which was easy to come up with on the spot, and
> need not be unique
> a UUID, that was automatically generated but can be carried around to allow
> for cleaning up the codebase and the message without breaking any code doing
> monitoring
> an API name that the logger and the logee can use to agree on the semantics
> of a particular message
>
>
> Best of all, you can get this even for the plethora of existing log messages
> within Twisted that aren't even using structured values, as we evolve them
> to have more metadata.  You also now have the ability to localize the format
> string, if you want to start presenting these log messages in contexts other
> than log files for system administrators.
>
> Thoughts?

There's... a lot here. I'm overwhelmed by how complex this system
would be, and I think that end users would be as well. I don't really
want to put UUIDs into my source code, whether I type them or paste
them, and I don't want to have to invoke command line tools to figure
out what those UUIDs in order to be a good developer.

I think there's basically no practical benefit over the hierarchical
"system" + event-identifier system, where:

1. it's trivial to specify a hierarchical, easy-to-read "system" key
that has a small-ish scope
2. you think of a unique event name in that small-ish scope and type it out.

I'm beginning to think #1 is the most important of all, but I think
I'll continue to use event-names to describe all my log statements.

As I said in my response to Wilfredo, I think it's really important to
stop specifying the "system" for application code in Twisted. The
application's code should not be considered a part of the HTTP
protocol's system. There are also other problems with the way we use
system, like putting extra random data like request ID in there.

--
Christopher Armstrong
http://radix.twistedmatrix.com/
http://planet-if.com/




More information about the Twisted-Python mailing list