Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#9280 enhancement closed fixed (fixed)

Eliminate legacy logging in twisted.web

Reported by: Jason Litzinger Owned by: Glyph <glyph@…>
Priority: normal Milestone:
Component: web Keywords:
Cc: Branch:
Author:

Description (last modified by Jason Litzinger)

There are still a few spots where the legacy logging system is used within twisted.web, update it to use twisted.logger.

grep -R 'twisted.python import log' *
_auth/wrapper.py:from twisted.python import log
client.py:from twisted.python import log
distrib.py:from twisted.python import log
html.py:from twisted.python import log
http.py:from twisted.python import log
_newclient.py:from twisted.python import log
server.py:from twisted.python import log, reflect, failure, components
server.py:        from twisted.python import logfile
template.py:from twisted.python import log
test/test_static.py:from twisted.python import log
test/test_newclient.py:from twisted.python import log
xmlrpc.py:from twisted.python import log, reflect, failure

My initial motivation for this ticket was here: https://twistedmatrix.com/pipermail/twisted-python/2017-June/031458.html

Change History (24)

comment:1 Changed 4 years ago by Jason Litzinger

Owner: set to Jason Litzinger

If updating the entire package is too broad I can split this into however many tickets are preferred. I've started working on this, but if there is existing WIP I can close this ticket and (if needed) help elsewhere.

comment:2 Changed 4 years ago by Jason Litzinger

An underlying assumption of this ticket is that the use of log.err from twisted.python.log is "bad". As stated in https://twistedmatrix.com/documents/current/core/howto/logging.html, the legacy methods are clients of the new logger, so if their continued use is ok, then some or all of the files in this ticket may not apply (and the ticket itself may be invalid).

Last edited 4 years ago by Jason Litzinger (previous) (diff)

comment:3 Changed 4 years ago by Jason Litzinger

Description: modified (diff)

comment:4 Changed 4 years ago by Jason Litzinger

Keywords: review added
Owner: Jason Litzinger deleted

comment:5 Changed 4 years ago by Glyph <glyph@…>

Owner: set to Glyph <glyph@…>
Resolution: fixed
Status: newclosed

In 2f47ae5f:

Error: Processor CommitTicketReference failed
 does not appear to be a Git repository. See the log for more information.

comment:6 Changed 4 years ago by Jason Litzinger

After looking into another issue and running against current trunk I discovered that this change has the effect of not showing tracebacks that the legacy system would print for critical log levels. The traceback can still be obtained of course by manipulating the failure instance in the event, but it doesn't happen by default.

The legacy system printing is handled by twisted.python.log.py:textFromEventDict().

If this is a regression please feel free to re-open and I'll work on a fix.

comment:7 Changed 4 years ago by Glyph

Keywords: review removed

Yeah that's a pretty serious regression.

Nominally the correct thing for someone to do here is to actually reverse the change onto trunk until a fixed version has been implemented, as per https://twistedmatrix.com/trac/wiki/ReviewProcess#Revertingachange. If someone could do this I would be much obliged.

comment:8 in reply to:  6 Changed 4 years ago by mark williams

Replying to Jason Litzinger:

After looking into another issue and running against current trunk I discovered that this change has the effect of not showing tracebacks that the legacy system would print for critical log levels. The traceback can still be obtained of course by manipulating the failure instance in the event, but it doesn't happen by default.

Are you referring to the following helpful reproducer you shared for the treq issue?

https://gist.github.com/jlitzingerdev/2b010b15d4772a41601e400c17c5386e

If so, that code doesn't activate the new logging system, so it produces no logging output.

Adding the following causes the code to produce a traceback:

diff --git a/test_fbp_err.py b/test_fbp_err.py
index 8be2249..dc39849 100644
--- a/test_fbp_err.py
+++ b/test_fbp_err.py
@@ -18,6 +18,9 @@ from twisted.internet.ssl import (Certificate, KeyPair, PrivateCertificate,
 from twisted.internet.interfaces import IHostnameResolver, IHostResolution
 from twisted.internet.address import IPv4Address
 
+import sys
+from twisted.logger import globalLogPublisher, textFileLogObserver
+
 rootKey, intermediateKey, serverKey = tuple(
     asymmetric.rsa.generate_private_key(public_exponent=65537,
                                         key_size=2048,
@@ -137,6 +140,8 @@ host = b'https://localhost:8080/'
 #agent = client.Agent(reactor, #contextFactory=cf)
 agent = client.Agent(reactor)
 
+globalLogPublisher.addObserver(textFileLogObserver(sys.stdout))
+
 d = agent.request(
     b'POST',
     host,

I don't think this is a regression. I think it's an indication of how twisted.logger's documentation should be improved.

comment:9 Changed 4 years ago by Jason Litzinger

Mark, That's the one, glad it was helpful. The regression I was concerned about was specifically that, without having to do anything, the legacy system produced a traceback, the modifications to use the new system do not.

On the subject of whether it is a regression, I support whatever the Twisted team decides.

For my two cents -- as a user of twisted (which I am as part of my day job), as long as it was clear I had to activate the new system to get tracebacks I'd be fine, though I might curse for a moment. Regardless, I support and will help however I can. If that support means update the docs as part of this then no problem.

If this is reverted, the question becomes how to handle the migration from legacy to new logging system, specifically, should use of the new system emulate the old by ensuring the traceback is in the print by default? If so, that is technically another option here, though I understand if revert and then make a subsequent change is more desirable for users.

Sorry if this seems somewhat hurried, I won't be able to jump back to this until tonight but wanted to chime in.

comment:10 Changed 4 years ago by Jason Litzinger

To elaborate:

Instead of just (which is all that is shown minus any prints in my gist):

Unexpected exception from twisted.web.client.FileBodyProducer.stopProducing

A user would also see a traceback.

Unexpected exception from twisted.web.client.FileBodyProducer.stopProducing: Traceback (most recent call last):
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/_newclient.py", line 975, in dispatcher
    return func(*args, **kwargs)
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/_newclient.py", line 1650, in _connectionLost_TRANSMITTING
    self._currentRequest.stopWriting()
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/_newclient.py", line 876, in stopWriting
    _callAppFunction(self.bodyProducer.stopProducing)
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/_newclient.py", line 197, in _callAppFunction
    f = Failure(e)
--- <exception caught here> ---
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/_newclient.py", line 195, in _callAppFunction
    function()
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/web/client.py", line 1087, in stopProducing
    self._task.stop()
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/internet/task.py", line 497, in stop
    self._checkFinish()
  File "/home/jlitzing/dev/twisted/twisted-fork/src/twisted/internet/task.py", line 507, in _checkFinish
    raise self._completionState
twisted.internet.task.TaskStopped:

This isn't a terrible add, a hacked together example (not what I'd submit exactly):

--- a/src/twisted/web/_newclient.py
+++ b/src/twisted/web/_newclient.py
@@ -193,10 +193,12 @@ def _callAppFunction(function):
     """
     try:
         function()
-    except:
+    except Exception as e:
+        f = Failure(e)
         _moduleLog.failure(
-            u"Unexpected exception from {name}",
-            name=fullyQualifiedName(function)
+            u"Unexpected exception from {name}: {text}",
+            name=fullyQualifiedName(function),
+            text=f.getTraceback()
         )
 

comment:11 Changed 4 years ago by Glyph

That is ... definitely not right. If you're using .failure and there's an exception in flight, a traceback should be logged. It is not the code emitting the log message's responsibility to call getTraceback itself.

comment:12 Changed 4 years ago by Jason Litzinger

If you're using .failure and there's an exception in flight, a traceback should be logged. It is not the code emitting the log message's responsibility to call getTraceback itself.

It will, if the new log system is activated. However, the case I raised as a concern is when one does not activate it...before, a traceback was logged, now, one will not be logged. Assuming that behavior is desirable, it will require more than just calling .failure as I did in my changes.

comment:13 Changed 4 years ago by Jason Litzinger

Later this evening I will:

  1. Prepare a branch with this entire series reverted. Worst case nobody uses it, but since I made the mess I'd like to assist cleaning it up.
  1. Investigate the exact mechanisms resulting in the different behavior for the gist above and describe it here. Hopefully this leads to a solution that assists in future transitions from the legacy to the new system within Twisted.
Last edited 4 years ago by Jason Litzinger (previous) (diff)

comment:14 in reply to:  12 Changed 4 years ago by Glyph

Replying to Jason Litzinger:

It will, if the new log system is activated. However, the case I raised as a concern is when one does not activate it...before, a traceback was logged, now, one will not be logged. Assuming that behavior is desirable, it will require more than just calling .failure as I did in my changes.

What does "activated" mean?

comment:15 Changed 4 years ago by Jason Litzinger

"activated" in the context I was using it means explicitly adding an observer as Mark commented above:

globalLogPublisher.addObserver(textFileLogObserver(sys.stdout))

comment:16 in reply to:  15 Changed 4 years ago by Glyph

Replying to Jason Litzinger:

"activated" in the context I was using it means explicitly adding an observer as Mark commented above:

globalLogPublisher.addObserver(textFileLogObserver(sys.stdout))

Aah. Sorry for my lackadaisical reading of the history.

comment:17 Changed 4 years ago by Jason Litzinger

A branch containing the reversion of this enhancement is here: https://github.com/jlitzingerdev/twisted/tree/revert-9280

However, I have not finished reviewing the changes so I wouldn't consider it "ready". The revert was generated by:

git revert -m 1 2f47ae5f

Cheers, will add a better description of why the behavior is different (if nothing more than to close the loop here) later.

comment:18 Changed 4 years ago by Jason Litzinger

What happened?

Summary

twisted.python.log uses helpers under the hood to concatenate the provided log message with the traceback from the accompanying failure before the event is published. From there, the resulting text is stored in the "log_format" key of the emitted event and logged to sys.stderr via one of globalLogPublisher's default observers. The changes for 9280 call twisted.logger.Logger.failure which does not, by itself, ensure a traceback is included in the "log_format" key of the event. Adding one of the many provided observers to the globalLogPublisher will ensure a traceback is printed.

Relevant Legacy Details
  • twisted.logger._legacy.publishToNewObserver is used to publish each logged message.
  • The third parameter of publishToNewObserver is a function that is used to format the message in the "old style", the callable provided is twisted.python.log.textFromEventDict().
  • In this case, textFromEventDict() concatenates the traceback of the accompanying failure to the provided log message.
  • The value returned by textFromEventDict() becomes the "log_format" key in the event dictionary when the event is published.
  • In addition to the LimitedHistoryLogObserver described in the twisted.logger documentation (http://twistedmatrix.com/documents/current/core/howto/logger.html#starting-the-global-log-publisher), a FileLogObserver writing to sys.stderr wrapped in a FilteringLogObserver that logs critical errors.
  • The formatEvent callable is a lambda that calls twisted.logger._format.formatEvent, which in this case formats the "log_format" key of the event dictionary.
  • Because a traceback has already been appended to the "log_format" key, a traceback is printed.
Changes in 9280

The modified code calls the .failure method of a module scope twisted.python.Logger instance but does not do anything to ensure the traceback is concatenated to the provided log message. This results in no traceback being included in the "log_format" key, which is printed to sys.stderr as described above.

Interesting side effects of the existing code (i.e. 17.9.0, before these changes).

comment:19 Changed 4 years ago by Glyph

Thank you very much for your attention to detail! This is an excellent writeup.

I am suspecting more and more that this should not in fact be reverted, and that (despite the regression appearing here with the migrated log messages) the real problem is entirely in twisted.logger, and should be fixed there. Under this definition of "regression", any usage of logger.failure at all would be a "regression" in some sense since it might prevent a traceback from being printed where one was previously.

Furthermore, the new logging system is, I believe, activated by twist and twistd, so we will see the tracebacks in those cases, yes?

comment:20 Changed 4 years ago by Jason Litzinger

Thanks, sorry for all the noise on this.

I am suspecting more and more that this should not in fact be reverted, and that (despite the regression appearing here with the migrated log messages) the real problem is entirely in twisted.logger, and should be fixed there. Under this definition of "regression", any usage of logger.failure at all would be a "regression" in some sense since it might prevent a traceback from being printed where one was previously.

I tend to agree. In the twisted.logger, the concatenation of the message and traceback is done at event presentation (e.g. textFileLogObserver), rather than implicitly as part of event construction. The problem here is that the default presentation mechanism (a lambda calling formatEvent) only logs the log_format key. Fixing this isn't quite so simple because in cases where the legacy system is used a duplicate traceback would appear.

Furthermore, the new logging system is, I believe, activated by twist and twistd, so we will see the tracebacks in those cases, yes?

I want to say yes...but I'd rather be sure. I will work up some proof of this either tonight or, more likely, tomorrow.

Last edited 4 years ago by Jason Litzinger (previous) (diff)

comment:21 Changed 4 years ago by Jason Litzinger

Since twisted.logger and I have gotten to know each other so much better of late, I'll happily see if I can propose a solution that:

  1. Shows tracebacks by default.
  2. Avoids showing duplicate tracebacks.

However, just to restate and validate the requirement -- a user who does not in any way configure the new log subsystem should see tracebacks for critical errors.

correct?

Should I open a new ticket, or track it here?

comment:22 in reply to:  21 Changed 4 years ago by Glyph

Replying to Jason Litzinger:

Since twisted.logger and I have gotten to know each other so much better of late, I'll happily see if I can propose a solution that:

  1. Shows tracebacks by default.
  2. Avoids showing duplicate tracebacks.

However, just to restate and validate the requirement -- a user who does not in any way configure the new log subsystem should see tracebacks for critical errors.

Yep. And I thought we had this, so I look forward to learning about how we screwed it up on the new ticket :).

correct?

Yes.

Should I open a new ticket, or track it here?

New ticket, I think.

comment:23 Changed 4 years ago by Jason Litzinger

Furthermore, the new logging system is, I believe, activated by twist and twistd, so we will see the tracebacks in those cases, yes?

Yes, an observer is added to the globalLogBeginner in both cases. I also hacked (I was lazy) the demo web server to throw an exception when a GET was issued and confirmed a traceback was printed.

I'll file a new ticket and link to it here for the issue discussed above.

comment:24 Changed 4 years ago by Jason Litzinger

New ticket #9333

Note: See TracTickets for help on using tickets.