Opened 3 years ago

Closed 11 months ago

#6749 enhancement closed fixed (fixed)

Port twisted.python.logfile to Python 3

Reported by: multani Owned by: hawkowl
Priority: normal Milestone: Python-3.x
Component: core Keywords:
Cc: Branch: branches/logfilepy3k-6749-3
branch-diff, diff-cov, branch-cov, buildbot
Author: hawkowl, lvh, glyph

Description (last modified by lvh)

This is used by twisted.application.app which in turns is used by twisted.scripts.trial.

Attachments (8)

python3-logfile-6749-1.patch (13.5 KB) - added by multani 3 years ago.
python3-logfile-6749-2.patch (12.7 KB) - added by multani 3 years ago.
python3-logfile-6749-3.patch (13.5 KB) - added by multani 3 years ago.
python3-logfile-6749-3.2.patch (13.5 KB) - added by multani 3 years ago.
python_logfile_py3.patch (1.4 KB) - added by realcr 3 years ago.
Minor syntax changes for python3 porting.
logfilepy3k-6749-4.patch (3.4 KB) - added by multani 2 years ago.
On top of /branches/logfilepy3k-6749 r40528
logfilepy3k-6749-5.patch (3.3 KB) - added by multani 2 years ago.
On top of /branches/logfilepy3k-6749 r40528
logfilepy3k-6749-6.patch (13.0 KB) - added by multani 21 months ago.
Fixes (without the Windows-specific ones) against 24

Download all attachments as: .zip

Change History (39)

comment:1 Changed 3 years ago by multani

I made a first pass available at https://github.com/multani/twisted/tree/python3-logfile-6749

It touches some untested portions of code though, mainly getstate and setstate methods.

I can probably add unit tests for some of those methods to raise the coverage up. Considering #5945, must I also have to add tests for the *state methods?

Changed 3 years ago by multani

Changed 3 years ago by multani

comment:2 Changed 3 years ago by multani

  • Keywords py3k review added

I propose a patch which makes twisted.python.logfile working on Python 2.6 to 3.3 and improves the code coverage to 100%.

There are some things I did which I'm not sure though:

  • I added a test for the default value of DailyLogFile.toDate() which returns today's date. There's a very very tiny chance this test breaks if it happens when the current day changes, but I'm not sure what would be a good way to test this otherwise. This is the XXX in DailyLogFileTestCase.test_toDateDefaultToday().
  • Few of the new tests are doing related to file permissions. I don't have a Windows machine next to me and I'm not sure how they are going to do under Windows.

Regarding my previous comment about testing the __getstate__ and __setstate__ methods, I just added the tests and it looks actually OK to me.

comment:3 follow-up: Changed 3 years ago by itamar

Thanks for the patch!

My initial thoughts, not a full review:

  • Using "/dev/null" won't work on Windows... Once this in a branch someone can run it on our buildbot and we can see what else fails.
  • Using __del__ is usually a bad idea, since it interferes with garbage collection of circular references. It also doesn't add much in terms of functionality, insofar as file objects already have __del__... that's what prints the warning. I would suggest just adding close() calls in the tests.
  • You need to add some __future__ imports: see https://twistedmatrix.com/trac/wiki/Plan/Python3

Changed 3 years ago by multani

comment:4 in reply to: ↑ 3 Changed 3 years ago by multani

Hello itamar,

Replying to itamar:

Thanks for the patch!

My initial thoughts, not a full review:

  • Using "/dev/null" won't work on Windows... Once this in a branch someone can run it on our buildbot and we can see what else fails.

Yeah, I knew this one wasn't working. I'm not sure how to write this test in another way though.

  • Using __del__ is usually a bad idea, since it interferes with garbage collection of circular references. It also doesn't add much in terms of functionality, insofar as file objects already have __del__... that's what prints the warning. I would suggest just adding close() calls in the tests.

OK, so I actually removed the __del__ methods and added close() where it mattered.

Also done.

comment:5 Changed 3 years ago by lvh

  • Owner set to lvh

comment:6 Changed 3 years ago by lvh

  • Author set to lvh
  • Branch set to branches/logfilepy3k-6749

(In [40474]) Branching to 'logfilepy3k-6749'

comment:7 Changed 3 years ago by lvh

  • Keywords review removed

There were a few twistedchecker failures. http://buildbot.twistedmatrix.com/builders/twistedchecker/builds/1453/steps/run-twistedchecker/logs/new%20twistedchecker%20errors

I'll commence fixing them myself and then put it back for review.

comment:8 Changed 3 years ago by lvh

I've made some other upgrades, such as whitespace, docstrings, etc.

There's a test or two that could afford to be like fifteen tests. Maybe some refactoring here and there.

I'm not putting this up for review yet because I'm still figuring out if LogFile.write should open files with an encoding (so you can write unicode), or insist that you write bytes only. I think the answer is the latter, but I'm not super sure :)

Buildbots at: http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/logfilepy3k-6749

comment:9 Changed 3 years ago by exarkun

I'm not putting this up for review yet because I'm still figuring out if LogFile.write should open files with an encoding (so you can write unicode), or insist that you write bytes only. I think the answer is the latter, but I'm not super sure :)

Python 3 porting work is not allowed to change APIs unless it is impossible to avoid.

If you want to add unicode support to the logging system then take a look at #989. :)

comment:10 Changed 3 years ago by lvh

Right. I've opened a mailing list thread about this. I think the conflict stems down to log.msg taking native strings always (so bytestrings in py2 and unicode strings in py3), which means that LogFile.write should take native strings always. However, the Python3 porting plan suggests that "files are always opened in binary mode" should be a review point.

It appears to me that the obvious solution here is that the files should not be opened in binary mode.

comment:11 Changed 3 years ago by lvh

  • Description modified (diff)
  • Owner lvh deleted

Pursuant to recommendation in mailing list thread, I removed the binary modes from the tests.

There is still a test failure related to the use of /dev/null

Also the win7 builder fails with the well-known:

IOError: [Errno 28] No space left on device

There appear to be some more test failures that *are* actual bugs, but I don't know how to fix them or why they break on Windows only:

http://buildbot.twistedmatrix.com/builders/windows7-64-py2.7/builds/3286/steps/iocp/logs/stdio

comment:12 Changed 3 years ago by lvh

I've specialcased the /dev/null case for windows to be NUL. The windows7-64-py2.7-msi buildbot appears happy, but I obviously want extra review for this bit pretty please :)

comment:13 follow-up: Changed 3 years ago by lvh

  • Keywords review added

Please note: there are still two buildbot failures that I simply don't understand, but I think there's enough here to go through review; hopefully someone with more buildbot+windows mojo can tell me what's going on here :)

comment:14 Changed 3 years ago by exarkun

You might want to use os.devnull instead.

comment:15 Changed 3 years ago by lvh

Considered it, but LogFile wants the directory and filename as separate things.

comment:16 in reply to: ↑ 13 Changed 3 years ago by multani

Replying to lvh:

Please note: there are still two buildbot failures that I simply don't understand, but I think there's enough here to go through review; hopefully someone with more buildbot+windows mojo can tell me what's going on here :)

For twisted.test.test_logfile.DailyLogFileTestCase.test_getLog, I fixed it using this:

diff --git twisted/test/test_logfile.py twisted/test/test_logfile.py
index f782b6b..5e64fbf 100644
--- twisted/test/test_logfile.py
+++ twisted/test/test_logfile.py
@@ -451,6 +451,7 @@ class DailyLogFileTestCase(unittest.TestCase):
         log = RiggedDailyLogFile(self.name, self.dir)
         for d in data:
             log.write(d)
+        log.flush()

         # This returns the current log file.
         r = log.getLog(0.0)
@@ -460,11 +461,10 @@ class DailyLogFileTestCase(unittest.TestCase):
         self.assertRaises(ValueError, log.getLog, 86400)

         log._clock = 86401 # New day
-        log.rotate()
         r.close()
+        log.rotate()
         r = log.getLog(0) # We get the previous log
         self.assertEqual(data, r.readLines())
-        log.close()
         r.close()
  • I added the call to  .flush() to get the content of the file
  • I had to close the LogReader before rotating the file, otherwise the reader keeps the logfile opened and Windows complains it can't move an opened file.

As for twisted.test.test_logfile.DailyLogFileTestCase.test_rotatePermissionDirectoryNotOk, I believe it fails because the test fails to set the "read-only" flag on the log directory. Hence, the very first test in .rotate() don't pass (it detects that the log directory is writable) and it then rotates the file, which fails the test. I'm not sure how to make a directory non-writable on Windows such as os.access(directory, os.W_OK) returns False, or if the test should be changed somehow.

comment:17 Changed 3 years ago by realcr

Some minor syntax modifications for python3 porting:

  • using int(...,8) for octal base.
  • Using braces for raise statements.

Patch is included.

Changed 3 years ago by realcr

Minor syntax changes for python3 porting.

comment:18 follow-up: Changed 2 years ago by rwall

  • Keywords review removed
  • Owner set to lvh

Thanks lvh and multani,

This all looks pretty good to me.

Notes:

  1. I ignored attachment:python_logfile_py3.patch from realcr because it doesn't seem to be against the branch for this ticket -- maybe realcr didn't realise there was an existing branch.
  2. Forced a new build (the old build results had disappeared): https://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/logfilepy3k-6749
  3. Merges cleanly
  4. I went through each point in https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist

Points:

  1. Missing newsfile: https://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles
  2. source:branches/logfilepy3k-6749/twisted/python/logfile.py
    1. Coverage: There are a couple of branches which are not fully covered. Consider adding tests for those.
      1. https://buildbot.twistedmatrix.com/builds/twisted-coverage.py/twisted-coverage.py-r40528/twisted_python_logfile.html
    2. DailyLogFile.suffix converts date tuple integers to string using str -- is that ok? Maybe it should be bytes if it's to be used in a filename...then again modern systems allow unicode filenames, right? So perhaps these lines should be changed to map(unicode...)
      1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (point5)
    3. Bytes for filenames in general -- There are quite a few filename manipulations such as 'newpath = "%s.%s" % (self.path, self.suffix(self.lastDate))' which I'd have thought ought to be given a b"" or u"" prefix. Again, I'm not sure
    4. "All calls to open() files should be in binary mode"
      1. I see that lvh commented on this in comment:11
      2. So maybe it's worth adding a comment with a link to the mailing list thread: https://twistedmatrix.com/pipermail/twisted-python/2013-October/027650.html
      3. Also maybe it's worth adding a comment linking to the proposed change to allow unicode log.msg in comment:9 and #989.
  3. source:branches/logfilepy3k-6749/twisted/test/test_logfile.py
    1. "from future import division, absolute_import" is also required in test modules I think.
      1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (6)
    2. Lots of non-binary open() calls -- "All calls to open() files should be in binary mode."
      1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (8)
      2. But maybe the same arguments against binary open() in logfile.py apply here too.
  4. Failing windows tests
    1. https://buildbot.twistedmatrix.com/builders/windows7-64-py2.7/builds/3639/steps/select/logs/problems
    2. I didn't make the change proposed by multani in comment:16. I'll leave that to lvh.
    3. The call to flush() in the test makes sense, I guess,
    4. But having to explicitly close the log file in the test, before rotating it seems wrong to me. Why didn't this failure show up before this branch? And if the close() is required, then it seems to me that it should be done implicitly by log.rotate() method...maybe only on windows. If you agree, then open a ticket about that and link to it from a comment in the code.

Ok that's all. Please resubmit for a final review after answering or addressing the numbered points above and linking to some clean build results. I'll assign this back to lvh, because he has commit access and permission to start builds, but multani may feel like attaching a full patch for the windows test failures.

-RichardW.

comment:19 Changed 2 years ago by multani

As for the type of the paths, I'm also not sure what to do. I was going to go for the "native string" type but I had look at other parts of Twisted, like twisted.python.filepath and it handles everything as byte string instead.

Any opinion on how it should be done?

comment:20 Changed 2 years ago by multani

Thanks rwall for the review.

Replying to rwall:

Thanks lvh and multani,

This all looks pretty good to me.

Points:

  1. Missing newsfile: https://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles

Done.

  1. source:branches/logfilepy3k-6749/twisted/python/logfile.py
    1. Coverage: There are a couple of branches which are not fully covered. Consider adding tests for those.
      1. https://buildbot.twistedmatrix.com/builds/twisted-coverage.py/twisted-coverage.py-r40528/twisted_python_logfile.html

Done.

  1. DailyLogFile.suffix converts date tuple integers to string using str -- is that ok? Maybe it should be bytes if it's to be used in a filename...then again modern systems allow unicode filenames, right? So perhaps these lines should be changed to map(unicode...)
    1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (point5)
  2. Bytes for filenames in general -- There are quite a few filename manipulations such as 'newpath = "%s.%s" % (self.path, self.suffix(self.lastDate))' which I'd have thought ought to be given a b"" or u"" prefix. Again, I'm not sure

So as I said earlier, I'm not sure what to do with those comments. I would personally use native strings, but trying to figure out how other parts of Twisted were doing, I stumbled upon twisted.python.filepath which explicitly ask to use byte-strings for file paths.

One thing which logfile does and filepath doesn't is building up new path from an existing path (for example to compute the name of a rotated log file with a ".1" suffix). Currently, Python 3 doesn't support formatting byte strings, which means we either have to:

  • compute the new path using a non-byte string, which means will have to decode the byte-string which represents the path and then re-encode the whole thing to bytes:
    >>> path = b"/tmp/foo"
    >>> "%s.1" % path
    "b'/tmp/foo'.1"
    >>> b"%s.1" % path
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for %: 'bytes' and 'bytes'
    >>> b"{0}.1".format(path)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'bytes' object has no attribute 'format'
    
    The question is then: which encoding?
  • or we compute new paths using +, which looks ... non-optimal at best.
  1. "All calls to open() files should be in binary mode"
    1. I see that lvh commented on this in comment:11
    2. So maybe it's worth adding a comment with a link to the mailing list thread: https://twistedmatrix.com/pipermail/twisted-python/2013-October/027650.html
    3. Also maybe it's worth adding a comment linking to the proposed change to allow unicode log.msg in comment:9 and #989.

I added a comment explaining the reasoning behind the non-binary-mode opening, which I hopefully correctly understood.

  1. source:branches/logfilepy3k-6749/twisted/test/test_logfile.py
    1. "from future import division, absolute_import" is also required in test modules I think.
      1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (6)

Done

  1. Lots of non-binary open() calls -- "All calls to open() files should be in binary mode."
    1. https://twistedmatrix.com/trac/wiki/Plan/Python3#Reviewerchecklist (8)
    2. But maybe the same arguments against binary open() in logfile.py apply here too.

I think the same reasoning for logfile.py applies also here. As long as #989 is not fixed, I'm not sure it's really interesting to use the binary mode. It's just going to add more burden to the tests.

  1. Failing windows tests
    1. https://buildbot.twistedmatrix.com/builders/windows7-64-py2.7/builds/3639/steps/select/logs/problems
    2. I didn't make the change proposed by multani in comment:16. I'll leave that to lvh.
    3. The call to flush() in the test makes sense, I guess,
    4. But having to explicitly close the log file in the test, before rotating it seems wrong to me. Why didn't this failure show up before this branch? And if the close() is required, then it seems to me that it should be done implicitly by log.rotate() method...maybe only on windows. If you agree, then open a ticket about that and link to it from a comment in the code.

I haven't look at Windows yet. Need to setup everything.

Ok that's all. Please resubmit for a final review after answering or addressing the numbered points above and linking to some clean build results. I'll assign this back to lvh, because he has commit access and permission to start builds, but multani may feel like attaching a full patch for the windows test failures.

-RichardW.

I pushed the modification over lvh branch logfilepy3k-6749-2 on Github at https://github.com/multani/twisted/commits/logfilepy3k-6749-2 I'll push here an updated patch over Subversion's /branches/logfilepy3k-6749 branch with the stuff I fixed.

Changed 2 years ago by multani

On top of /branches/logfilepy3k-6749 r40528

comment:21 in reply to: ↑ 18 Changed 2 years ago by multani

Replying to rwall:

  1. Failing windows tests
    1. https://buildbot.twistedmatrix.com/builders/windows7-64-py2.7/builds/3639/steps/select/logs/problems
    2. I didn't make the change proposed by multani in comment:16. I'll leave that to lvh.
    3. The call to flush() in the test makes sense, I guess,
    4. But having to explicitly close the log file in the test, before rotating it seems wrong to me. Why didn't this failure show up before this branch? And if the close() is required, then it seems to me that it should be done implicitly by log.rotate() method...maybe only on windows. If you agree, then open a ticket about that and link to it from a comment in the code.

Two things here:

  • Why didn't this failure show up before this branch?: it's a new test that I added to raise the code coverage :)
  • the close() is required and OK I think because we are dealing with 2 things here:
    • the LogReader, which is created from the DailyLogFile object, and which is the one which hold the opened file
    • the DailyLogFile which does the rotation

So, unless the DailyLogFile keeps track of the LogReader opened by itself, which sounds like an awful idea, I don't think log.rotate() can do anything with the still-opened file. So in my opinion, the change I propose in comment:16 (about r.close()) makes sense

As for the second test test_rotatePermissionDirectoryNotOk, I have no idea how to set a directory as "read-only" on Windows. It seems we need to play with NTFS security attributes, since I can manage to do it from the Folder Properties GUI, but my knowledge using the Win32 API is beyond zero. We might be able to use SetSecurityInfo to with the right set of parameters, but I'm not sure I'm interested to dive into that actually...

It would also mean that the test would have some kind of condition depending on the plateform: set read-only on *nix with os.chmod(0o444) and do the Win32 Magic Dance to do the same on Windows.

Ok that's all. Please resubmit for a final review after answering or addressing the numbered points above and linking to some clean build results. I'll assign this back to lvh, because he has commit access and permission to start builds, but multani may feel like attaching a full patch for the windows test failures.

I'll resubmit a patch and push the fix for the getLog test on the branch I mentionned above on Github.

Changed 2 years ago by multani

On top of /branches/logfilepy3k-6749 r40528

comment:22 Changed 2 years ago by multani

  • Keywords review added; py3k removed

This should have had the review keyword I think...

comment:23 Changed 22 months ago by glyph

  • Author changed from lvh to lvh, glyph
  • Branch changed from branches/logfilepy3k-6749 to branches/logfilepy3k-6749-2

(In [43405]) Branching to logfilepy3k-6749-2.

comment:24 follow-ups: Changed 21 months ago by glyph

  • Keywords review removed
  • Owner changed from lvh to multani

Hi multani,

Thanks for your contribution.

I merged forward and re-ran the tests.

  1. The Windows tests are still failing in much the same way that they were before.
  2. There are a couple of twistedchecker errors that need fixing.
  3. The .close calls should probably be added with addCleanup, rather than done inline in the test. If the test fails, we don't want to leave the files open.
  4. The "We don't open files in binary mode since…" docstring is somewhat confusing.
    1. The encoding would not necessarily have to be configurable; it could easily be hard-coded to be UTF-8, or maybe even just latin-1.
    2. #989 was partially mitigated, so it's important to make sure that we don't regress, and don't raise errors when unicode is logged. (I don't think it was adequately tested when it was mitigated; if the existing tests already verify that, then nevermind.)
    3. It should use proper epytext markup to link to tickets; U{} with URLs rather than #ticketnumber and bare URLs for the mailing list.
  5. The "XXX" in the comment in test_toDateDefaultToday doesn't add any information; you can just leave the comment as is. Better yet would be to use something like self.patch to replace the today used by the code under test, so that we don't have this small non-determinism it describes.

comment:25 in reply to: ↑ 24 ; follow-up: Changed 21 months ago by multani

Hi Glyph,

thank you for the review!

Replying to glyph:

Hi multani,

Thanks for your contribution.

I merged forward and re-ran the tests.

  1. The Windows tests are still failing in much the same way that they were before.

I'm currently stuck on this.

I see 2 problems:

  1. I don't know how to set a directory as "Read-only" on Windows. I can do it manually using the GUI and have a write inside this directory to fail, but I don't know how to do this programmatically. I posted a question on Stackoverflow to request some help.
  2. This test actually exercises the 2 first lines of LogFile.rotate(), which start like this:
        if not (os.access(self.directory, os.W_OK) and
                os.access(self.path, os.W_OK)):
            return
    

This is never going to work on Windows, since os.access() only check "basic" permission semantics (see the 2nd note), and fails to check the security attributes, which are the only way AFAIK to enforce read-only operations on a directory. A simple example shows the problem:

import os

d = "foo"
f = os.path.join(d, "bar")

print "Access to %r: %s" % (d, os.access(d, os.W_OK))
print "Access to %r: %s" % (f, os.access(d, os.W_OK))

with open(f, "w") as fp:
    fp.write("yop")
print "Wrote"

prints:

> python test.py
Access to 'foo': True
Access to 'foo\\bar': True
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    with open(f, "w") as fp:
IOError: [Errno 13] Permission denied: 'foo\\bar'

This can be replaced by trying to open a (temporary?) file for writing in this directory and then deleting it: if it succeeds, we may pursue the operation, otherwise, just return from LogFile.rotate(). This might be sufficient in this case (there is a race condition, but it existed before, and the whole function should be rewritten so that it doesn't back off in the middle of a os.rename() because permissions on the directory changed).

So, I'll need two things to continue on this ticket:

  • find a way to test this read-only directory on Windows
  • fix the usage of os.access() by something which works on Windows.

All of this is not the subject of this patch though.

comment:26 in reply to: ↑ 25 ; follow-up: Changed 21 months ago by glyph

Replying to multani:

Hi Glyph,

thank you for the review!

Replying to glyph:

Hi multani,

Thanks for your contribution.

I merged forward and re-ran the tests.

  1. The Windows tests are still failing in much the same way that they were before.

I'm currently stuck on this.

I see 2 problems:

  1. I don't know how to set a directory as "Read-only" on Windows. I can do it manually using the GUI and have a write inside this directory to fail, but I don't know how to do this programmatically. I posted a question on Stackoverflow to request some help.

I think this SO question has two possible answers for you:

https://stackoverflow.com/questions/4829043/how-to-remove-read-only-attrib-directory-with-python-in-windows

to wit,

  1. os.system("attrib +R ...")
  2. win32api.SetFileAttributes(..., win32con.FILE_ATTRIBUTE_READONLY)
  1. This test actually exercises the 2 first lines of LogFile.rotate(), which start like this:
        if not (os.access(self.directory, os.W_OK) and
                os.access(self.path, os.W_OK)):
            return
    

This is never going to work on Windows, since os.access() only check "basic" permission semantics (see the 2nd note), and fails to check the security attributes, which are the only way AFAIK to enforce read-only operations on a directory.

This can be replaced by trying to open a (temporary?) file for writing in this directory and then deleting it: if it succeeds, we may pursue the operation, otherwise, just return from LogFile.rotate(). This might be sufficient in this case (there is a race condition, but it existed before, and the whole function should be rewritten so that it doesn't back off in the middle of a os.rename() because permissions on the directory changed).

Yes, calling os.access is almost always the Look Before You Leap anti-pattern. So if this is to be fixed, let's fix it for real and properly handle exceptions from the underlying file I/O.

So, I'll need two things to continue on this ticket:

  • find a way to test this read-only directory on Windows
  • fix the usage of os.access() by something which works on Windows.

All of this is not the subject of this patch though.

Right now, we have tests which are already failing on Windows, so there's no need to add more coverage. Tests for functionality that was not previously tested at all can be skipped in a platform specific way - after all:

  1. we don't support Python 3 on Windows (there is no buildbot for it) and
  2. test coverage will have improved, overall.

I think that for this ticket it would be acceptable to simply skip testing these edge-cases while porting to python 3, and file a separate ticket to add coverage using attrib or SetFileAttributes.

Does this sound like a reasonable way to proceed to you?

comment:27 in reply to: ↑ 24 Changed 21 months ago by multani

Replying to glyph:

Hi multani,

Thanks for your contribution.

I merged forward and re-ran the tests.

  1. The Windows tests [http://buildbot.twistedmatrix.com/builders/twistedchecker/builds/2365/steps/run-twistedchecker/logs/new%20twistedchecker%20errors are still failing] in much the same way that they were before.

I'll address this in the next comment, in response to 26.

2. [http://buildbot.twistedmatrix.com/builders/twistedchecker/builds/2365/steps/run-twistedchecker/logs/new%20twistedchecker%20errors There are a couple of twistedchecker errors] that need fixing.

I believed this is fixed.

  1. The .close calls should probably be added with addCleanup, rather than done inline in the test. If the test fails, we don't want to leave the files open.

I replaced most of the log.close() called by a self.addCleanup(log.close) instead, those I added and those already existing. There are still a few log.close() directly called in the code, which are needed for the assertions which are following (these tests *have* to close the log before checking things.)

  1. The "We don't open files in binary mode since…" docstring is somewhat confusing.
    1. The encoding would not necessarily have to be configurable; it could easily be hard-coded to be UTF-8, or maybe even just latin-1.
    2. #989 was partially mitigated, so it's important to make sure that we don't regress, and don't raise errors when unicode is logged. (I don't think it was adequately tested when it was mitigated; if the existing tests already verify that, then nevermind.)

tl;dr: I propose to open the files in non-binary mode with a UTF-8 encoding. This should not break the regular API for Python 2 and allow to write Unicode log messages with Python 3.


So, I read #989 and AFAIK the only related commit which got into trunk was [13877] which doesn't mitigate much. 15793] (from #1069) allows to log Unicode (with non-ASCII content) strings, although it will probably only log the encode error and not the actual content of the message actually.

But anyway, as far as I understand (feel free to correct me if I'm wrong), t.p.logfile is to be used through and serves as a backend module to t.p.log. My feeling is: it deals more with writing stuff to the disk (stuff being the easiest thing to write, aka. bytes) and rotating and listing log files, this is the job of this module. Receiving the end-user messages, trying to understand them and sending them to the backends is more the role of t.p.log. I may really be wrong though, and we still have to comply with Twisted compatibility policy. Hence the following (and my personal conclusion).

In Python 2: t.p.log itself seems to accept any kind of string as long as it's "ASCII compatible" and sends this to t.p.logfile. If the original message is not "ASCII compatible", t.p.log converts the message into an error and sends this error to t.p.logfile. So in the end, t.p.logfile only gets strings which can be written in a file opened without specifying an encoding (which seems to be ascii in Python 2).

In Python 3: t.p.log happily accepts Unicode messages with non-ASCII characters and sends this to t.p.logfile as Unicode strings. bytes strings given to t.p.log will be converted to Unicode (through repr()) and send the same way as Unicode strings. In this case, t.p.logfile then needs to encode these strings before writing them into the file, by default the encoding depends on the current locale set.

So, considering the above, I would propose to open the log file with a specific encoding. exarkun in https://twistedmatrix.com/pipermail/twisted-python/2013-October/027651.html proposed UTF-8, that seems a good idea to me:

  • it's a superset of ASCII, so it should be compatible with the current behavior when using t.p.log with Python 2. It will have the side effect of allowing to write Unicode strings directly into t.p.logfile; this somehow breaks the API (this was raising an error before) but not that much (people doing this would actually *avoid* doing this since it wasn't working.)
  • it will allow to use Python 3's native string at is full potential. I know that's not a goal, but I don't see a particular reason to limit this potential because it was "incompletely" implemented before (for whatever it meant in Python 2).

At the moment, there are no tests which send non-ASCII data to t.p.logfile. I can certainly add some; this comes to my mind, with the current status (using a simple open() call):

    log.write( "woo\n")
    log.write(u"woo\n")
    log.write(b"woo\n") # TypeError with Python 3

    log.write( "éèà\n")
    log.write(u"éèà\n") # UnicodeEncodeError(to "ascii") with Python 2

Switching to the previous proposition (open non-binary + UTF-8 encoding) gives this:

    log.write( "woo\n")
    log.write(u"woo\n")
    log.write(b"woo\n") # TypeError with Python 3

    log.write( "éèà\n") # UnicodeDecodeError(from "ascii") with Python 2
    log.write(u"éèà\n")

On Python 2, using binary mode instead gives the same result as the 1st example. On Python 3, it breaks for the very first case.


In the end, I'm not really sure what's the best to do. Ideally, I'd like to provide a good balance between taking care of the compatibility while giving "full power" to Python 3, but that's awkward to do and will result in different behavior between the 2 versions anyway.

I haven't done anything in the patch about this. I would really appreciate some advice on how I should proceed next.

  1. It should use proper epytext markup to link to tickets; U{} with URLs rather than #ticketnumber and bare URLs for the mailing list.

That, has been done.

  1. The "XXX" in the comment in test_toDateDefaultToday doesn't add any information; you can just leave the comment as is. Better yet would be to use something like self.patch to replace the today used by the code under test, so that we don't have this small non-determinism it describes.

I mocked the call to time.localtime (I wasn't aware of this particular .patch feature of Trial), removed the comment you referred to and explained why we are actually mocking this call. While doing this, I noticed another case which wasn't covered by the tests (passing arguments to .toDate) so I added another test case to further enhance the coverage.

Changed 21 months ago by multani

Fixes (without the Windows-specific ones) against 24

comment:28 in reply to: ↑ 26 Changed 21 months ago by multani

Replying to glyph:

  1. I don't know how to set a directory as "Read-only" on Windows. I can do it manually using the GUI and have a write inside this directory to fail, but I don't know how to do this programmatically. I posted a question on Stackoverflow to request some help.

I think this SO question has two possible answers for you:

https://stackoverflow.com/questions/4829043/how-to-remove-read-only-attrib-directory-with-python-in-windows

to wit,

  1. os.system("attrib +R ...")
  2. win32api.SetFileAttributes(..., win32con.FILE_ATTRIBUTE_READONLY)

Sadly, this only works on files, not directories (see `FILE_ATTRIBUTE_READONLY` on ''SetFileAttributes function (MSDN)''). That would not be sufficient to support the test on Windows.

[...] This is never going to work on Windows, since os.access() only check "basic" permission semantics (see the 2nd note), and fails to check the security attributes, which are the only way AFAIK to enforce read-only operations on a directory.

[...]

Yes, calling os.access is almost always the Look Before You Leap anti-pattern. So if this is to be fixed, let's fix it for real and properly handle exceptions from the underlying file I/O.

Although you are perfectly right, I was especially talking about os.access() which doesn't fully work on Windows. As I said before, there's, I'm quite sure, no way to be sure if you can write into a directory with os.access().

So, I'll need two things to continue on this ticket:

  • find a way to test this read-only directory on Windows
  • fix the usage of os.access() by something which works on Windows.

All of this is not the subject of this patch though.

Right now, we have tests which are already failing on Windows, so there's no need to add more coverage. Tests for functionality that was not previously tested at all can be skipped in a platform specific way - after all:

  1. we don't support Python 3 on Windows (there is no buildbot for it) and
  2. test coverage will have improved, overall.

I think that for this ticket it would be acceptable to simply skip testing these edge-cases while porting to python 3, and file a separate ticket to add coverage using attrib or SetFileAttributes.

Does this sound like a reasonable way to proceed to you?

Please note that I got half of an answer (and also something quite similar on my side, we both miss the "turn this back how it was before changing the permissions" thing) on my [Stackoverflow question. I, personally, don't feel it's a reasonable way to proceed: that's way too much code only for a one line test, and it's not even complete (as I said before, we should get the Windows-specific code to change the permissions back on how they were before the test.)

Anyway, if skipping this test on Windows only is an acceptable fix for this ticket, then that's a reasonable way to proceed to me :)

comment:29 Changed 12 months ago by _sunu_

  • Keywords review added
  • Owner multani deleted

Adding the review keyword. Please feel free to assign it back to me after review so that I can work on it if any more changes are needed.

comment:30 Changed 11 months ago by hawkowl

  • Author changed from lvh, glyph to hawkowl, lvh, glyph
  • Branch changed from branches/logfilepy3k-6749-2 to branches/logfilepy3k-6749-3

(In [45587]) Branching to logfilepy3k-6749-3.

comment:31 Changed 11 months ago by hawkowl

  • Keywords review removed
  • Owner set to hawkowl

I like this patch. I re-enabled some tests that relied on it and fixed some tiny pyflakes errors, and added a standard topfile.

Will merge.

comment:32 Changed 11 months ago by hawkowl

  • Resolution set to fixed
  • Status changed from new to closed

(In [45598]) Merge logfilepy3k-6749-3: Port twisted.python.logfile to Python 3

Author: multani Reviewers: itamar, lvh, rwall, glyph, hawkowl Fixes: #6749

Note: See TracTickets for help on using tickets.