Opened 15 years ago

Closed 15 years ago

#2643 enhancement closed (fixed)

twisted.python.modules does not account for memory/disk inconsistencies when scanning packages

Reported by: Glyph
Owned by: therve
Priority: highest
Milestone:
Component: core
Keywords:
Cc: therve
Branch:
Author:

Description

This script:

from twisted.python.modules import walkModules
import mx.DateTime
print list(walkModules())

results in this error:

Traceback (most recent call last):
  File "fbi.py", line 6, in <module>
    print list(walkModules())
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 729, in walkModules
    for module in package.walkModules(importPackages=False):
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 174, in walkModules
    for module in package.walkModules(importPackages=importPackages):
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 174, in walkModules
    for module in package.walkModules(importPackages=importPackages):
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 173, in walkModules
    for package in self.iterModules():
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 125, in iterModules
    for placeToLook in self._packagePaths():
  File "/home/glyph/Projects/Twisted/trunk/twisted/python/modules.py", line 418, in _packagePaths
    for fn in self.load().__path__:
AttributeError: 'module' object has no attribute '__path__'

_packagePaths assumes that anything which appears to be a package on disk will load as a package object in memory. That assumption is wrong for a few reasons:

  • multiple sys.path entries can confuse the module loader
  • a module can actually end up being anything it wants from namedAny, so you can't depend on any particular fixed contract
  • even "real" packages might delete their __path__ for some reason

In mxDateTime's case, the module is extremely poorly behaved (it's rife with import * statements, two of which are nested, resulting in a package clobbering its own name), but the general problem is still tpm's fault.

This manifests as a test failure with the above traceback if you have pypgsql and mxDateTime installed and you run the full suite. It doesn't show up in other configurations because tpm never tries to load the mxDateTime packages in them.
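The disk/memory mismatch can be reproduced without mx installed. The following is a minimal sketch in Python 3, assuming a throwaway package (the name clobber_demo is hypothetical) whose __init__.py swaps its own sys.modules entry for a plain module. This is a simpler mechanism than mx's nested import * statements, but it produces the same symptom: a directory that looks like a package on disk, and a loaded object with no __path__.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name, used only for this illustration.
PACKAGE_NAME = "clobber_demo"

# An __init__.py that replaces its own sys.modules entry with a plain module,
# standing in for whatever a badly behaved package might do at import time.
INIT_SOURCE = (
    "import sys, types\n"
    "sys.modules[__name__] = types.ModuleType(__name__)\n"
)


def build_and_import(root):
    """Create the package under root, import it, and return what was loaded."""
    package_dir = os.path.join(root, PACKAGE_NAME)
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write(INIT_SOURCE)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(PACKAGE_NAME), package_dir
    finally:
        sys.path.remove(root)


with tempfile.TemporaryDirectory() as root:
    loaded, package_dir = build_and_import(root)
    # On disk: a directory containing __init__.py, i.e. a package.
    looks_like_package_on_disk = os.path.isfile(
        os.path.join(package_dir, "__init__.py"))
    # In memory: the object the import system handed back.
    has_path_in_memory = hasattr(loaded, "__path__")

# Leave the interpreter's global state as we found it.
sys.modules.pop(PACKAGE_NAME, None)

print(looks_like_package_on_disk, has_path_in_memory)
```

Any code that decides "this is a package" from the filesystem and then reads __path__ from the loaded object would raise AttributeError here, as in the traceback above.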

Change History (10)

comment:1 Changed 15 years ago by therve

Cc: therve added

comment:2 Changed 15 years ago by therve

Keywords: review added
Owner: changed from Glyph to radix
Priority: changed from high to highest

This may be fixed in the modules-path-2643 branch. At least the reported problem doesn't happen anymore, but I'm not sure I understand the issue well enough.

radix, I'm sure you'd love to review this.

comment:3 Changed 15 years ago by radix

Keywords: review removed
Owner: changed from radix to therve

I guess this is OK.

Really this whole situation is bad. Here's my explanation of the problem: since mx clobbers its own namespace, depending on how or when you import it you might get different objects. Here's how this is affecting t.p.m.

TRUNK:

>>> from twisted.python.modules import getModule
>>> print list(getModule('mx.DateTime.mxDateTime').walkModules())
[PythonModule<'mx.DateTime.mxDateTime'>, ... BIG LIST OF MODULES]
>>> import mx.DateTime.mxDateTime
>>> print list(getModule('mx.DateTime.mxDateTime').walkModules())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "twisted/python/modules.py", line 173, in walkModules
    for package in self.iterModules():
  File "twisted/python/modules.py", line 125, in iterModules
    for placeToLook in self._packagePaths():
  File "twisted/python/modules.py", line 418, in _packagePaths
    for fn in self.load().__path__:
AttributeError: 'module' object has no attribute '__path__'

Your branch:

>>> from twisted.python.modules import getModule
>>> print list(getModule('mx.DateTime.mxDateTime').walkModules())
[PythonModule<'mx.DateTime.mxDateTime'>, ...SAME BIG LIST OF MODULES]
>>> import mx.DateTime.mxDateTime
>>> print list(getModule('mx.DateTime.mxDateTime').walkModules())
[PythonModule<'mx.DateTime.mxDateTime'>]

I think this is acceptable, since walkModules will reflect what is importable at the time that it's called.
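One way to get the behaviour in the second transcript is to treat a missing __path__ as "nothing further to walk" instead of letting the AttributeError escape. The sketch below is Python 3 and is not the code from modules-path-2643; package_paths is a hypothetical helper, and the modules are built by hand with types.ModuleType so that nothing needs to be installed.

```python
import types


def package_paths(loaded):
    """Return the search path of a loaded module, or [] if it has none.

    Hypothetical helper: a module that lost (or never had) __path__ is
    treated as having no submodules to scan.
    """
    return list(getattr(loaded, "__path__", None) or [])


# A well-behaved package: the loaded object carries a __path__.
# The directory name is made up for the example.
real_package = types.ModuleType("real_package")
real_package.__path__ = ["/hypothetical/site-packages/real_package"]

# A clobbered package: a plain module object sits under the package's name.
clobbered = types.ModuleType("clobbered")

print(package_paths(real_package))
print(package_paths(clobbered))
```

With this approach a walk over the clobbered name yields only the module itself, which matches the single-element list in the branch transcript.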

+1

comment:4 Changed 15 years ago by therve

Resolution: fixed
Status: changed from new to closed

(In [21452]) Merge modules-path-2643

Author: therve Reviewer: radix Fixes #2643

Manage packages which delete their path attribute in twisted.python.modules.walkModules and twisted.python.modules.iterModules.

comment:5 Changed 15 years ago by therve

(In [21471]) Revert r21452: regression where tests are run in another order

Refs #2643

comment:6 Changed 15 years ago by therve

Resolution: fixed removed
Status: changed from closed to reopened

This was a stupid error in the test that was added, fixed in [21470].

comment:7 Changed 15 years ago by therve

Keywords: review added
Owner: changed from therve to radix
Status: changed from reopened to new

comment:8 Changed 15 years ago by radix

Owner: changed from radix to therve

Sorry for missing that the first time around. Yes, hooray global state. +1.

comment:9 Changed 15 years ago by radix

Keywords: review removed

comment:10 Changed 15 years ago by therve

Resolution: fixed
Status: changed from new to closed

(In [21472]) Merge modules-path-2643

Author: therve Reviewer: radix Fixes #2643

Manage packages which delete their path attribute in twisted.python.modules.walkModules and twisted.python.modules.iterModules.
