[Twisted-Python] Windows spawnProcess - Child process inherits files and sockets (POSIX does not)

Oliver Palmer oliverpalmer at opalmer.com
Wed Apr 6 21:57:18 MDT 2016


Owner of pywincffi here, I'd certainly welcome a PR or two for pywincffi
with the necessary functions/constants/etc so you don't have to manage that
code and I'd be happy to help write it too.  I think the consensus is
Twisted is going to eventually replace calls into pywin32 with calls into
pywincffi rather than implement all of that code inside of Twisted itself.
I've already started doing this in a couple of places,
twisted.python.lockfile
<https://github.com/twisted/twisted/compare/trunk...opalmer:windows-cffi>
being the one I'm actively working on because it's simpler to start with.
But the code inside of dumbwin32proc.py and _win32handleinherit.py are both
high on my list to convert too so it probably makes sense that we work on
this together if you're open to it.

Anyway as for some of your questions and points I've tried to reply to some
of them below.  I will admit, it's slanted towards making as much of this a
possible a part of pywincffi having plugged away at the 'what to do about
pywin32' and 'how to make Windows API calls more pythonic' quite a bit now
so keep that in mind.

> After all of this -- including some frustration, I confess -- I decided
to go ahead and create a cffi ABI-mode variation of my original patch,
anyway: it passes the same tests and, much like the ctypes approach, works
nicely on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python
2.7.11 32, Twisted 16.1, cffi 1.5.2.

Looking at your code, some of it could be put into pywincffi already.  It
would need more tests and some additional code so the API calls are closer
to what's already in the project (type checking, default arguments,
documentation, etc) but overall it seems like you've already done the major
work of understanding how it all fits together.  The other advantages of
putting this code into pywincffi is testing and releases are easier because
the project is using AppVeyor to test all PRs and build the wheel files for
most major Python versions including both 32 and 64 bit variants.  From
Twisted's perspective, it's just a dependency on another library.

> Just for kicks I compared the performance of the ctypes vs cffi
implementation: ...

Have you tried a comparison between out-of-line modules and those using
dlopen?  I imagine they'd end up being pretty similar in the end
performance wise but I am a little curious.  In pywincffi I started out
using dlopen but moved away from it because I needed to write some extra
code which couldn't be included from a header.  The other advantage I saw
is that you don't have to rely on the DLL being present and/or Windows
being able to locate it so you can include code which might only
be available if you have some extra library installed.

> This makes sense given that the code is mostly calling out to DLLs and,
AFAICT, cffi does the nice extra work of validating/converting types back
and forth.

It usually does handle type conversion nicely but it can also do odd things
depending on the Python version (like bytes vs. unicode vs. strings) if
you're not careful.  The other issue is user input which can sometimes
result in odd error messages at the C-level that are difficult to
understand or correct without specific knowledge of the underlying code.
In non-public APIs this is not as big of an issue because the code is task
specific but in something like pywincffi or Twisted where people can use
APIs in unexpected ways the automatic conversion/validation can be somewhat
lacking (again, something I wanted to centralize and improve in pywincffi).

> It is mostly based on SysInternals information at
http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html.
There are many other references across the web to those undocumented
options and data structures. But none of those come from formal Microsoft
documents, that I could find.

Some of the people on that site either have contacts within Microsoft or
have worked for Microsoft at one point so I usually trust what's there if
it's the only source.  The other place I often look is the ReactOS project
where they've managed to reverse engineer quite a bit of the Windows kernel
which can either hint at the info you need or validate what you already
know.

> Wrapping up: I'm really not sure how to more forward with this: not only
does pywin32 not provide the needed APIs, but also one of those APIs --
documented -- is being used in an undocumented fashion.

IMHO (again, with some bias), I think implementing the calls you need in
pywincffi is the first step.  If the calls are undocumented it would also
be a good place to do the necessary research, testing and development I
think in isolation from Twisted itself so it's clear we're going in the
right direction.  Once that's done a patch set for Twisted, which calls
into pywincffi, can be opened and tested across the supported platforms.
This makes the patch set smaller but also makes it easier to understand
what if anything the new code breaks.


Regardless, even if you don't want to go the route of putting this into
pywincffi thanks for working on this because it helps in some of the work
I'm doing too.

---Oliver



On Tue, Apr 5, 2016 at 5:11 PM, exvito here <ex.vitorino at gmail.com> wrote:

> On Thu, Mar 31, 2016 at 10:27 PM, Glyph <glyph at twistedmatrix.com> wrote:
>
>> Before submitting a patch for review, I'm looking for preliminary
>> feedback, assuming you agree that the Windows vs POSIX semantics should be
>> the same (if not, why?).
>>
>>
>> After much thought: Yes.  They should be the same.  The reason they're
>> not is largely ignorance of the relevant APIs and abstractions on Windows,
>> not any desire that they differ.  The one place they have to differ a
>> little bit is handle inheritance: we need to figure out some way to express
>> the 'childFDs' mapping in terms of both file descriptors and handles.
>>
>
> Agreed. My code does not go into 'childFDs' mapping territory, though. All
> it does is prevent all file- and socket-handles from being inherited by the
> child process -- other than the pipes used for STDIO communication.
>
> My patch calls a few Windows APIs via ctypes, however, as far as I can
>> tell, Twisted on Windows requires pywin32 and, recently, there has been
>> some discussion around dropping that dependency and moving towards
>> something based on cffi.
>>
>>
>> ctypes is dangerous and error-prone.  If you screw up the bit-width of a
>> type, or the type in a header changes on some future version, ctypes gives
>> you no way of finding out until some poor user's process segfaults, and
>> usually not at the relevant call site.  So we'd prefer not to maintain more
>> ctypes-using code.
>>
>> The APIs in pywin32 very closely mirror the underlying Windows API, so
>> for addressing this issue, please just go ahead and use pywin32 APIs;
>> porting them to a new API along with everything else should be relatively
>> straightforward.
>>
>
>> If we do move forward with that change, we will probably use
>> https://pypi.python.org/pypi/pywincffi and not move anything within
>> Twisted.
>>
>
> Agreed with your ctypes comment -- I've been hit by such faults which
> "magically" went away using cffi when coding against Windows TAPI.
>
> pywin32, unfortunatelly, does not include two Windows APIs (out of four)
> my code requires -- I just grepped the source for latest release I could
> find on SourceForge, 220.
>
> For completeness, the missing APIs are NtQuerySystemInformation [1] and
> NtQueryObject [2].
> The others are GetHandleInformation [3] and SetHandleInformation [4].
>
> What would you say the way forward is? Should I submit the patch for
>> review anyway? Is there any other work that needs to be done first that I
>> may contribute to?
>>
>>
>> Yes, just go ahead and write the patch.
>>
>
> Given that pywin32 does not provide two of the required APIs, maybe this
> issue is somewhat blocked.
>
> Adding to that is the fact that one particular API call in my code --
> NtQuerySystemInformation [1] -- is being used with what seems to be an
> undocumented option -- SystemHandleInformation (enum = 16) -- and
> returning, again, an apparently undocumented data structure --
> SYSTEM_HANDLE_INFORMATION. I downloaded and installed the available SDKs
> and WDKs (driver dev kits) from Microsoft and could not find any reference
> to those particular options or data structures.
>
> My code was created after much investigation on how to obtain the list of
> open handles for the current process.
> The gist of it is:
> - Call NtQuerySystemInformation with the SystemInformationClass arg set
> to SystemHandleInformation.
> - This returns all (!!!) of the handles in the system (no need for special
> privileges).
> - Filter those out by the current process PID and type, such that only
> files and sockets are left.
> - Use the GetHandleInformation to get the inheritance flag and clear it
> with SetHandleInformation if needed.
>
> It is mostly based on SysInternals information at
> http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html.
> There are many other references across the web to those undocumented
> options and data structures. But none of those come from formal Microsoft
> documents, that I could find.
>
> An alternative approach, which I also tried, for completeness sake, was to
> try and Get/SetHandleInformation on all possible handles -- this is
> completely unfeasible given that handles are at least 32 bit numbers.
>
> After all of this -- including some frustration, I confess -- I decided to
> go ahead and create a cffi ABI-mode variation of my original patch, anyway:
> it passes the same tests and, much like the ctypes approach, works nicely
> on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python 2.7.11
> 32, Twisted 16.1, cffi 1.5.2.
>
> Just for kicks I compared the performance of the ctypes vs cffi
> implementation:
> - The ctypes code runs in 0.014s - 0.016s.
> - The cffi code runs in 0.03s - 0.04s.
>
> This makes sense given that the code is mostly calling out to DLLs and,
> AFAICT, cffi does the nice extra work of validating/converting types back
> and forth.
>
>
> Wrapping up: I'm really not sure how to more forward with this: not only
> does pywin32 not provide the needed APIs, but also one of those APIs --
> documented -- is being used in an undocumented fashion.
>
> Even though I'd love to submit a patch, I don't think we're at that point
> yet. However, for posterity's sake and if anyone wants to take a look at
> the code, it is avalable at https://github.com/exvito/twisted in branches
> win32-fix-handle-inherit-cffi and win32-fix-handle-inherit-ctypes. They add
> two tests to twisted/test/test_process.py, one line to
> twisted/internet/_dumbwin32proc.py and one module named
> twisted/internet/_win32handleinherit.py
>
> I'd love to hear feedback or ideas on this.
> Thanks again
>
> [1]
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms724509(v=vs.85).aspx
> [2] https://msdn.microsoft.com/en-us/library/bb432383(v=vs.85).aspx
> [3]
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms724329(v=vs.85).aspx
> [4]
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms724935(v=vs.85).aspx
> --
> exvito
>
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python at twistedmatrix.com
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/twisted-python/attachments/20160406/f799f13b/attachment-0002.html>


More information about the Twisted-Python mailing list