Isolate Stdlib Extension Modules #103092

Open
3 of 26 tasks
ericsnowcurrently opened this issue Mar 28, 2023 · 19 comments
Labels
3.12 new features, bugs and security fixes extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Member

ericsnowcurrently commented Mar 28, 2023

See PEP 687.

Currently most stdlib extension modules have been ported to multi-phase init. There are still a number of them to be ported, almost entirely non-builtin modules. Also, some that have already been ported still have global state that needs to be fixed.

(This is part of the effort to finish isolating multiple interpreters from each other. See gh-100227.)

High-Level Info

How to isolate modules: https://docs.python.org/3/howto/isolating-extensions.html (AKA PEP 630).
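
For readers new to the terminology, the difference between shared global state and isolated per-module state can be modeled in plain C. This is only a sketch under stated assumptions: it does not include Python.h, and `module_state`, `legacy_bump`, and `isolated_bump` are hypothetical names, not anything from the stdlib. In a real extension the struct would be sized via `PyModuleDef.m_size` and retrieved with `PyModule_GetState()`.

```c
#include <stdlib.h>

/* Before: one counter shared by every user of the module in the process. */
static long global_call_count = 0;

long legacy_bump(void)
{
    return ++global_call_count;
}

/* After: each module instance owns its own state (hypothetical struct). */
typedef struct {
    long call_count;
} module_state;

/* Allocate zero-initialized state; returns NULL on allocation failure. */
module_state *module_state_new(void)
{
    return calloc(1, sizeof(module_state));
}

/* Only touches the state it is handed, never a process-wide global. */
long isolated_bump(module_state *st)
{
    return ++st->call_count;
}

void module_state_free(module_state *st)
{
    free(st);
}
```

Two states created with `module_state_new()` count independently, while every caller of `legacy_bump()` shares one counter, which is the kind of cross-interpreter sharing this issue is trying to remove.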

The full list of modules that need porting can be found with: ...

The full list of remaining (unsupported) global variables is:

A full analysis of the modules may be found at the bottom of this post.

(other info)

Previous Work

Related Links

TODO

Here is the list of modules that need attention, in a rough, best-effort priority order. Additional details (e.g. whether there is an issue and/or PR) are found in the analysis table at the bottom.

  • builtins (high priority)
    • port _io
    • isolate _collections
    • port _tracemalloc
    • isolate faulthandler
  • essential (higher priority)
    • port _socket
    • port readline
    • isolate _ssl
    • port _pickle
    • port _datetime
    • isolate _asyncio
    • port _decimal
    • isolate array
    • port _ctypes
    • port _tkinter
    • isolate _multibytecodec
    • port _curses
    • isolate _curses_panel
    • isolate _elementtree
    • isolate pyexpat
    • port winreg (Windows)
    • port msvcrt (Windows)
  • non-essential (lower priority)
    • port winsound (Windows)
    • isolate _lsprof
  • deprecated (lower priority) (see PEP 594)
    • port _msi (Windows)
    • port ossaudiodev
    • isolate nis

The above does not include test modules. They don't need to be ported/isolated (except for a few which already have been).


Modules Analysis

module builtin Windows PEP 594 issue PR ported # static types # other global objects # other globals
_asyncio yes 2
_collections X (???) (branch) yes 7
_ctypes **NO** 37 6 4
_curses **NO** 1 2 4
_curses_panel yes 1
_datetime gh-71587 gh-102995 **NO** 7 10 1
_decimal **NO** 4 10 6
_elementtree yes 1
_io X gh-101819 gh-101520 **NO** 5
_lsprof yes 2
_msi Y X **NO** ??? ??? ???
_multibytecodec yes 23
_pickle (???) (yes) **NO** 5
_socket **NO** 1 2 3
_ssl (???) (branch) yes 1
_tkinter **NO** 8 9
_tracemalloc X gh-101520 **NO** 6 7
array yes 1
faulthandler X gh-101509 yes 3+ ~22+
msvcrt Y **NO** ??? ??? ???
nis X yes 1
ossaudiodev X **NO** 2 1
pyexpat yes 1
readline **NO** 9
winreg Y **NO** ??? ??? ???
winsound Y **NO** ??? ??? ???
test/example modules

These can be ported/isolated but don't have to be. They are the lowest priority.

module issue PR ported # static types # other global objects # other globals
xxmodule 3 1
xxsubtype 2
xxlimited_35 2
...
...

Linked PRs

@ericsnowcurrently ericsnowcurrently added type-feature A feature request or enhancement extension-modules C modules in the Modules dir 3.12 new features, bugs and security fixes labels Mar 28, 2023
@ericsnowcurrently
Member Author

Ideally we'll get most of this done for 3.12. FWIW, there isn't enough time if it's just me, so consider this a call out to whoever can help. 😄 Feel free to adjust the TODO list and (hidden) "analysis" table above. Also feel free to reach out to any others that might be interested in pitching in here. Thanks!!!

@erlend-aasland, @corona10, @kumaraditya303, etc.

@ericsnowcurrently
Member Author

@terryjreedy (for _tkinter)

@erlend-aasland
Contributor

I've got WIP branches for datetime, io, pickle, collections, and ssl. The first three are up as draft PRs.

(FYI, I have quite a bit of CPython time scheduled for next week.)

@ericsnowcurrently
Member Author

@markshannon

@erlend-aasland
Contributor

Regarding the freelists in _asyncio, Kumar and Guido had some thoughts over at #91375 (comment).

@erlend-aasland
Contributor

Elementtree was isolated/ported in gh-92123.

@erlend-aasland
Contributor

erlend-aasland commented Mar 28, 2023

_lsprof is also done, AFAICS. ... just noticed rotatingtree.c; sorry 'bout the noise.

@ericsnowcurrently
Member Author

Thanks for the updates!

@terryjreedy
Member

@serhiy-storchaka for _tkinter. I have nothing to do with it.

@erlend-aasland
Contributor

AFAICS, array should be fine; array_reconstructor is in the module state already.

@corona10
Member

I will try to find which modules I can support for this issue :)

@erlend-aasland
Contributor

I will try to find which modules I can support for this issue :)

Feel free to pick up any of my draft PRs! :)

@CharlieZhao95
Contributor

I tried submitting some PRs (Port _datetime module, Convert _pickle module to use heap types) for multi-phase initialization last year (delayed because PEP 687 had not been accepted at that time).
Maybe now I can start something new! Can I try working on the _decimal module, if no one has claimed it yet?

@shihai1991
Member

I tried submitting some PRs (Port _datetime module, Convert _pickle module to use heap types) for multi-phase initialization last year (delayed because PEP 687 had not been accepted at that time).

Looks like you can reopen your PRs :)

@erlend-aasland
Contributor

Maybe now I can start something new! Can I try working on the _decimal module, if no one has claimed it yet?

Sure you can! I started doing some work on _decimal some weeks ago, but it is very far from complete; feel free to start with that branch :) https://github.com/erlend-aasland/cpython/tree/isolate-decimal

@kumaraditya303
Contributor

I volunteer to review the PRs as I have been doing, and leave _asyncio for me to handle.

kumaraditya303 added a commit that referenced this issue Apr 4, 2023
Co-authored-by: Mohamed Koubaa <koubaa.m@gmail.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
aisk added a commit to aisk/cpython that referenced this issue Apr 4, 2023
@ericsnowcurrently
Member Author

FYI, I asked @Yhg1s (as 3.12 release manager) for his disposition on what changes are acceptable after beta 1, since there's certainly a chance we won't be able to finish with the above list in time. He gave me the following guidance related to the work here on extension modules:

  • no changes in module semantics after beta 1
  • no non-internal C-API added after beta 1
  • internal refactoring without semantic changes is okay after beta 1 but should be wrapped up ASAP so it all has time to bake
  • as usual, bug fixes are completely okay after beta 1

Feel free to clarify/correct my summary, Thomas. 😄

With the above in mind, I'd like to be sure we have the proper priority for the modules that are left to port/isolate. To that end, here are some questions:

  • which modules are the most likely to require any change in semantics (regardless of how small)?
  • is there any aspect of porting modules to multi-phase init that changes module semantics? (I'm not aware of any.)
  • are there any modules that will need any new non-internal API? (I'm guessing there aren't.)

Again, ideally we will be able to wrap up this work in time for beta 1 (May 8).

@erlend-aasland
Contributor

For readline, I'm wondering if we should use a similar strategy as with signal and faulthandler. Thoughts?

gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this issue Apr 8, 2023
Co-authored-by: Mohamed Koubaa <koubaa.m@gmail.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
@aisk
Contributor

aisk commented Apr 9, 2023

I took a look at _ssl. The only global state is a lock that prevents concurrent writes to the keylog file, and according to its comment:

/* Allocate a static lock to synchronize writes to keylog file.
* The lock is neither released on exit nor on fork(). The lock is
* also shared between all SSLContexts although contexts may write to
* their own files. IMHO that's good enough for a non-performance
* critical debug helper.
*/
So the lock should be per process. Making it per interpreter would lead to concurrency issues.

Should we just leave it as it is?
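
To make the trade-off concrete, here is a minimal sketch of the two lock scopes using plain pthreads rather than the CPython lock API. `fake_interp` and the helper functions are hypothetical, and both "writers" run on one thread, with `pthread_mutex_trylock` standing in for the second writer so the result is deterministic. It assumes the default mutex type is non-recursive, as it is on common platforms.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Process-wide lock, analogous in scope to the static keylog lock. */
static pthread_mutex_t process_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for an interpreter that owns its own lock. */
typedef struct {
    pthread_mutex_t own_lock;
} fake_interp;

fake_interp *fake_interp_new(void)
{
    fake_interp *it = malloc(sizeof(*it));
    if (it != NULL && pthread_mutex_init(&it->own_lock, NULL) != 0) {
        free(it);
        return NULL;
    }
    return it;
}

void fake_interp_free(fake_interp *it)
{
    pthread_mutex_destroy(&it->own_lock);
    free(it);
}

/* Returns 1 if a second writer is kept out while the first writer
 * holds the process-wide lock. */
int shared_lock_excludes(void)
{
    pthread_mutex_lock(&process_lock);              /* writer A */
    int rc = pthread_mutex_trylock(&process_lock);  /* writer B */
    if (rc == 0) {
        pthread_mutex_unlock(&process_lock);
    }
    pthread_mutex_unlock(&process_lock);
    return rc == EBUSY;
}

/* Returns 1 if writer B (using b's lock) is kept out while writer A
 * holds a's lock. With separate locks this is not the case. */
int per_interp_locks_exclude(fake_interp *a, fake_interp *b)
{
    pthread_mutex_lock(&a->own_lock);               /* writer A */
    int rc = pthread_mutex_trylock(&b->own_lock);   /* writer B */
    if (rc == 0) {
        pthread_mutex_unlock(&b->own_lock);
    }
    pthread_mutex_unlock(&a->own_lock);
    return rc == EBUSY;
}
```

With the shared lock the second writer is refused, but with one lock per interpreter both writers get through at once, so two interpreters logging to the same keylog file would no longer be serialized.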

8 participants