Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[subinterpreters] Static types incorrectly share some objects between interpreters #94673

Closed
ericsnowcurrently opened this issue Jul 7, 2022 · 13 comments
Assignees
Labels
3.12 new features, bugs and security fixes topic-subinterpreters type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Jul 7, 2022

While static types (PyTypeObject values) don't themselves ever change (once PyType_Ready() has run), they do hold mutable data. This means they cannot be safely shared between multiple interpreters without a common GIL (and without state leaking between).

Mutable data:

  • the object header (e.g. refcount), as with all objects
  • otherwise immutable objects:
    • ob_type (__class__)
    • tp_base (__base__) - set by PyType_Ready() if not set
    • tp_bases (__bases__) - always set by PyType_Ready()
    • tp_mro (__mro__) - always set by PyType_Ready()
    • tp_dict (__dict__) - set by PyType_Ready() if not set
  • mutable containers:
    • tp_subclasses (__subclasses__)
    • tp_weaklist

(See https://docs.python.org/3/c-api/typeobj.html#tp-slots.)
(Note that tp_cache is no longer used.)

For the object header, if PEP 683 (immortal objects) is accepted then we can make static types immortal.

For the otherwise immutable objects, we can make sure each is immortal and then we're good. Even tp_dict is fine since it gets hidden behind types.MappingProxyType. We'd also need to either make sure each contained item is immortal.

For tp_subclasses we will need a per-interpreter copy, and do the proper lookup in the __subclasses__ getter. The cache could be stored on PyInterpreterState or even as a dict in tp_subclasses.

For tp_weaklist it's a similar story as for tp_subclasses. Note that tp_weaklist isn't very important for static types since they are never deallocated.

(The above is also discussed in PEP 684.)

CC @kumaraditya303

Linked PRs

@ericsnowcurrently ericsnowcurrently added the type-feature A feature request or enhancement label Jul 7, 2022
@ericsnowcurrently ericsnowcurrently self-assigned this Jul 7, 2022
@ericsnowcurrently
Copy link
Member Author

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Jul 10, 2022

The current _PyInterpreterState_GET (which would be used here) static inline function is quite inefficient as there is a lot indirection and involves atomic reads to get the "current" thread and then get the interpreter state. To improve this we can use thread local 1 variables on supported platform such as Linux, Windows and macOS. On Linux with glibc, the thread local implementation uses fs segment register and on x86-64 it is very efficient compared to the current implementation.
This isn't required here but would be nice to investigate separately.

Footnotes

  1. https://en.cppreference.com/w/c/thread/thread_local

@encukou
Copy link
Member

encukou commented Jul 11, 2022

This means they cannot be safely shared between multiple interpreters without a common GIL

Python currently has a common GIL.
Shouldn't this be part of a PEP, rather than an issue?

@ericsnowcurrently
Copy link
Member Author

I have the changes ready for this:

@ericsnowcurrently
Copy link
Member Author

Shouldn't this be part of a PEP, rather than an issue?

This is covered by PEP 684. The issue exists for implementation discussion and to anchor PRs.

@ericsnowcurrently
Copy link
Member Author

This isn't required here but would be nice to investigate separately.

This has been discussed elsewhere. Also see #29228.

FYI, for this PR (and in most cases) the performance penalty is insignificant.

ericsnowcurrently added a commit that referenced this issue Jul 25, 2022
This is the first of several precursors to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.

We do the following:

* add `_PyStaticType_InitBuiltin()`
* add `_Py_TPFLAGS_STATIC_BUILTIN`
* set it on all static builtin types in `_PyStaticType_InitBuiltin()`
* shuffle some code around to be able to use _PyStaticType_InitBuiltin()
    * rename `_PyStructSequence_InitType()` to `_PyStructSequence_InitBuiltinWithFlags()`
    * add `_PyStructSequence_InitBuiltin()`.
ericsnowcurrently added a commit that referenced this issue Jul 25, 2022
Static builtin types are finalized by calling _PyStaticType_Dealloc().  Before this change, we were skipping finalizing such a type if it still had subtypes (i.e. its tp_subclasses hadn't been cleared yet).  The problem is that types hold several heap objects, which leak if we skip the type's finalization.  This change addresses that.

For context, there's an old comment (from e9e3eab) that says the following:

   // If a type still has subtypes, it cannot be deallocated.
   // A subtype can inherit attributes and methods of its parent type,
   // and a type must no longer be used once it's deallocated.

However, it isn't clear that is actually still true.  Clearing tp_dict should mean it isn't a problem.

Furthermore, the only subtypes that might still be around come from extension modules that didn't clean them up when unloaded (i.e. extensions that do not implement multi-phase initialization, AKA PEP 489).  Those objects are already leaking, so this change doesn't change anything in that regard.  Instead, this change means more objects gets cleaned up that before.
@ericsnowcurrently ericsnowcurrently changed the title [subinterpreters] Static types cannot be safely shared between interpreters [subinterpreters] Static types incorrectly share some objects between interpreters Jul 25, 2022
ericsnowcurrently added a commit that referenced this issue Jul 26, 2022
This is the last precursor to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.

Here we add per-type storage on PyInterpreterState, but only for the static builtin types.  This involves the following:

* add PyInterpreterState.types
   * move PyInterpreterState.type_cache to it
   * add a "num_builtins_initialized" field
   * add a "builtins" field (a static array big enough for all the static builtin types)
* add _PyStaticType_GetState() to look up a static builtin type's state
* (temporarily) add PyTypeObject.tp_static_builtin_index (to hold the type's index into PyInterpreterState.types.builtins)

We will be eliminating tp_static_builtin_index in a later change.
@encukou
Copy link
Member

encukou commented Jul 27, 2022

This is covered by PEP 684. The issue exists for implementation discussion and to anchor PRs.

Please get the PEP accepted before merging anything you can't easily revert.

ericsnowcurrently added a commit that referenced this issue Jul 29, 2022
…95302)

* Store tp_weaklist on the interpreter state for static builtin types.

* Factor out _PyStaticType_GET_WEAKREFS_LISTPTR().

* Add _PyStaticType_ClearWeakRefs().

* Add a comment about how _PyStaticType_ClearWeakRefs() loops.

* Document the change.

* Update Doc/whatsnew/3.12.rst

* Fix a typo.
@ericsnowcurrently
Copy link
Member Author

gh-95302 slows things down ~1% so I'll need to address that. (See #95302 (comment).)

@ericsnowcurrently
Copy link
Member Author

ericsnowcurrently commented Aug 3, 2022

FYR, #44470 is where we dealt with leaking memory during finalization. It looks like we missed some cases there (relative to both tp_subclasses and tp_weaklist), which my PRs help to address.

@ericsnowcurrently
Copy link
Member Author

Also FTR, a per-interpreter GIL isn't the only motivator here. It's a good idea to keep a per-interpreter tp_subclasses, etc. Otherwise cls.__subclasses__ and weakref.getweakrefs(cls) will show the objects from all interpreters, which is a problem for users.

ericsnowcurrently added a commit that referenced this issue Aug 4, 2022
gh-95302 seems to have introduced a small performance regression. Here we make some minor changes to recover that lost performance.
@ericsnowcurrently
Copy link
Member Author

I'm marking this as done. If we determine we need to revisit this (e.g. for tp_dict) then we can create a new issue.

@ericsnowcurrently
Copy link
Member Author

FTR, it is hypothetically possible to modify tp_dict from Python code. See gh-88004.

iritkatriel pushed a commit to iritkatriel/cpython that referenced this issue Aug 11, 2022
pythongh-95302 seems to have introduced a small performance regression. Here we make some minor changes to recover that lost performance.
iritkatriel pushed a commit to iritkatriel/cpython that referenced this issue Aug 11, 2022
ericsnowcurrently added a commit that referenced this issue Mar 21, 2023
Aside from sys and builtins, _io is the only core builtin module that hasn't been ported to multi-phase init.  We may do so later (e.g. gh-101948), but in the meantime we must at least take care of the module's static types properly.  (This came up while working on gh-101660.)

#94673
Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this issue Mar 27, 2023
…102663)

Aside from sys and builtins, _io is the only core builtin module that hasn't been ported to multi-phase init.  We may do so later (e.g. pythongh-101948), but in the meantime we must at least take care of the module's static types properly.  (This came up while working on pythongh-101660.)

python#94673
warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
…102663)

Aside from sys and builtins, _io is the only core builtin module that hasn't been ported to multi-phase init.  We may do so later (e.g. pythongh-101948), but in the meantime we must at least take care of the module's static types properly.  (This came up while working on pythongh-101660.)

python#94673
ericsnowcurrently added a commit that referenced this issue Apr 27, 2023
There were cases where we do unnecessary work for builtin static types. This also simplifies some work necessary for a per-interpreter GIL.
ericsnowcurrently added a commit that referenced this issue Apr 28, 2023
…3961)

This change has two small parts:

1. a follow-up to gh-103940 with one case I missed
2. adding a missing return that I noticed while working on related code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 new features, bugs and security fixes topic-subinterpreters type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

4 participants