Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move _Py_RefTotal to PyInterpreterState #102304

Closed
ericsnowcurrently opened this issue Feb 27, 2023 · 18 comments
Closed

Move _Py_RefTotal to PyInterpreterState #102304

ericsnowcurrently opened this issue Feb 27, 2023 · 18 comments
Assignees
Labels
3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API topic-subinterpreters type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Feb 27, 2023

(See gh-100227.)

_Py_RefTotal holds the current global total number of refcounts. It only exists if Py_REF_DEBUG is defined (implied by Py_DEBUG). It is exposed by sys.gettotalrefcount() and set by Py_INCREF(), Py_DECREF(), etc. and _Py_NewReference().

Modications to _Py_RefTotal are currently protected by the GIL so it should be moved to PyInterpreterState. For various aspects of compatibility, it makes sense to keep the _Py_RefTotal symbol around (and correct) and keep returning the global total from sys.gettotalrefcount().

Also, _Py_RefTotal is used by stable ABI extensions only where Py_REF_DEBUG is defined (unlikely) and only where built against 3.9 or earlier. Just in case, though, we must still keep the global variable around, so any solution here must respect that.

Linked PRs

@ericsnowcurrently ericsnowcurrently added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters topic-C-API 3.12 bugs and security fixes labels Feb 27, 2023
@ericsnowcurrently
Copy link
Member Author

From #100227 (comment):

FYI, I have a plan for _Py_RefTotal that should preserve stable ABI compatibility without breaking interpreter isolation (and without relying on granular locks). Basically:

  • deprecate _Py_Reftotal but keep exporting the symbol (for stable ABI compatibility)
  • add _PyRuntimeState.object_state.last_legacy_reftotal
  • add _PyRuntimeState.object_state.reftotal
  • update everywhere to use the reftotal field (for simplicity, add #define _Py_RefTotal _PyRuntime.object_state.reftotal)

We would update _Py_GetRefTotal() to something like this:

Py_ssize_t
_Py_GetRefTotal(void)
{
    // _Py_GetLegacyRefTotal() returns the value of the actual `_Py_RefTotal` global.
    Py_ssize_t legacy = _Py_GetLegacyRefTotal();
    _PyRuntimeState.object_state.reftotal += legacy - _PyRuntimeState.object_state.last_legacy_reftotal;
    _PyRuntimeState.object_state.last_legacy_reftotal = legacy;
    return _PyRuntimeState.object_state.reftotal;
}

All this assumes that folks are not using _Py_RefTotal directly.


For a per-interpreter GIL, I see two options:

  1. add a granular lock around modifying _PyRuntimeState.object_state.reftotal, which would hurt the performance of incref/decref, but only on debug builds
  2. switch to PyInterpreterState.object_state.reftotal and aggregate the values from all interpreters in _Py_GetRefTotal()

@ericsnowcurrently ericsnowcurrently self-assigned this Feb 27, 2023
@ericsnowcurrently
Copy link
Member Author

Alternately, from #100227 (comment):

_Py_RefTotal is only defined if Py_REF_DEBUG is defined.
All updates to _Py_RefTotal can be made atomic (which will have terrible performance with multiple parallel interpreters) to preserve the API.

@kumaraditya303
Copy link
Contributor

I vote to make it atomic instead of per interpreter. I use this while debugging refleaks and making it per interp will effectively make it impossible to use. This is only used in debug builds so I am not too much concerned about performance.

ericsnowcurrently added a commit that referenced this issue Mar 8, 2023
This simplifies further changes to _Py_RefTotal (e.g. make it atomic or move it to PyInterpreterState).

#102304
@ericsnowcurrently
Copy link
Member Author

I use this while debugging refleaks and making it per interp will effectively make it impossible to use.

Do you modify _Py_RefTotal? Otherwise, would using _Py_GetRefTotal() be good enough?

carljm added a commit to carljm/cpython that referenced this issue Mar 8, 2023
* main:
  pythongh-102304: Consolidate Direct Usage of _Py_RefTotal (pythongh-102514)
  pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (in Objects/) (python#102218)
  pythongh-102507 Remove invisible pagebreak characters (python#102531)
  pythongh-102515: Remove unused imports in the `Lib/` directory (python#102516)
  Remove or update bitbucket links (pythonGH-101963)
  pythongh-101100: Fix sphinx warnings in `zipapp` and `zipfile` modules (python#102526)
  pythonGH-102397: Fix segfault from race condition in signal handling (python#102399)
  Fix style in argparse.rst (python#101733)
  Post 3.12.0a6
  fix typo in async generator code field name `ag_code` (python#102448)
  Python 3.12.0a6
@kumaraditya303
Copy link
Contributor

Do you modify _Py_RefTotal? Otherwise, would using _Py_GetRefTotal() be good enough?

I do that sometimes though that's not important. With this change is _Py_GetRefTotal functional even after runtime is finalized?

@ericsnowcurrently
Copy link
Member Author

With this change is _Py_GetRefTotal functional even after runtime is finalized?

We would definitely make sure that happens, since that's basically how it works now, but I don't think I did that in my branch. Doing so means that we'd have to track leaks during each subinterpreter finalization, which I don't think I've been doing. That's manageable though.

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Mar 14, 2023

We would definitely make sure that happens, since that's basically how it works now, but I don't think I did that in my branch. Doing so means that we'd have to track leaks during each subinterpreter finalization, which I don't think I've been doing. That's manageable though.

Have you tried making it atomic? It will preserve all existing behavior like working after runtime is finalized and would work for per interpreter GIL. This is important as otherwise we'll lose the ability to monitor refleaks across multiple runtime initializations .

@kumaraditya303
Copy link
Contributor

We would definitely make sure that happens, since that's basically how it works now, but I don't think I did that in my branch. Doing so means that we'd have to track leaks during each subinterpreter finalization, which I don't think I've been doing. That's manageable though.

I haven't looked at the implementation but I think there might be complications in this because some of the things at startup happens when GIL is not created so might need some special casing to get right.

@ericsnowcurrently
Copy link
Member Author

Have you tried making it atomic?

I started but haven't gotten back to it.

Regardless, I think it is worth having a per-interpreter reftotal for the following reasons:

  • it identifies which interpreter is leaking
  • it can help identify when an object is breaking isolation between interpreters
  • objects are interpreter-specific so it makes sense that the ref total be too
  • the obmalloc state is going to be per-interpreter, with the opportunity to report/investigate allocated blocks for each interpreter; it would be helpful for the reftotal to parallel that

otherwise we'll lose the ability to monitor refleaks across multiple runtime initializations .

What do you mean by this? Are you talking about objects that don't get freed during interpreter finalization? If so, how is multiple runtime initializations an exceptional case?

I think there might be complications in this because some of the things at startup happens when GIL is not created so might need some special casing to get right.

I'm not sure I follow. Please elaborate on how the GIL relates to tracking refleaks for each interpreter? Or are you talking about something else?

@kumaraditya303
Copy link
Contributor

What do you mean by this? Are you talking about objects that don't get freed during interpreter finalization? If so, how is multiple runtime initializations an exceptional case?

We need to make sure that the counter remains valid even after the runtime is finalized. If there are multiple runtime finalizations we need to continue using the old counter value and not initialize it to zero as we'll lose the old value. This is what I meant.

@kumaraditya303
Copy link
Contributor

I'm not sure I follow. Please elaborate on how the GIL relates to tracking refleaks for each interpreter? Or are you talking about something else?

While working on #101696 I noticed that the type cache is initialized before the gil is created which increfs None without GIL being held, I found this surprising hence I wanted to let you that you are aware of it, it is hidden in pycore_create_interpreter. This might not be relevant here but wanted to let you know just in case, from here ref total starts getting mutated.

@ericsnowcurrently
Copy link
Member Author

the type cache is initialized before the gil is created which increfs None without GIL being held

Yeah, I ran into this too, while working on something in the last week (gh-101660? gh-102343?).

@ericsnowcurrently
Copy link
Member Author

Ah, it was gh-102545 (668eb4c).

@ericsnowcurrently
Copy link
Member Author

@kumaraditya303, FYI, I've updated gh-102543 to preserve the reftotal across multiple runtime init/fini cycles.

I'm still not sure what the value of that is, though. The only case I can imagine is where you have a leak during the first runtime instance but not in the second and somehow you only check for leaks after the second. In my mind you'd check for leaks after each finalization. In fact, that's exactly what happens if with ./python -X showrefcount.

(Regardless, I certainly concede that there are possible use cases of which I'm not aware! 😄)

@ericsnowcurrently
Copy link
Member Author

@kumaraditya303, with the above in mind, do you have any objections to gh-102543 and to using _Py_GetGlobalRefTotal() (instead of using _Py_RefTotal directly)?

Note that if you're on board then move to per-interpreter (gh-102545) shouldn't be an issue.

@kumaraditya303
Copy link
Contributor

with the above in mind, do you have any objections to #102543 and to using _Py_GetGlobalRefTotal() (instead of using _Py_RefTotal directly)?

No objections to this. You can proceed with this 👍

@ericsnowcurrently
Copy link
Member Author

ericsnowcurrently commented Mar 20, 2023

Thanks! Your feedback was super helpful.

ericsnowcurrently added a commit that referenced this issue Mar 20, 2023
The essentially eliminates the global variable, with the associated benefits. This is also a precursor to isolating this bit of state to PyInterpreterState.

Folks that currently read _Py_RefTotal directly would have to start using _Py_GetGlobalRefTotal() instead.

#102304
vstinner added a commit to vstinner/cpython that referenced this issue Jun 6, 2023
…on#105345)

* Add "limited-c-api" and "stable-api" references.
* Rename "stable-abi-list" reference to "limited-api-list".
* Makefile: Document files regenerated by "make regen-limited-abi"
* Remove first empty line in generated files:

  - Lib/test/test_stable_abi_ctypes.py
  - PC/python3dll.c

(cherry picked from commit bae415a)
vstinner added a commit to vstinner/cpython that referenced this issue Jun 6, 2023
vstinner added a commit that referenced this issue Jun 6, 2023
…5345) (#105371)

* gh-102304: doc: Add links to Stable ABI and Limited C API (#105345)

* Add "limited-c-api" and "stable-api" references.
* Rename "stable-abi-list" reference to "limited-api-list".
* Makefile: Document files regenerated by "make regen-limited-abi"
* Remove first empty line in generated files:

  - Lib/test/test_stable_abi_ctypes.py
  - PC/python3dll.c

(cherry picked from commit bae415a)

* gh-102304: Fix up Simple ABI doc (GH-105351)

(cherry picked from commit 0202aa0)
vstinner added a commit to vstinner/cpython that referenced this issue Jun 6, 2023
)

When Python is built in debug mode (if the Py_REF_DEBUG macro is
defined), the Py_INCREF() and Py_DECREF() function are now always
implemented as opaque functions to avoid leaking implementation
details like the "_Py_RefTotal" variable or the
_Py_DecRefTotal_DO_NOT_USE_THIS() function.

* Remove _Py_IncRefTotal_DO_NOT_USE_THIS() and
  _Py_DecRefTotal_DO_NOT_USE_THIS() from the stable ABI.
* Remove _Py_NegativeRefcount() from limited C API.

(cherry picked from commit 92022d8)
vstinner added a commit that referenced this issue Jun 6, 2023
…105352)

gh-102304: Fix Py_INCREF() stable ABI in debug mode (#104763)

When Python is built in debug mode (if the Py_REF_DEBUG macro is
defined), the Py_INCREF() and Py_DECREF() function are now always
implemented as opaque functions to avoid leaking implementation
details like the "_Py_RefTotal" variable or the
_Py_DecRefTotal_DO_NOT_USE_THIS() function.

* Remove _Py_IncRefTotal_DO_NOT_USE_THIS() and
  _Py_DecRefTotal_DO_NOT_USE_THIS() from the stable ABI.
* Remove _Py_NegativeRefcount() from limited C API.

(cherry picked from commit 92022d8)
vstinner added a commit to vstinner/cpython that referenced this issue Jun 6, 2023
@vstinner
Copy link
Member

vstinner commented Jun 6, 2023

With PR #104763 (backported to Python 3.12), I excluded _Py_IncRefTotal_DO_NOT_USE_THIS() and _Py_DecRefTotal_DO_NOT_USE_THIS() from the stable ABI. They were used in a debug build of Python, it's no longer the case. The stable ABI now uses opaque function calls.

vstinner added a commit to vstinner/cpython that referenced this issue Jun 6, 2023
Python built in debug mode can no longer build C extensions using the
limited C API version 3.9 and older.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 and older now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
vstinner added a commit that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.
(cherry picked from commit 7ba0fd9)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
Py_INCREF() was made compatible again with Python 3.9 and older in
the limited API of Python debug mode.
vstinner added a commit that referenced this issue Jun 9, 2023
Py_INCREF() was made compatible again with Python 3.9 and older in
the limited API of Python debug mode.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.

(cherry picked from commit 7ba0fd9)
vstinner added a commit to vstinner/cpython that referenced this issue Jun 9, 2023
Py_INCREF() was made compatible again with Python 3.9 and older in
the limited API of Python debug mode.

(cherry picked from commit 58e4b69)
vstinner added a commit that referenced this issue Jun 9, 2023
* gh-102304: Fix Py_INCREF() for limited C API 3.9 (#105550)

When Python is built in debug mode (Py_REF_DEBUG macro), Py_INCREF()
and Py_DECREF() of the limited C API 3.9 (and older) now call
Py_IncRef() and Py_DecRef(), since _Py_IncRef() and _Py_DecRef() were
added to Python 3.10.

(cherry picked from commit 7ba0fd9)

* gh-102304: Remove Py_INCREF() doc change (#105552)

Py_INCREF() was made compatible again with Python 3.9 and older in
the limited API of Python debug mode.

(cherry picked from commit 58e4b69)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API topic-subinterpreters type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

3 participants