gh-80406: Finalise subinterpreters in Py_FinalizeEx() #17575

Open
wants to merge 21 commits into base: main
21 commits
23af5f5
Add test suggested by ncoghlan
LewisGaul Nov 21, 2019
433663c
Finalise sub-interpreters in Py_FinalizeEx()
LewisGaul Dec 11, 2019
48e1cfc
Improve test name
LewisGaul Dec 13, 2019
0400634
Switch back to main threadstate in test_audit_subinterpreter before c…
LewisGaul Dec 13, 2019
b79649c
📜🤖 Added by blurb_it.
blurb-it[bot] Dec 14, 2019
8b1e7d9
Markups including: switch from 'finalizing' flag to 'allow_new', add …
LewisGaul Jan 21, 2020
fd6073a
Merge branch 'finalise-subinterps' of github.com:LewisGaul/cpython in…
LewisGaul Jan 21, 2020
4bbd58f
Merge branch 'master' into finalise-subinterps
LewisGaul Oct 20, 2020
1095e66
Use '_' for unused variable in test_embed.py
LewisGaul Oct 20, 2020
675285d
Fix struct position of 'allow_new' flag
LewisGaul Oct 22, 2020
8e21788
Add handling for unsupported case of calling Py_Finalize() from a sub…
LewisGaul Oct 22, 2020
606c068
Emit resource warning when calling Py_Finalize() with unfinalized sub…
LewisGaul Oct 22, 2020
e0789b0
Update Py_FinalizeEx() docs
LewisGaul Oct 22, 2020
dda99ce
Update test for resource warning when implicitly finalizing subinterp…
LewisGaul Oct 23, 2020
847e8d2
Tidy up test_finalize_subinterps() testcase
LewisGaul Oct 23, 2020
a2fb0fc
Add testcase for calling Py_Finalize() from a subinterpreter
LewisGaul Oct 23, 2020
d234528
Tweak subinterpreters still running ResourceWarning handling
LewisGaul Nov 23, 2020
46a8619
Make calling PyFinalizeEx() from a subinterpreter a Py_FatalError
LewisGaul Nov 23, 2020
c89c0e5
Acquire interpreters mutex before setting allow_new=0 in PyFinalizeEx()
LewisGaul Nov 23, 2020
c285f52
Merge remote-tracking branch 'upstream/master' into finalise-subinterps
LewisGaul Nov 23, 2020
95cbfd4
Add back in the 'interp' variable to PyFinalizeEx() to fix the build
LewisGaul Nov 23, 2020
6 changes: 4 additions & 2 deletions Doc/c-api/init.rst
@@ -277,7 +277,8 @@ Initializing and finalizing the interpreter
Undo all initializations made by :c:func:`Py_Initialize` and subsequent use of
Python/C API functions, and destroy all sub-interpreters (see
:c:func:`Py_NewInterpreter` below) that were created and not yet destroyed since
-the last call to :c:func:`Py_Initialize`. Ideally, this frees all memory
+the last call to :c:func:`Py_Initialize`. A resource warning is emitted if
+there were remaining sub-interpreters. Ideally, this frees all memory
allocated by the Python interpreter. This is a no-op when called for a second
time (without calling :c:func:`Py_Initialize` again first). Normally the
return value is ``0``. If there were errors during finalization
@@ -300,7 +301,8 @@ Initializing and finalizing the interpreter
freed. Some memory allocated by extension modules may not be freed. Some
extensions may not work properly if their initialization routine is called more
than once; this can happen if an application calls :c:func:`Py_Initialize` and
-:c:func:`Py_FinalizeEx` more than once.
+:c:func:`Py_FinalizeEx` more than once. Must be called from the main
+interpreter.

.. audit-event:: cpython._PySys_ClearAuditHooks "" c.Py_FinalizeEx

1 change: 1 addition & 0 deletions Include/internal/pycore_runtime.h
@@ -84,6 +84,7 @@ typedef struct pyruntimestate {
If that becomes a problem later then we can adjust, e.g. by
using a Python int. */
int64_t next_id;
int allow_new;
} interpreters;
// XXX Remove this field once we have a tp_* slot.
struct _xidregistry {
21 changes: 21 additions & 0 deletions Lib/test/test_embed.py
@@ -194,6 +194,27 @@ def test_subinterps_distinct_state(self):
self.assertNotEqual(sub.tstate, main.tstate)
self.assertNotEqual(sub.modules, main.modules)

def test_finalize_subinterps(self):
"""
bpo-36225: Subinterpreters should implicitly be torn down by
Py_Finalize().
"""
_, err = self.run_embedded_interpreter("test_finalize_subinterps")
if support.verbose > 1:
print()
print(err)
self.assertIn("ResourceWarning: extra 2 interpreters", err)

def test_finalize_from_subinterp(self):
"""
bpo-38865: Py_Finalize() should not be called from a subinterpreter.
"""
_, err = self.run_embedded_interpreter("test_finalize_from_subinterp",
returncode=-6)
self.assertIn(
"Fatal Python error: Py_FinalizeEx: must be called from the main interpreter",
err)

def test_forced_io_encoding(self):
# Checks forced configuration of embedded interpreter IO streams
env = dict(os.environ, PYTHONIOENCODING="utf-8:surrogateescape")
@@ -0,0 +1 @@
:c:func:`Py_FinalizeEx` now implicitly cleans up subinterpreters, as the C API documentation suggests.
62 changes: 61 additions & 1 deletion Programs/_testembed.c
@@ -16,7 +16,7 @@
/*********************************************************
* Embedded interpreter tests that need a custom exe
*
- * Executed via 'EmbeddingTests' in Lib/test/test_capi.py
+ * Executed via 'EmbeddingTests' in Lib/test/test_embed.py
*********************************************************/

/* Use path starting with "./" avoids a search along the PATH */
@@ -83,6 +83,60 @@ static int test_repeated_init_and_subinterpreters(void)
return 0;
}

/* bpo-36225: Implicitly tear down subinterpreters with Py_Finalize() */
static int test_finalize_subinterps(void)
{
PyThreadState *mainstate;
PyThreadState *interp_tstate;
PyGILState_STATE gilstate;
int i;

_testembed_Py_Initialize();
mainstate = PyThreadState_Get();

PyEval_ReleaseThread(mainstate);

gilstate = PyGILState_Ensure();
print_subinterp();
PyThreadState_Swap(NULL);

// Create 3 subinterpreters and destroy the last one.
for (i=0; i<3; i++) {
interp_tstate = Py_NewInterpreter();
print_subinterp();
}
PyThreadState_Swap(interp_tstate);
Py_EndInterpreter(interp_tstate);

// Switch back to the main interpreter and finalize the runtime.
PyThreadState_Swap(mainstate);
print_subinterp();
PyGILState_Release(gilstate);

PyEval_RestoreThread(mainstate);
Py_Finalize();

return 0;
}

/* bpo-38865: Py_Finalize() should not be called from a subinterpreter */
static int test_finalize_from_subinterp(void)
{
PyThreadState *subinterp_tstate;
int rc;

_testembed_Py_Initialize();
PyGILState_Ensure();
PyThreadState_Swap(NULL);

subinterp_tstate = Py_NewInterpreter();
PyThreadState_Swap(subinterp_tstate);

rc = Py_FinalizeEx();

return rc;
}

/*****************************************************
* Test forcing a particular IO encoding
*****************************************************/
@@ -1195,10 +1249,14 @@ static int test_audit_subinterpreter(void)
PySys_AddAuditHook(_audit_subinterpreter_hook, NULL);
_testembed_Py_Initialize();

PyThreadState *mainstate = PyThreadState_Get();
Member

You may want to double-check with @zooba on his intention here. It's pretty important to make sure that the auditing functionality works as expected.

Contributor Author

Hi @zooba, as a consequence of my changes here, test_audit_subinterpreter() in _testembed.c started failing.

The change I'm making is to make Py_Finalize() implicitly clean up subinterpreters.

In the test, multiple subinterpreters are created, and then Py_Finalize() is called from the last-created subinterpreter. It seems there's currently an issue with calling Py_Finalize() from a subinterpreter (see bpo-37776), which caused this test to fail when getting Py_Finalize() to clean up subinterpreters.

The test passes if Py_Finalize() is instead called from the main interpreter tstate - which is the change I've made to the test. Just wanting to check whether that's taking anything away from what's intentionally being checked by this test?

Contributor Author
@LewisGaul, Oct 20, 2020

See #19063 from @vstinner which was not merged, but also proposed to change the logic of this testcase. It seems like this testcase is doing something that is not currently working, and according to bpo-38865#msg357331 may not be supported in general?

Contributor Author

I also confirmed that this test still fails on my branch without this change.


Py_NewInterpreter();
Py_NewInterpreter();
Py_NewInterpreter();

// Currently unable to call Py_Finalize from subinterpreter thread, see bpo-37776.
PyThreadState_Swap(mainstate);
Py_Finalize();

switch (_audit_subinterpreter_interpreter_count) {
@@ -1707,6 +1765,8 @@ struct TestCase
static struct TestCase TestCases[] = {
{"test_forced_io_encoding", test_forced_io_encoding},
{"test_repeated_init_and_subinterpreters", test_repeated_init_and_subinterpreters},
{"test_finalize_subinterps", test_finalize_subinterps},
{"test_finalize_from_subinterp", test_finalize_from_subinterp},
{"test_pre_initialization_api", test_pre_initialization_api},
{"test_pre_initialization_sys_options", test_pre_initialization_sys_options},
{"test_bpo20891", test_bpo20891},
51 changes: 46 additions & 5 deletions Python/pylifecycle.c
@@ -76,6 +76,7 @@ _PyRuntime_Initialize(void)
return _PyStatus_OK();
}
runtime_initialized = 1;
_PyRuntime.interpreters.allow_new = 0;

return _PyRuntimeState_Init(&_PyRuntime);
}
@@ -1015,6 +1016,7 @@ init_interp_main(PyThreadState *tstate)
*/
if (is_main_interp) {
interp->runtime->initialized = 1;
interp->runtime->interpreters.allow_new = 1;
}
return _PyStatus_OK();
}
@@ -1086,6 +1088,7 @@ init_interp_main(PyThreadState *tstate)
}

interp->runtime->initialized = 1;
interp->runtime->interpreters.allow_new = 1;
}

if (config->site_import) {
@@ -1646,6 +1649,42 @@ Py_FinalizeEx(void)

/* Get current thread state and interpreter pointer */
PyThreadState *tstate = _PyRuntimeState_GetThreadState(runtime);
PyInterpreterState *interp = tstate->interp;

/* Check we're running in the main interpreter (not yet supported to call
* from any interpreter).
*/
if (interp != PyInterpreterState_Main()) {
Py_FatalError("must be called from the main interpreter\n");
}

// Finalize sub-interpreters.
PyThread_acquire_lock(runtime->interpreters.mutex, WAIT_LOCK);
runtime->interpreters.allow_new = 0;
Member

It may be safer to acquire _PyRuntime.interpreters.mutex before setting this variable. It may be better to move this code into pystate.c, since that file controls the list of interpreters.

Contributor Author
@LewisGaul, Nov 23, 2020

I presume you're suggesting refactoring the 'finalize subinterpreters' logic? It would seem to me a function that does 'finalizing' belongs better in pylifecycle than pystate? I believe this refactoring was referred to by Eric above, where he says this can be addressed separately.

I've added in an acquisition of the lock here for now.
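
For readers following the locking discussion, here is a minimal Python sketch of the idea, not CPython code. `RuntimeModel`, its attributes, and the string names are hypothetical stand-ins for `_PyRuntime.interpreters` and its `mutex`, `allow_new`, and linked list. It shows the flag being flipped under the same lock that guards creation, and the teardown loop walking a snapshot because entries are removed while iterating (the C loop saves `next_interp` for the same reason).

```python
import threading


class RuntimeModel:
    """Toy stand-in for the runtime's interpreter bookkeeping (hypothetical)."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.allow_new = True
        self.interpreters = ["main"]

    def new_interpreter(self, name):
        # Check the flag and update the list under one lock, so a creator
        # cannot slip in between the check and the append.
        with self.mutex:
            if not self.allow_new:
                raise RuntimeError("new interpreters cannot currently be created")
            self.interpreters.append(name)

    def finalize(self):
        # Flip the flag while holding the lock, then release it before teardown.
        with self.mutex:
            self.allow_new = False
        destroyed = 0
        # Iterate over a snapshot because the list is mutated inside the loop.
        for name in list(self.interpreters):
            if name != "main":
                self.interpreters.remove(name)
                destroyed += 1
        return destroyed


runtime = RuntimeModel()
runtime.new_interpreter("sub1")
runtime.new_interpreter("sub2")
destroyed = runtime.finalize()

try:
    runtime.new_interpreter("late")
    late_creation_rejected = False
except RuntimeError:
    late_creation_rejected = True
```

In this model a creation attempt that arrives after `finalize()` has taken the lock is always rejected, which is the property the suggested mutex acquisition is meant to provide.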

PyThread_release_lock(runtime->interpreters.mutex);
PyInterpreterState *curr_interp = PyInterpreterState_Head();
PyInterpreterState *next_interp;
int64_t num_destroyed = 0;
while (curr_interp != NULL) {
next_interp = PyInterpreterState_Next(curr_interp);
if (curr_interp != interp) {
PyThreadState_Swap(curr_interp->tstate_head);
Py_EndInterpreter(curr_interp->tstate_head);
num_destroyed++;
}
curr_interp = next_interp;
}
PyThreadState_Swap(tstate);

if (num_destroyed > 0) {
/* Sub-interpreters were still running, but should have been finalized
* before finalizing the runtime.
*/
if (PyErr_ResourceWarning(NULL, 1,
"extra %zd interpreters", (Py_ssize_t)num_destroyed)) {
Contributor Author

I'm not sure why, but for some reason this warning isn't being output in the tests on Windows only. Anyone have any ideas?
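
This is not a diagnosis of the Windows behaviour, only one thing that may be worth ruling out: whether a `ResourceWarning` reaches stderr at all depends on the warning filters in effect. The sketch below uses a hypothetical helper, `resource_warning_stderr`, to emit the same message text from pure Python under two explicit `-W` filters and capture what the child process writes to stderr.

```python
import subprocess
import sys


def resource_warning_stderr(filter_action):
    """Run a child Python that emits a ResourceWarning; return its stderr."""
    code = (
        "import warnings\n"
        "warnings.warn('extra 2 interpreters', ResourceWarning)\n"
    )
    proc = subprocess.run(
        [sys.executable, "-W", f"{filter_action}::ResourceWarning", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.stderr


shown = resource_warning_stderr("always")   # warning line is written to stderr
hidden = resource_warning_stderr("ignore")  # warning is filtered out
```

With the `always` filter the captured text contains the substring the new test asserts on (`ResourceWarning: extra 2 interpreters`); with `ignore` it does not.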

_PyErr_WriteUnraisableMsg("in PyFinalizeEx", NULL);
}
}

// Wrap up existing "threading"-module-created, non-daemon threads.
wait_for_thread_shutdown(tstate);
@@ -1668,13 +1707,13 @@ Py_FinalizeEx(void)
/* Copy the core config, PyInterpreterState_Delete() free
the core config memory */
#ifdef Py_REF_DEBUG
-int show_ref_count = tstate->interp->config.show_ref_count;
+int show_ref_count = interp->config.show_ref_count;
#endif
#ifdef Py_TRACE_REFS
-int dump_refs = tstate->interp->config.dump_refs;
+int dump_refs = interp->config.dump_refs;
#endif
#ifdef WITH_PYMALLOC
-int malloc_stats = tstate->interp->config.malloc_stats;
+int malloc_stats = interp->config.malloc_stats;
#endif

/* Remaining daemon threads will automatically exit
@@ -1833,8 +1872,10 @@ new_interpreter(PyThreadState **tstate_p, int isolated_subinterpreter)
}
_PyRuntimeState *runtime = &_PyRuntime;

-if (!runtime->initialized) {

I'd not remove this condition. initialized is separate from allow_new, so having this check here still makes sense.

Member

Sorry for any confusion, @soltysh. _PyRuntimeState.interpreters.allow_new was added in this PR to solve the problem of interpreters being created when they shouldn't be (e.g. during runtime finalization). You could say the flag specifically means "new_interpreter() can be called currently". 😄 So the change here is correct.


@ericsnowcurrently so runtime is always a single object (like an uber-object) and then you can create multiple interpreters. Right, but that still requires the runtime to be initialized. Even though the situation should not happen, because I'd assume the first invocation of python would initialize the runtime, it should not hurt having this here. Unless my thinking is wrong here.

Contributor Author

I suppose your suggestion @soltysh would be to have two separate checks on runtime->initialized and runtime->interpreters.allow_new?

It's a while ago now, but I think my reasoning here was that this 'allow_new' flag encapsulates all information about whether a new interpreter can be created, so there should be no need to check things like whether the runtime is initialised (since 'allow_new' is only set to true when the runtime is initialised). Does that sound reasonable?

I could be persuaded to change this if there are better suggestions :)
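
The argument above can be checked with a small Python model of the two flags. This is a sketch under stated assumptions, not the runtime itself: `RuntimeFlags` and its method names are hypothetical, and the phases mirror only the assignments visible in this diff (`allow_new = 0` in `_PyRuntime_Initialize()`, `allow_new = 1` alongside `initialized = 1` in `init_interp_main()`, and `allow_new = 0` at the start of `Py_FinalizeEx()`).

```python
from dataclasses import dataclass


@dataclass
class RuntimeFlags:
    """Hypothetical model of runtime->initialized and interpreters.allow_new."""

    initialized: bool = False
    allow_new: bool = False  # starts cleared, as in _PyRuntime_Initialize()

    def init_main_interpreter(self):
        # Both flags are set together once the main interpreter is ready.
        self.initialized = True
        self.allow_new = True

    def begin_finalize(self):
        # Finalization clears allow_new before tearing anything down.
        self.allow_new = False


flags = RuntimeFlags()
phases = [("before init", flags.initialized, flags.allow_new)]

flags.init_main_interpreter()
phases.append(("after init", flags.initialized, flags.allow_new))

flags.begin_finalize()
phases.append(("finalizing", flags.initialized, flags.allow_new))

# allow_new is never true while initialized is false in this model.
allow_new_implies_initialized = all(
    initialized or not allow_new for _, initialized, allow_new in phases
)
```

In this model `allow_new` is only ever true while `initialized` is true, so a check on `allow_new` alone rejects every case the old `initialized` check rejected, plus the finalizing phase. Whether to keep both checks for defensiveness remains the judgment call discussed in this thread.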

-return _PyStatus_ERR("Py_Initialize must be called first");
+if (!runtime->interpreters.allow_new) {
+return _PyStatus_ERR(
+"New interpreters cannot currently be created - Py_Initialize must "
+"be called first, and Py_Finalize must not have been called");
}

/* Issue #10915, #15751: The GIL API doesn't work with multiple