New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segfaults when accessing module state in tp_dealloc
(itertools teedataobject clear)
#115874
Comments
Slightly different crash on current main on macOS. It fails an assertion
The crash happens during finalization. Apparently the module has already been finalized but the cc @erlend-aasland who worked on itertools module state (e.g., #101477). |
The crash goes away if I remove both of the Line 1797 in 200271c
|
That sounds really strange. The (heap) type keeps a strong reference to the module. |
Tiny reproducer: import typing, copyreg, itertools
copyreg.buggy_tee = itertools.tee(())
# Crash! Here's essentially the same thing, but also clearing every reference I can think of. import sys, gc, typing, copyreg, itertools
# Clear typing:
[f() for f in typing._cleanups]
typing.__dict__.clear()
del typing
# Clear copyreg...
copyreg.dispatch_table.clear()
copyreg._extension_registry.clear()
copyreg._inverted_registry.clear()
copyreg._extension_cache.clear()
copyreg.__dict__.clear()
# ...*except* for a cycle with itself and a reference to a tee:
copyreg.buggy_cycle = copyreg
copyreg.buggy_tee = itertools.tee(())[0]
# Clear itertools:
itertools.__dict__.clear()
# Clear all modules except __main__:
for module in list(sys.modules):
if module != "__main__":
del sys.modules[module]
# The only remaining non-cyclic references to copyreg and itertools are in this
# module's globals. copyreg is referenced by __main__ and itself:
assert gc.get_referrers(copyreg) == [globals(), copyreg.__dict__]
assert sys.getrefcount(copyreg) - 1 == len(gc.get_referrers(copyreg))
# itertools is referenced by __main__ and all of the itertools types:
assert gc.get_referrers(itertools)[0] == globals()
assert all(isinstance(t, type) and t.__module__ == "itertools" for t in gc.get_referrers(itertools)[1:])
assert sys.getrefcount(itertools) - 1 == len(gc.get_referrers(itertools))
# Crash! |
One thing I'm curious about (maybe others with GC or multi-phase init expertise can shed some light): how is the "type -> module -> type" cycle handled for these types? More concretely, it seems like there could be a few cycles here at shutdown:
Faced with these three cycles, it seems like it's only "correct" to clear the third, before moving on of the other two. But there's nothing stopping the GC from "picking" one of the other two to clear instead, which leads to this nasty situation. So it seems like relying on module state in Am I missing something? |
Confirming my hunch, the following patch fixes this for me: diff --git a/Modules/itertoolsmodule.c b/Modules/itertoolsmodule.c
index 164741495c..ab0f440bee 100644
--- a/Modules/itertoolsmodule.c
+++ b/Modules/itertoolsmodule.c
@@ -837,8 +837,7 @@ teedataobject_clear(teedataobject *tdo)
Py_CLEAR(tdo->values[i]);
tmp = tdo->nextlink;
tdo->nextlink = NULL;
- itertools_state *state = get_module_state_by_cls(Py_TYPE(tdo));
- teedataobject_safe_decref(tmp, state->teedataobject_type);
+ teedataobject_safe_decref(tmp, Py_TYPE(tdo));
return 0;
} I also think I found an unrelated refleak while chasing this down: diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index fe3b7b87c8..181d032328 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -6549,6 +6549,7 @@ reduce_newobj(PyObject *obj)
}
else {
/* args == NULL */
+ Py_DECREF(copyreg);
Py_DECREF(kwargs);
PyErr_BadInternalCall();
return NULL; |
I can open a PR for this tomorrow if others confirm that my understanding is correct. |
I don't know enough to comment on the GC, but I can confirm that, for me:
|
I ran into this while trying to track down https://bugs.debian.org/1063345 (a test failure in Debian's |
I scanned through the other modules. https://github.com/python/cpython/blob/main/Modules/_asynciomodule.c#L1600-L1617 |
Here's a tiny reproducer for the import sys, copyreg, _asyncio
copyreg.buggy_cycle = copyreg
copyreg.buggy_futureiter = iter(_asyncio.Future())
del sys.modules["copyreg"]
# Crash!
Commenting out the |
tp_dealloc
AFAICS (on mobile now), that patch looks correct. Thanks! |
@brandtbucher The following printf test fitted your tee examples for me, which just reset the GC track. I agree that relying on a module state at the modules-finalization phase is not safe on the current convention, if I am not missing something. cc @encukou @ericsnowcurrently teedataobject_newinternal(itertools_state *state, PyObject *it)
{
teedataobject *tdo;
tdo = PyObject_GC_New(teedataobject, state->teedataobject_type);
...
PyObject_GC_Track(tdo);
+ PyTypeObject *type = Py_TYPE(tdo);
+ PyObject *module = PyType_GetModule(type);
+
+ // _PyObject_GC_UNTRACK(type); _PyObject_GC_TRACK(type); // (T)
+ // _PyObject_GC_UNTRACK(module); _PyObject_GC_TRACK(module); // (M)
...
}
teedataobject_dealloc(teedataobject *tdo)
{
PyTypeObject *tp = Py_TYPE(tdo);
+ printf("teedataobject_dealloc %p\n", ((PyHeapTypeObject *)tp)->ht_module);
...
}
itertoolsmodule_clear(PyObject *mod)
{
+ printf("itertoolsmodule_clear\n");
...
}
itertoolsmodule_free(void *mod)
{
+ printf("itertoolsmodule_free\n");
- (void)itertoolsmodule_clear((PyObject *)mod);
}
Another reproducer: # import teeholder, itertools # no segfault
import itertools, teeholder # use-after-itertools-free
teeholder.buggy_tee = itertools.tee(())
def foo():
pass |
I have somebody who’s interested in working on the |
Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
…ythonGH-116204) (cherry picked from commit e2fcaf1) Co-authored-by: Erlend E. Aasland <erlend@python.org> Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
…ython#116204) Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
…ython#116204) Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
tp_dealloc
tp_dealloc
(itertools teedataobject clear)
I found a very similar bug in the tmt project: teemtee/tmt#2501 (comment). It's good to know that it has already been fixed :-) tmt crash: the Python crash occurs in 3 steps:
|
Do you think it is possible to get the fix merged in time for the next 3.12 release? |
AFACT, this no longer repros on main but I'm not aware of any fixes that landed addressing the segfault. |
I bisected it down to commit 8bef34f, which modified the garbage collector. Also, |
…ython#116204) Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
(cherry picked from commit d8f3503) Co-authored-by: Savannah Ostrowski <savannahostrowski@gmail.com>
I suggest to open a new issue since this one is closed. |
Crash report
What happened?
Running the file from the terminal with
python3.12 minimal.py
is sufficient. When the interpreter exits, it segfaults.Crash does not occur in python 3.8-3.11, but does occur in 3.12 and 3.13. Crash not observed on Windows with 3.12.
Backtrace:
CPython versions tested on:
3.12
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.12.1 (main, Dec 18 2023, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)]
Linked PRs
FutureIter_dealloc
#117741The text was updated successfully, but these errors were encountered: