Segfault with pydebug mode with Python 3.12.0a5+ #113793

Open
shadchin opened this issue Jan 7, 2024 · 0 comments
Labels
type-crash A hard crash of the interpreter, possibly with a core dump

shadchin commented Jan 7, 2024

Crash report

What happened?

(Sorry for my English.)

MRE:

  • Check out the 3.12 branch - git checkout 3.12
  • Build in pydebug mode - ./configure --with-pydebug && make
  • Create a venv and activate it - ./python -m venv venv && . venv/bin/activate
  • Install packages - python -m pip install pytest pytest-asyncio grpcio
  • Create test_grpc.py
import grpc
import pytest


@pytest.fixture
async def grpc_channel():
    t = None

    try:
        1/0
    except Exception as e:
        t = e.__traceback__  # Replacing this line with `pass` or `e.__traceback__ = None` avoids the segmentation fault

    async with grpc.aio.insecure_channel("localhost:8091") as channel:
        yield channel


async def test_say_hello(grpc_channel):
    pass
  • Run the test - python -m pytest test_grpc.py
  • Segmentation fault
0x00005555558e8add in holds_gil (tstate=tstate@entry=0x555555da9040) at Python/pystate.c:357
357         _PyRuntimeState *runtime = tstate->interp->runtime;
(gdb) bt
#0  0x00005555558e8add in holds_gil (tstate=tstate@entry=0x555555da9040) at Python/pystate.c:357
#1  0x00005555558eb70c in PyGILState_Ensure () at Python/pystate.c:2232
#2  0x00007ffff590bc8c in ?? () from /home/shadchin/t/cpython/venv/lib/python3.12/site-packages/grpc/_cython/cygrpc.cpython-312-x86_64-linux-gnu.so
#3  0x00007ffff583f27c in ?? () from /home/shadchin/t/cpython/venv/lib/python3.12/site-packages/grpc/_cython/cygrpc.cpython-312-x86_64-linux-gnu.so
#4  0x0000555555774dc0 in cfunction_vectorcall_NOARGS (func=<built-in method _poll_wrapper of grpc._cython.cygrpc.PollerCompletionQueue object at remote 0x7ffff5ef5a90>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/methodobject.c:481
#5  0x0000555555708805 in _PyVectorcall_Call (tstate=tstate@entry=0x555555da9040, func=0x555555774a66 <cfunction_vectorcall_NOARGS>, callable=callable@entry=<built-in method _poll_wrapper of grpc._cython.cygrpc.PollerCompletionQueue object at remote 0x7ffff5ef5a90>, tuple=tuple@entry=(), kwargs=kwargs@entry={}) at Objects/call.c:271
#6  0x0000555555708bcb in _PyObject_Call (tstate=0x555555da9040, callable=callable@entry=<built-in method _poll_wrapper of grpc._cython.cygrpc.PollerCompletionQueue object at remote 0x7ffff5ef5a90>, args=args@entry=(), kwargs=kwargs@entry={}) at Objects/call.c:354
#7  0x0000555555708c23 in PyObject_Call (callable=callable@entry=<built-in method _poll_wrapper of grpc._cython.cygrpc.PollerCompletionQueue object at remote 0x7ffff5ef5a90>, args=args@entry=(), kwargs=kwargs@entry={}) at Objects/call.c:379
#8  0x00005555558754ec in _PyEval_EvalFrameDefault (tstate=tstate@entry=0x555555da9040, frame=0x7ffff6f5c110, throwflag=throwflag@entry=0) at Python/bytecodes.c:3254
#9  0x0000555555877f78 in _PyEval_EvalFrame (throwflag=0, frame=<optimized out>, tstate=0x555555da9040) at ./Include/internal/pycore_ceval.h:89
#10 _PyEval_Vector (tstate=0x555555da9040, func=0x7ffff7027a10, locals=locals@entry=0x0, args=0x7fffeeffcd88, argcount=1, kwnames=0x0) at Python/ceval.c:1683
#11 0x0000555555705a62 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:419
#12 0x000055555570a7dd in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x7fffeeffcd88, callable=<function at remote 0x7ffff7027a10>, tstate=0x555555da9040) at ./Include/internal/pycore_call.h:92
#13 method_vectorcall (method=<optimized out>, args=0x555555ce0fa0 <_PyRuntime+92288>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/classobject.c:69
#14 0x0000555555708805 in _PyVectorcall_Call (tstate=tstate@entry=0x555555da9040, func=0x55555570a369 <method_vectorcall>, callable=callable@entry=<method at remote 0x7ffff509a030>, tuple=tuple@entry=(), kwargs=kwargs@entry=0x0) at Objects/call.c:271
#15 0x0000555555708bcb in _PyObject_Call (tstate=0x555555da9040, callable=<method at remote 0x7ffff509a030>, args=(), kwargs=0x0) at Objects/call.c:354
#16 0x0000555555708c23 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:379
#17 0x000055555599bf11 in thread_run (boot_raw=boot_raw@entry=0x5555560ee070) at ./Modules/_threadmodule.c:1114
#18 0x0000555555908fb3 in pythread_wrapper (arg=<optimized out>) at Python/thread_pthread.h:233
#19 0x00007ffff7f9a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#20 0x00007ffff7d65133 in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) info threads
  Id   Target Id            Frame
  1    LWP 2700610 "python" 0x0000555555920a34 in move_unreachable (young=young@entry=0x555555ce10c0 <_PyRuntime+92576>, unreachable=unreachable@entry=0x7fffffffdd60) at Modules/gcmodule.c:92
* 5    LWP 2700616 "python" 0x00005555558e8add in holds_gil (tstate=tstate@entry=0x555555da9040) at Python/pystate.c:357
(gdb) thread 1
[Switching to thread 1 (LWP 2700610)]
#0  0x0000555555920a34 in move_unreachable (young=young@entry=0x555555ce10c0 <_PyRuntime+92576>, unreachable=unreachable@entry=0x7fffffffdd60) at Modules/gcmodule.c:92
92          return (Py_ssize_t)(g->_gc_prev >> _PyGC_PREV_SHIFT);
(gdb) bt
#0  0x0000555555920a34 in move_unreachable (young=young@entry=0x555555ce10c0 <_PyRuntime+92576>, unreachable=unreachable@entry=0x7fffffffdd60) at Modules/gcmodule.c:92
#1  0x0000555555921795 in deduce_unreachable (unreachable=0x7fffffffdd60, base=0x555555ce10c0 <_PyRuntime+92576>) at Modules/gcmodule.c:1154
#2  gc_collect_main (tstate=0x555555d3e938 <_PyRuntime+475672>, generation=generation@entry=2, n_collected=n_collected@entry=0x0, n_uncollectable=n_uncollectable@entry=0x0, nofail=nofail@entry=1) at Modules/gcmodule.c:1242
#3  0x000055555592311a in _PyGC_CollectNoFail (tstate=tstate@entry=0x555555d3e938 <_PyRuntime+475672>) at Modules/gcmodule.c:2135
#4  0x00005555558da2ed in finalize_modules (tstate=tstate@entry=0x555555d3e938 <_PyRuntime+475672>) at Python/pylifecycle.c:1588
#5  0x00005555558e6e9a in Py_FinalizeEx () at Python/pylifecycle.c:1889
#6  0x000055555591fd60 in Py_RunMain () at Modules/main.c:711
#7  0x000055555591fdd2 in pymain_main (args=args@entry=0x7fffffffded0) at Modules/main.c:739
#8  0x000055555591fe97 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:763
#9  0x0000555555655766 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15
(gdb)

I bisected it: the crash first appears after commit 6036c3e

What's going on (maybe it will help):

  • The test starts; there is the main thread plus a thread created for AioChannel (AioChannel -> init_grpc_aio -> PollerCompletionQueue)
  • The test finishes (it passes) and the main thread is torn down, but the AioChannel thread is not. (AioChannel has not been deallocated: its refcount stays > 0 because the traceback refers to the frame that holds it)
  • The child thread keeps running, tries to access tstate->interp->runtime, and segfaults
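
To illustrate the refcount point in the second bullet, here is a minimal standalone sketch with no gRPC involved. `Resource` is a hypothetical stand-in for the channel, and the function names are made up for the example. The assumption is that storing `e.__traceback__` in a local creates a reference cycle (frame -> local `t` -> traceback -> frame), so everything in that frame stays alive until the cyclic GC runs, while the `pass` variant is released by plain reference counting on return.

```python
import gc
import weakref


class Resource:
    """Hypothetical stand-in for the gRPC channel held by the fixture frame."""


def keeps_traceback():
    res = Resource()
    ref = weakref.ref(res)
    try:
        1 / 0
    except Exception as e:
        # The traceback points at this frame, and this frame now holds the
        # traceback in the local `t`: a reference cycle.
        t = e.__traceback__
    return ref


def drops_traceback():
    res = Resource()
    ref = weakref.ref(res)
    try:
        1 / 0
    except Exception:
        pass  # nothing in the frame refers to the traceback afterwards
    return ref


gc.disable()  # make sure only an explicit collection can break the cycle
try:
    cycle_ref = keeps_traceback()
    plain_ref = drops_traceback()

    alive_with_traceback = cycle_ref() is not None
    alive_without_traceback = plain_ref() is not None

    gc.collect()  # the cyclic GC is what finally frees the frame and `res`
    alive_after_collect = cycle_ref() is not None
finally:
    gc.enable()

print("kept alive by traceback:", alive_with_traceback)
print("kept alive without traceback:", alive_without_traceback)
print("still alive after gc.collect():", alive_after_collect)
```

If this matches what happens in the fixture, the channel would only be released by a garbage collection pass, which in the backtrace above is running in thread 1 during Py_FinalizeEx while the poller thread is still active.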

There is probably a problem with the deallocation of tracebacks (frames).

I made a workaround that stops the crash (both the MRE and our real code no longer crash), but it does not address the root cause.

--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -354,6 +354,11 @@ holds_gil(PyThreadState *tstate)
     // XXX Fall back to tstate->interp->runtime->ceval.gil.last_holder
     // (and tstate->interp->runtime->ceval.gil.locked).
     assert(tstate != NULL);
+#ifndef NDEBUG
+    if (!tstate_is_alive(tstate)) {
+        return 0;
+    }
+#endif
     _PyRuntimeState *runtime = tstate->interp->runtime;
     /* Must be the tstate for this thread */
     assert(tstate == gilstate_tss_get(runtime));

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.12.1+ (heads/3.12:db6f297d44, Jan 7 2024, 18:26:09) [GCC 9.4.0]

@shadchin shadchin added the type-crash A hard crash of the interpreter, possibly with a core dump label Jan 7, 2024