Stack overflow collecting PGO data on Windows #113655
Comments
Trying to repro, I can't even manage to complete a PGO build (first step above).
It sounds like you are repro'ing (I probably wasn't clear enough). It fails during the PGO build while running the tests that generate PGO data.
It looks like 45e09f9 introduced the failing test, so it could just be that the test is incorrect (rather than a test going from passing to failing).
Ah, you're right, the output similar to yours scrolled off my screen. :-( You should be able to experiment with some of the numbers changed by the PR to see which one affects the Windows PGO build. The stack limit is definitely an ongoing discussion.
Windows release buildbots are similarly failing: https://buildbot.python.org/all/#/builders/914/builds/3274/steps/4/logs/stdio
So it is crashing in this addition to
My conclusion is that the increased default C recursion limit is too large for the platform and we hit a stack overflow. Maybe try setting Py_C_RECURSION_LIMIT to something smaller than 8000? (I am trying now with 4000.)
Yes -- my experimentation arrived at that too. 5,000 is too large, but 4,000 seems to work. See #113668.
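For readers following along, here is a minimal sketch of how a counter-based recursion guard with a compile-time limit behaves. It is an illustration only: DEMO_C_RECURSION_LIMIT and every demo_* name are hypothetical and are not CPython's actual macros or functions. The point is that such a guard only counts calls, so whether a given limit is safe depends entirely on how many bytes each counted call really uses on the platform.

```c
#include <stdio.h>

/* Hypothetical limit for illustration. The thread reports 4,000 working
   and 5,000 being too large for the Windows PGO build. */
#define DEMO_C_RECURSION_LIMIT 4000

static int demo_depth = 0;

/* Returns 0 on success, -1 if entering would exceed the limit. */
static int demo_enter_recursive_call(void)
{
    if (demo_depth >= DEMO_C_RECURSION_LIMIT) {
        return -1;
    }
    demo_depth++;
    return 0;
}

static void demo_leave_recursive_call(void)
{
    demo_depth--;
}

/* Recurses n levels. Returns the depth reached, or -1 if the guard tripped. */
int demo_recurse(int n)
{
    if (n == 0) {
        return 0;
    }
    if (demo_enter_recursive_call() < 0) {
        return -1;
    }
    int r = demo_recurse(n - 1);
    demo_leave_recursive_call();
    return r < 0 ? -1 : r + 1;
}
```

With this sketch, demo_recurse(4000) succeeds and demo_recurse(4001) reports an error, regardless of how much native stack is actually left.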
Also, this docstring (deleted by that PR) suggests a possible cause:
Maybe we are running into the inlining problem even in non-debug mode?
Let's put this on tomorrow's agenda.
Since Windows 8, we've had
If we have any references to stack variables that are passed between frames (e.g. a pointer to a local in EvalFrame that's available in the next EvalFrame) then we can get the size of each recursion from the current build. Of course, that'll vary based on how it recurses (and how many of our own native APIs it goes through), but it's likely better than guessing. At the very least, it might be helpful for some assertions to detect when the guesses are invalidated.
A true solution would be to use structured exception handling and catch the stack overflow in EvalFrame. That's going to leave things in a pretty messy state though (leaked references at a minimum, and likely worse), so I think it's better to just crash.
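The idea of getting the per-recursion size from a pointer to a local passed between frames could look roughly like the sketch below. It assumes a contiguous native stack and relies on comparing addresses of unrelated locals through uintptr_t, which is implementation-defined. All demo_* names are hypothetical, and the recursive function stands in for EvalFrame rather than reproducing it.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(_MSC_VER)
#define DEMO_NOINLINE __declspec(noinline)
#else
#define DEMO_NOINLINE __attribute__((noinline))
#endif

/* Largest distance observed between markers in two adjacent frames. */
static size_t demo_max_frame_bytes = 0;

/* Recurses `depth` levels. Each frame receives a pointer to a local in its
   caller and records how far apart the two locals are. */
DEMO_NOINLINE int demo_measure(int depth, volatile char *caller_marker)
{
    volatile char marker = 0;
    if (caller_marker != NULL) {
        uintptr_t a = (uintptr_t)caller_marker;
        uintptr_t b = (uintptr_t)&marker;
        size_t delta = a > b ? (size_t)(a - b) : (size_t)(b - a);
        if (delta > demo_max_frame_bytes) {
            demo_max_frame_bytes = delta;
        }
    }
    if (depth > 0) {
        int r = demo_measure(depth - 1, &marker);
        marker = (char)r;  /* touch the frame after the call so it is not a tail call */
        return r + 1;
    }
    return 0;
}

/* Rough estimate of how many such recursions fit in `stack_bytes`.
   Returns 0 if nothing has been measured yet. */
size_t demo_estimated_max_depth(size_t stack_bytes)
{
    if (demo_max_frame_bytes == 0) {
        return 0;
    }
    return stack_bytes / demo_max_frame_bytes;
}
```

A measurement like this could back an assertion that a configured recursion limit still fits the stack of the current build, though as noted above the real per-recursion cost varies with the call path.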
We could also just check this against the address of a local in a function and raise if we're within some distance from the extent of the stack. It won't be a predictable number of recursions, and it may still be possible to use native recursion to cause a crash, but it should be fairly simple to implement and isn't really any worse than the alternatives.
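A rough sketch of that address-based check follows, under loud assumptions: the stack extent here is a hypothetical fixed budget measured from a recorded base address rather than a value queried from the operating system, and DEMO_STACK_BUDGET, DEMO_STACK_MARGIN and all demo_* names are invented for illustration. Returning -1 stands in for raising an exception.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(_MSC_VER)
#define DEMO_NOINLINE __declspec(noinline)
#else
#define DEMO_NOINLINE __attribute__((noinline))
#endif

/* Assumed usable stack and safety margin, in bytes (hypothetical values). */
#define DEMO_STACK_BUDGET ((size_t)256 * 1024)
#define DEMO_STACK_MARGIN ((size_t)64 * 1024)

static uintptr_t demo_stack_base = 0;

/* Record the address of a local near the top of the call chain. */
void demo_set_stack_base(const volatile void *anchor)
{
    demo_stack_base = (uintptr_t)anchor;
}

/* Returns 1 when the current frame is within the margin of the budget. */
DEMO_NOINLINE int demo_stack_nearly_exhausted(void)
{
    volatile char here = 0;
    uintptr_t now = (uintptr_t)&here;
    size_t used = demo_stack_base > now
        ? (size_t)(demo_stack_base - now)
        : (size_t)(now - demo_stack_base);
    return used > DEMO_STACK_BUDGET - DEMO_STACK_MARGIN;
}

/* Recurses n levels, but reports an error (-1) instead of running off
   the end of the assumed stack budget. */
DEMO_NOINLINE int demo_deep(int n)
{
    volatile char pad[512];  /* make each frame noticeably large */
    pad[0] = (char)n;
    if (demo_stack_nearly_exhausted()) {
        return -1;
    }
    if (n == 0) {
        return 0;
    }
    int r = demo_deep(n - 1);
    pad[0] = 0;  /* touch the frame after the call so it is not a tail call */
    return r < 0 ? -1 : r + 1;
}
```

As the comment says, the depth at which this trips depends on frame sizes, so it is not a predictable number of recursions. In this sketch the check only runs inside demo_deep, so native recursion that never calls it could still overflow.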
Bug report
Bug description:
During a PGO build on Windows on main (471aa75), a test in test_functools fails with a stack overflow.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Windows