New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-89727: Fix os.walk RecursionError on deep trees #99803
Conversation
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
These changes look good to me. I would sure like to see this issue fixed. Please add a news entry as suggested by the belverdere bot, so that this PR can move along. |
Use a stack to implement os.walk iteratively instead of recursively to avoid hitting recursion limits on deeply nested trees.
Just a note, it's better to push additional commits instead of force-pushing an amended commit. The latter invalidates previous reviews and generally makes it harder for reviewers to see the changes over time. The multiple commits will get squashed on merge anyway. (This also applies to updating from main: merge in main, don't rebase.) |
Code changes LGTM.
I think a test should be added here that the recursion limit does not limit depth for os.walk
, to prevent regression of the bug. A super-deep directory structure should not be needed for this test, because the test can temporarily set a low recursion limit. It looks like test.support.infinite_recursion
can be used for this, although it's badly named for the purpose; should probably move the body of it to a new test.support.temp_recursion_limit
function (with no default depth value) and make test.support.infinite_recursion
a thin wrapper around it that just provides the default value of 75.
(It's not clear to me from the logs whether the Win x64 test failure is related or just flaky; if it fails again after update, then it deserves further investigation.) |
The Windows error ( |
Lib/os.py
Outdated
@@ -343,86 +343,96 @@ def walk(top, topdown=True, onerror=None, followlinks=False): | |||
return _walk(fspath(top), topdown, onerror, followlinks) | |||
|
|||
def _walk(top, topdown, onerror, followlinks): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we remove the _walk
function and simply put everything under walk
because the second one was only necessary because of sys.audit not making sense for recursive calls.
Lib/os.py
Outdated
return | ||
stack = [(False, top)] | ||
while stack: | ||
is_result, top = stack.pop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the is_result
name for some reason. I feel like we need something better but I cannot come up with anything yet.
In case we will not find a better name, I believe we should add a comment describing what this variable signifies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't sure what to call it. Maybe must_yield
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like that would make a lot more sense, yeah. I used must_be_yielded_immediately
in my PR but it's a questionable name.
Lib/os.py
Outdated
if onerror is not None: | ||
onerror(error) | ||
return | ||
stack = [(False, top)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it make more sense to use a deque here? They are optimized for appending and popping while having the same interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when append and pop occur only at the end (as in this case), performance of list and deque is basically the same. When append/pop can occur at the start, then you need deque ("double-ended queue") to avoid O(n) copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory I think a deque is a slightly better fit, since we're only popping and appending (at a certain point list
has to reallocate right?), but I don't think there's much of a difference in this case. I changed it to a deque, but can change back to a list if that's preferred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, list has to realloc
occasionally, but proportional over-allocation means that this is amortized linear cost over a long sequence of appends (as in this use case). Deque in the same usage scenario has to repeatedly allocate new "chunks" of a fixed size, so this is also linear cost. I wrote a small microbenchmark modeling "hundreds of thousands of appends, then pop them all off" and there is no detectable performance difference on my machine. I personally don't think deque has any advantage here that makes it worth the import (which has to be inline to avoid a module-level dependency, adding some cost to every call.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good points. I just did some rough benchmarking as well and found no real difference.
Also, I didn't find the exact cause, but adding from collections import deque
to the top of Lib/os.py
seemed to cause test failures under test_embed
and test_site
. Moving the import inside of the walk
function seems to work, but at that point I agree it's not worth it.
I'll switch back to list
Lib/test/support/__init__.py
Outdated
@@ -2178,19 +2178,23 @@ def check_disallow_instantiation(testcase, tp, *args, **kwds): | |||
testcase.assertRaisesRegex(TypeError, msg, tp, *args, **kwds) | |||
|
|||
@contextlib.contextmanager | |||
def temp_recursion_limit(limit): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that it's a contextmanager, I believe the better name would be "set_recursion_limit" similar to "redirect_stdout" because the nature of a contextmanager implies that it's temporary.
Btw, to get a rough estimate, I did a quick performance test on my machine to compare
This gives:
Looks like I get around a 7% improvement. |
Yeah, I couldn't think of anything better. Maybe we'll come up with a better name in the future
Happy to do it! Thanks for the help :) |
Lib/test/test_os.py
Outdated
os.makedirs(os.path.join(self.walk_path, *(['d'] * 50))) | ||
with set_recursion_limit(50): | ||
all = list(self.walk(self.walk_path)) | ||
self.assertEqual(len(all), 54) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also test that the return value is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good idea
Lib/os.py
Outdated
if topdown: | ||
yield top, dirs, nondirs | ||
# Traverse into sub-directories | ||
islink, join = path.islink, path.join |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're caching these functions in locals, might as well do it outside the main while loop to micro-optimize a bit more.
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
From discussion on the issue, we shouldn't backport this, so I'm going to merge it into 3.12 (only) soon unless further feedback comes up. Thanks for your contribution! |
* main: pythongh-89727: Fix os.walk RecursionError on deep trees (python#99803) Docs: Don't upload CI artifacts (python#100330) pythongh-94912: Added marker for non-standard coroutine function detection (python#99247) Correct CVE-2020-10735 documentation (python#100306) pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273) pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551) Docs: Use `PY_VERSION_HEX` for version comparison (python#100179) pythongh-97909: Fix markup for `PyMethodDef` members (python#100089) pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890) pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890) pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205) pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
That was an incredibly fast merge. Congrats! |
Use a stack to implement os.walk iteratively instead of recursively to avoid hitting recursion limits on deeply nested trees.
Use a stack to implement os.walk iteratively instead of recursively to avoid hitting recursion limits on deeply nested trees.