Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-89727: Fix os.walk RecursionError on deep trees #99803

Merged
merged 16 commits into from Dec 19, 2022

Conversation

jonburdo
Copy link
Contributor

@jonburdo jonburdo commented Nov 27, 2022

Use a stack to implement os.walk iteratively instead of recursively to avoid hitting recursion limits on deeply nested trees.

@bedevere-bot
Copy link

bedevere-bot commented Nov 27, 2022

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@cpython-cla-bot
Copy link

cpython-cla-bot bot commented Nov 27, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@careljonkhout
Copy link

careljonkhout commented Nov 29, 2022

These changes look good to me. I would sure like to see this issue fixed. Please add a news entry as suggested by the belverdere bot, so that this PR can move along.

Use a stack to implement os.walk iteratively instead of recursively to
avoid hitting recursion limits on deeply nested trees.
@carljm
Copy link
Contributor

carljm commented Dec 15, 2022

Just a note, it's better to push additional commits instead of force-pushing an amended commit. The latter invalidates previous reviews and generally makes it harder for reviewers to see the changes over time. The multiple commits will get squashed on merge anyway. (This also applies to updating from main: merge in main, don't rebase.)

Copy link
Contributor

@carljm carljm left a comment

Code changes LGTM.

I think a test should be added here that the recursion limit does not limit depth for os.walk, to prevent regression of the bug. A super-deep directory structure should not be needed for this test, because the test can temporarily set a low recursion limit. It looks like test.support.infinite_recursion can be used for this, although it's badly named for the purpose; should probably move the body of it to a new test.support.temp_recursion_limit function (with no default depth value) and make test.support.infinite_recursion a thin wrapper around it that just provides the default value of 75.

@carljm
Copy link
Contributor

carljm commented Dec 15, 2022

(It's not clear to me from the logs whether the Win x64 test failure is related or just flaky; if it fails again after update, then it deserves further investigation.)

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Dec 15, 2022

The Windows error (PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\a\\cpython\\cpython\\build\\test_python_6880�\\test_python_worker_1068�') looks like a flake to me; I've seen it randomly show up before.

Lib/os.py Outdated
@@ -343,86 +343,96 @@ def walk(top, topdown=True, onerror=None, followlinks=False):
return _walk(fspath(top), topdown, onerror, followlinks)

def _walk(top, topdown, onerror, followlinks):
Copy link
Contributor

@Ovsyanka83 Ovsyanka83 Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we remove the _walk function and simply put everything under walk because the second one was only necessary because of sys.audit not making sense for recursive calls.

Lib/os.py Outdated
return
stack = [(False, top)]
while stack:
is_result, top = stack.pop()
Copy link
Contributor

@Ovsyanka83 Ovsyanka83 Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the is_result name for some reason. I feel like we need something better but I cannot come up with anything yet.

In case we will not find a better name, I believe we should add a comment describing what this variable signifies.

Copy link
Contributor Author

@jonburdo jonburdo Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't sure what to call it. Maybe must_yield?

Copy link
Contributor

@Ovsyanka83 Ovsyanka83 Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like that would make a lot more sense, yeah. I used must_be_yielded_immediately in my PR but it's a questionable name.

Lib/os.py Outdated
if onerror is not None:
onerror(error)
return
stack = [(False, top)]
Copy link
Contributor

@Ovsyanka83 Ovsyanka83 Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it make more sense to use a deque here? They are optimized for appending and popping while having the same interface.

Copy link
Contributor

@carljm carljm Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when append and pop occur only at the end (as in this case), performance of list and deque is basically the same. When append/pop can occur at the start, then you need deque ("double-ended queue") to avoid O(n) copy.

Copy link
Contributor Author

@jonburdo jonburdo Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory I think a deque is a slightly better fit, since we're only popping and appending (at a certain point list has to reallocate right?), but I don't think there's much of a difference in this case. I changed it to a deque, but can change back to a list if that's preferred

Copy link
Contributor

@carljm carljm Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, list has to realloc occasionally, but proportional over-allocation means that this is amortized linear cost over a long sequence of appends (as in this use case). Deque in the same usage scenario has to repeatedly allocate new "chunks" of a fixed size, so this is also linear cost. I wrote a small microbenchmark modeling "hundreds of thousands of appends, then pop them all off" and there is no detectable performance difference on my machine. I personally don't think deque has any advantage here that makes it worth the import (which has to be inline to avoid a module-level dependency, adding some cost to every call.)

Copy link
Contributor Author

@jonburdo jonburdo Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good points. I just did some rough benchmarking as well and found no real difference.

Also, I didn't find the exact cause, but adding from collections import deque to the top of Lib/os.py seemed to cause test failures under test_embed and test_site. Moving the import inside of the walk function seems to work, but at that point I agree it's not worth it.

I'll switch back to list

@@ -2178,19 +2178,23 @@ def check_disallow_instantiation(testcase, tp, *args, **kwds):
testcase.assertRaisesRegex(TypeError, msg, tp, *args, **kwds)

@contextlib.contextmanager
def temp_recursion_limit(limit):
Copy link
Contributor

@Ovsyanka83 Ovsyanka83 Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that it's a contextmanager, I believe the better name would be "set_recursion_limit" similar to "redirect_stdout" because the nature of a contextmanager implies that it's temporary.

carljm
carljm approved these changes Dec 16, 2022
Copy link
Contributor

@carljm carljm left a comment

Modulo still somewhat preferring list over needing to import deque, the current version seems good to me.

@jonburdo
Copy link
Contributor Author

jonburdo commented Dec 17, 2022

Btw, to get a rough estimate, I did a quick performance test on my machine to compare main with this branch:

# commit d4052d835bd6f1b648d533dab8c228b6e3944651
pyperf timeit -s 'import os; p = os.path.expanduser("~")' 'list(os.walk(p))' -o main.json
# commit 507b650a7d8dc01b6115f434591238699ef8ad38
pyperf timeit -s 'import os; p = os.path.expanduser("~")' 'list(os.walk(p))' -o iterative-os-walk.json
pyperf compare_to main.json iterative-os-walk.json

This gives:

Mean +- std dev: [main] 4.04 sec +- 0.04 sec -> [iterative-os-walk] 3.76 sec +- 0.04 sec: 1.07x faster

Looks like I get around a 7% improvement.

@jonburdo jonburdo requested a review from Ovsyanka83 Dec 17, 2022
Copy link
Contributor

@Ovsyanka83 Ovsyanka83 left a comment

LGTM. I am still iffy about the must_yield variable but it's the lesser evil compared to any other name I thought of.

Thank you for your hard work!

@jonburdo
Copy link
Contributor Author

jonburdo commented Dec 17, 2022

LGTM. I am still iffy about the must_yield variable but it's the lesser evil compared to any other name I thought of.

Yeah, I couldn't think of anything better. Maybe we'll come up with a better name in the future

Thank you for your hard work!

Happy to do it! Thanks for the help :)

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Thanks, I have a few nits.

os.makedirs(os.path.join(self.walk_path, *(['d'] * 50)))
with set_recursion_limit(50):
all = list(self.walk(self.walk_path))
self.assertEqual(len(all), 54)
Copy link
Member

@JelleZijlstra JelleZijlstra Dec 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also test that the return value is correct?

Copy link
Contributor Author

@jonburdo jonburdo Dec 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea

Lib/os.py Outdated Show resolved Hide resolved
Lib/os.py Outdated Show resolved Hide resolved
Lib/os.py Outdated Show resolved Hide resolved
Lib/os.py Outdated
if topdown:
yield top, dirs, nondirs
# Traverse into sub-directories
islink, join = path.islink, path.join
Copy link
Member

@JelleZijlstra JelleZijlstra Dec 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're caching these functions in locals, might as well do it outside the main while loop to micro-optimize a bit more.

Lib/os.py Show resolved Hide resolved
Lib/os.py Outdated Show resolved Hide resolved
@python python deleted a comment from netlify bot Dec 18, 2022
@jonburdo jonburdo requested a review from JelleZijlstra Dec 19, 2022
@JelleZijlstra
Copy link
Member

JelleZijlstra commented Dec 19, 2022

From discussion on the issue, we shouldn't backport this, so I'm going to merge it into 3.12 (only) soon unless further feedback comes up.

Thanks for your contribution!

@JelleZijlstra JelleZijlstra self-assigned this Dec 19, 2022
@JelleZijlstra JelleZijlstra merged commit 797edb2 into python:main Dec 19, 2022
16 checks passed
@jonburdo jonburdo deleted the iterative-os-walk branch Dec 19, 2022
carljm added a commit to carljm/cpython that referenced this pull request Dec 19, 2022
* main:
  pythongh-89727: Fix os.walk RecursionError on deep trees (python#99803)
  Docs: Don't upload CI artifacts (python#100330)
  pythongh-94912: Added marker for non-standard coroutine function detection (python#99247)
  Correct CVE-2020-10735 documentation (python#100306)
  pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273)
  pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551)
  Docs: Use `PY_VERSION_HEX` for version comparison (python#100179)
  pythongh-97909: Fix markup for `PyMethodDef` members (python#100089)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
  pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205)
  pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
@Ovsyanka83
Copy link
Contributor

Ovsyanka83 commented Dec 19, 2022

That was an incredibly fast merge. Congrats!

jonburdo added a commit to jonburdo/cpython that referenced this pull request Dec 20, 2022
Use a stack to implement os.walk iteratively instead of recursively to
avoid hitting recursion limits on deeply nested trees.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants