New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-113225: Speed up pathlib.Path.glob()
#113226
Conversation
Use `os.DirEntry.path` as the string representation of child paths, unless the parent path is empty, in which case we use the entry `name`.
Up to 15% faster in some simple tests: $ ./python -m timeit -s "from pathlib import Path" "list(Path().glob('**/*', follow_symlinks=True))"
5 loops, best of 5: 78.8 msec per loop # before
5 loops, best of 5: 67.9 msec per loop # after
# --> 1.16x faster
$ ./python -m timeit -s "from pathlib import Path" "list(Path.cwd().glob('**/*', follow_symlinks=True))"
5 loops, best of 5: 79.6 msec per loop # before
5 loops, best of 5: 70.5 msec per loop # after
# --> 1.13x faster edit: patch revised; test cases for |
How does it work for |
Same result but much slower (!), as the $ ./python -m timeit -s "from pathlib import Path" "[p.name for p in Path.cwd().iterdir()]"
2000 loops, best of 5: 139 usec per loop # before
1000 loops, best of 5: 371 usec per loop # after Will fix! Thanks. |
pathlib.Path.iterdir()
and glob()
pathlib.Path.glob()
I've undone the change to With |
I've merged the ABC-specific changes separately (#113556) so this PR should be pretty laser-focused now :D |
Patch further revised so that we continue to set $ ./python -m timeit -s "from pathlib import Path" "list(Path().glob('**/*', follow_symlinks=True))"
5 loops, best of 5: 84.3 msec per loop # before
5 loops, best of 5: 80.1 msec per loop # after
$ ./python -m timeit -s "from pathlib import Path" "list(Path.cwd().glob('**/*', follow_symlinks=True))"
5 loops, best of 5: 84.4 msec per loop # before
5 loops, best of 5: 82 msec per loop # after |
Use
os.DirEntry.path
as the string representation of child paths.Corner case: when we
os.scandir('.')
, thepath
atttribute will resemble'./foo'
, which isn't normalized by pathlib standards. In this case we use the entryname
, which looks like'foo'
.pathlib.Path.iterdir()
and friends by usingos.DirEntry.path
#113225