Skip to content

gh-116159: argparse: performance improvement parsing large number of options #116162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 1, 2024

Conversation

amyreese
Copy link
Contributor

@amyreese amyreese commented Feb 29, 2024

When parsing positional vs optional arguments, the use of min with a
list comprehension inside of a loop results in quadratic time based
on the number of optional arguments given. When combined with use of
prefix based argument files and a large number of optional flags, this
can result in extremely slow parsing behavior.

This replaces the min call with a simple loop with a short circuit to
break at the next optional argument.

Example test case that exercises this performance behavior:
https://gist.github.com/amyreese/b825bc092210ea7b64287f033dc995d8

The test case parses a list of arguments that contains 30,000 copies
of --flag=something. Using latest 3.12 or 3.13 from main, this takes
roughly 15-16 seconds to parse. With the proposed fix, this only takes
80 milliseconds.

Before:

$ ~/opt/cpython/bin/python3.13 -VV
Python 3.13.0a4+ (heads/main:0656509033, Feb 29 2024, 11:59:43) [Clang 15.0.0 (clang-1500.1.0.2.5)]

$ ~/opt/cpython/bin/python3.13 test_many_flags.py
parsed args in 15,195,914 µs

With fix:

$ ~/opt/cpython/bin/python3.13 -VV
Python 3.13.0a4+ (heads/argparse-options:968aa13a1d, Feb 29 2024, 13:34:49) [Clang 15.0.0 (clang-1500.1.0.2.5)]

$ ~/opt/cpython/bin/python3.13 test_many_flags.py
parsed args in 79,787 µs

Co-authored-by: Zsolt Dollenstein zsol.zsol@gmail.com

…er of options

When parsing positional vs optional arguments, the use of min with a
list comprehension inside of a loop results in quadratic time based
on the number of optional arguments given. When combined with use of
prefix based argument files and a large number of optional flags, this
can result in extremely slow parsing behavior.

This replaces the min call with a simple loop with a short circuit to
break at the next optional argument.

Example test case that exercises this performance behavior:
https://gist.github.com/amyreese/b825bc092210ea7b64287f033dc995d8

The test case parses a list of arguments that contains 30,000 copies
of `--flag=something`. Using latest 3.12 or 3.13 from main, this takes
roughly 15-16 seconds to parse. With the proposed fix, this only takes
80 milliseconds.

Before:

```
$ ~/opt/cpython/bin/python3.13 -VV
Python 3.13.0a4+ (heads/main:0656509033, Feb 29 2024, 11:59:43) [Clang 15.0.0 (clang-1500.1.0.2.5)]

$ ~/opt/cpython/bin/python3.13 test_many_flags.py
parsed args in 15,195,914 µs
```

With fix:

```
$ ~/opt/cpython/bin/python3.13 -VV
Python 3.13.0a4+ (heads/argparse-options:968aa13a1d, Feb 29 2024, 13:34:49) [Clang 15.0.0 (clang-1500.1.0.2.5)]

$ ~/opt/cpython/bin/python3.13 test_many_flags.py
parsed args in 79,787 µs
```

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@erlend-aasland erlend-aasland merged commit 4a63098 into python:main Mar 1, 2024
@amyreese amyreese deleted the argparse-speedup branch March 1, 2024 19:04
woodruffw pushed a commit to woodruffw-forks/cpython that referenced this pull request Mar 4, 2024
…er of options (python#116162)

When parsing positional vs optional arguments, the use of min with a
list comprehension inside of a loop results in quadratic time based
on the number of optional arguments given. When combined with use of
prefix based argument files and a large number of optional flags, this
can result in extremely slow parsing behavior.

This replaces the min call with a simple loop with a short circuit to
break at the next optional argument.

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull request Mar 6, 2024
Summary:
Upstream PR: python/cpython#116162

This improves performance when parsing a large number of options
from multiple seconds to milliseconds.

pikachu_dance

Reviewed By: fried

Differential Revision: D54553600

fbshipit-source-id: da1ca6d110e701bbd491465c00627f29a08af3c5
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 25, 2024
…er of options (python#116162)

When parsing positional vs optional arguments, the use of min with a
list comprehension inside of a loop results in quadratic time based
on the number of optional arguments given. When combined with use of
prefix based argument files and a large number of optional flags, this
can result in extremely slow parsing behavior.

This replaces the min call with a simple loop with a short circuit to
break at the next optional argument.

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024
…er of options (python#116162)

When parsing positional vs optional arguments, the use of min with a
list comprehension inside of a loop results in quadratic time based
on the number of optional arguments given. When combined with use of
prefix based argument files and a large number of optional flags, this
can result in extremely slow parsing behavior.

This replaces the min call with a simple loop with a short circuit to
break at the next optional argument.

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants