New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-95778: CVE-2020-10735: Prevent DoS by very large int() #96499
Commits on Sep 2, 2022
-
CVE-2020-10735: Prevent DoS by very large int()
The text to int parser of Python's int type is not safe against malicious input. Very large input strings with hundred thousands of digits can consume several seconds. The int() now limit the maximum amount of an input string to 5,000 digits. For comparison total amount of protons in the observable universe is known as Eddington number. That number has 80 digits. Signed-off-by: Christian Heimes <christian@python.org>
-
-
-
-
-
-
-
Keep the warning, but remove advice about limiting input length in th…
…e json module. "It's advised to limit the input to a sensible length." isn't very helpful for the JSON module as, while technically true, the limit needed to avoid hitting things like the int<->str base 10 conversion this issue is about would make a significant percentage of actual JSON used in the real world impractical. we're limiting the int/str conversion length, that is the best that can be done here - unless someone wants to implement a feature request for JSON to reject int looking fields at a much smaller base10 length limit.
-
Renamed the APIs & too many other refactorings.
The lumps way too many changes together in one commit, but extricating them into a series of individual pieces doesn't seem worthwhile at this stage, it could be done by picking and choosing into a new branch if there is a reason to do so. Summarizing off the top of my head after rereading diffs too many times: - Renamed the environment variable, -X flag, and names of the new APIs. - Cleaned up the documentation. - Refactored and improved some tests. - Moved the definitions of the default and threshold to a non-public header. - Set the default to 2000 instead of disabled as the Steering Council agreed having a default limit made the most sense for our users. - Lowered the minimum threshold because there seems no harm in allowing people to have a lower limit. - Fixed the other-bases base-10 digits equivalent estimate to be consistent with the actual potential base-10 value instead of wildly off. Aimed for consistency of number size here rather than any attempt to match the performance between bases as that is easier for users to understand and doesn't change if our algorithm implementations change. - Left a couple of notes about what changes look good or not good for a security fix backported into a stable release. It appears the PyConfig struct is a public API so we cannot change its definition mid-release. Backporting is going to involve something outside of the normal code path wise to plumb the new value through. Ugh. Questions: * [ ] `test_embed` is failing in an odd manner... what's up with that? * [ ] Is `_pydecimal` important? This change makes some of its APIs not work based on the values being used. The unittest suite that tests both the C and Python Decimal implementations revealed this and needed a higher limit set. If it relies on ints for arbitrary precision Decimal numbers, limits are frequently going to make it practically unusuable. * [ ] Backporting... we should produce what we believe a 3.11/3.10/3.9 backport will look like in its own PR branch before making a final decision. The most annoying difficulty being the `struct PyConfig` issue. * [ ] I pondered having values >0 but <threshold just silently be rounded up to the threshold value instead of causing an error at Python startup time, but we do have other environment variables and flags that cause startup time errors based on their value so it doesn't seem to matter either way. If we did that, the setter should also do that instead of raising.
-
-
Stop tying to base10, just use string digits.
Renamed to `int_max_str_digits` and simplified the logic per @Y1hgs's comments on the earlier revision. Less code, less awkward, and simpler to explain. Underscores and the sign are uncounted because that makes for the easiest implementation.
-
-
-
-
Playing it safe, if this lands in 3.11 before 3.11.0 these can be updated to 3.11.
-
-
-
-
Undo addition to PyConfig to ease backporting.
Release branches MUST NOT have `struct PyConfig` change. Doing it this way initially for 3.12 simplifies the backporting process. This commit is intended to be reverted in 3.12 after this lands in older releases.
-
-
-
-
-
-
Raise the default limit and the threshold.
Intended to minimize disruption. Based on real world testing across a huge Python code base including a large corpus of open source project test suites.
-
Remove xmlrpc.client changes, test-only.
These aren't quite right and belong in their own feature request anyways. The test validates that our overall limit triggers to acknowledge that the protection is desirable here.
-
-
-
-
-
-
-
-
-
-
Update attribution in Misc/NEWS.d
Co-authored-by: Christian Heimes <christian@python.org>
-
-
-
-
-
-
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
-
Co-authored-by: Steve Dower <steve.dower@microsoft.com>