Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-95778: CVE-2020-10735: Prevent DoS by very large int() #96499

Merged
merged 45 commits into from Sep 2, 2022

Commits on Sep 2, 2022

  1. CVE-2020-10735: Prevent DoS by very large int()

    The text to int parser of Python's int type is not safe against
    malicious input. Very large input strings with hundred thousands of
    digits can consume several seconds.
    
    The int() now limit the maximum amount of an input string to
    5,000 digits.
    
    For comparison total amount of protons in the observable universe is
    known as Eddington number. That number has 80 digits.
    
    Signed-off-by: Christian Heimes <christian@python.org>
    tiran authored and gpshead committed Sep 2, 2022
  2. fix typo

    tiran authored and gpshead committed Sep 2, 2022
  3. More docs (WIP)

    tiran authored and gpshead committed Sep 2, 2022
  4. Basic documentation for sys functions

    tiran authored and gpshead committed Sep 2, 2022
  5. Fix CI

    tiran authored and gpshead committed Sep 2, 2022
  6. Address Greg's review

    tiran authored and gpshead committed Sep 2, 2022
  7. Fix sys.flags len and docs

    tiran authored and gpshead committed Sep 2, 2022
  8. Keep the warning, but remove advice about limiting input length in th…

    …e json module.
    
    "It's advised to limit the input to a sensible length." isn't very helpful for the JSON module as, while technically true, the limit needed to avoid hitting things like the int<->str base 10 conversion this issue is about would make a significant percentage of actual JSON used in the real world impractical.
    
    we're limiting the int/str conversion length, that is the best that can be done here - unless someone wants to implement a feature request for JSON to reject int looking fields at a much smaller base10 length limit.
    gpshead committed Sep 2, 2022
  9. Renamed the APIs & too many other refactorings.

    The lumps way too many changes together in one commit, but extricating
    them into a series of individual pieces doesn't seem worthwhile at this
    stage, it could be done by picking and choosing into a new branch if
    there is a reason to do so.  Summarizing off the top of my head after
    rereading diffs too many times:
    
    - Renamed the environment variable, -X flag, and names of the new APIs.
    - Cleaned up the documentation.
    - Refactored and improved some tests.
    - Moved the definitions of the default and threshold to a non-public
      header.
    - Set the default to 2000 instead of disabled as the Steering Council
      agreed having a default limit made the most sense for our users.
    - Lowered the minimum threshold because there seems no harm in allowing
      people to have a lower limit.
    - Fixed the other-bases base-10 digits equivalent estimate to be
      consistent with the actual potential base-10 value instead of wildly
      off. Aimed for consistency of number size here rather than any attempt
      to match the performance between bases as that is easier for users to
      understand and doesn't change if our algorithm implementations change.
    - Left a couple of notes about what changes look good or not good for a
      security fix backported into a stable release.  It appears the PyConfig
      struct is a public API so we cannot change its definition mid-release.
      Backporting is going to involve something outside of the normal code
      path wise to plumb the new value through.  Ugh.
    
    Questions:
     * [ ] `test_embed` is failing in an odd manner... what's up with that?
     * [ ] Is `_pydecimal` important?  This change makes some of its APIs
       not work based on the values being used.  The unittest suite that
       tests both the C and Python Decimal implementations revealed this
       and needed a higher limit set.  If it relies on ints for arbitrary
       precision Decimal numbers, limits are frequently going to make it
       practically unusuable.
     * [ ] Backporting... we should produce what we believe a 3.11/3.10/3.9
       backport will look like in its own PR branch before making a final
       decision. The most annoying difficulty being the `struct PyConfig`
       issue.
     * [ ] I pondered having values >0 but <threshold just silently be
       rounded up to the threshold value instead of causing an error at
       Python startup time, but we do have other environment variables and
       flags that cause startup time errors based on their value so it
       doesn't seem to matter either way.  If we did that, the setter
       should also do that instead of raising.
    gpshead committed Sep 2, 2022
  10. Stop tying to base10, just use string digits.

    Renamed to `int_max_str_digits` and simplified the logic per @Y1hgs's
    comments on the earlier revision.  Less code, less awkward, and simpler
    to explain.
    
    Underscores and the sign are uncounted because that makes for the
    easiest implementation.
    gpshead committed Sep 2, 2022
  11. versionadded/changed to 3.12

    Playing it safe, if this lands in 3.11 before 3.11.0 these can be
    updated to 3.11.
    gpshead committed Sep 2, 2022
  12. Add a What's New entry.

    gpshead committed Sep 2, 2022
  13. Add a Misc/NEWS.d entry.

    gpshead committed Sep 2, 2022
  14. Undo addition to PyConfig to ease backporting.

    Release branches MUST NOT have `struct PyConfig` change.  Doing it this
    way initially for 3.12 simplifies the backporting process.  This commit
    is intended to be reverted in 3.12 after this lands in older releases.
    gpshead committed Sep 2, 2022
  15. Remove added unused imports.

    gpshead committed Sep 2, 2022
  16. Tabs -> Spaces

    gpshead committed Sep 2, 2022
  17. Raise the default limit and the threshold.

    Intended to minimize disruption. Based on real world testing across a
    huge Python code base including a large corpus of open source project
    test suites.
    gpshead committed Sep 2, 2022
  18. Remove xmlrpc.client changes, test-only.

    These aren't quite right and belong in their own feature request
    anyways.  The test validates that our overall limit triggers to
    acknowledge that the protection is desirable here.
    gpshead committed Sep 2, 2022
  19. Documentation cleanup.

    gpshead committed Sep 2, 2022
  20. Update attribution in Misc/NEWS.d

    Co-authored-by: Christian Heimes <christian@python.org>
    gpshead and tiran committed Sep 2, 2022
  21. Regen global strings

    tiran authored and gpshead committed Sep 2, 2022
  22. Fix the docs build.

    gpshead committed Sep 2, 2022
  23. Hexi hexa

    Co-authored-by: Steve Dower <steve.dower@microsoft.com>
    tiran and zooba committed Sep 2, 2022
  24. Hexi hexa 2

    Co-authored-by: Steve Dower <steve.dower@microsoft.com>
    tiran and zooba committed Sep 2, 2022