Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39503: Fix urllib basic auth regex #18284

Open
wants to merge 2 commits into
base: master
from

Conversation

@vstinner
Copy link
Member

vstinner commented Jan 30, 2020

The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking.

Vulnerability reported by Matt Schwager.

https://bugs.python.org/issue39503

The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking.

Vulnerability reported by Matt Schwager.
@vstinner

This comment has been minimized.

Copy link
Member Author

vstinner commented Jan 30, 2020

@mschwager

This comment has been minimized.

Copy link

mschwager commented Jan 30, 2020

This fix looks good to me!

@@ -937,7 +937,7 @@ class AbstractBasicAuthHandler:

# allow for double- and single-quoted realm values
# (single quotes are a violation of the RFC, but appear in the wild)
rx = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+'
rx = re.compile('(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+'

This comment has been minimized.

Copy link
@serhiy-storchaka

serhiy-storchaka Jan 30, 2020

Member

(?:.*,)* is equivalent to (?:.*,)?.

But since this regular expresion is only used with search(). (?:.*,)*[ \t]* can be removed at all.

I'll analyze whether it is correct or there is an error in the regular expression.

This comment has been minimized.

Copy link
@serhiy-storchaka

serhiy-storchaka Jan 30, 2020

Member

Well, I am cannot say that I completely understand the code, but to give it some sense we can either

  1. Replace rx.search() with rx.match() and replace (?:.*,)* with (?:.*,)?.

or

  1. Keep rx.search() and replace (?:.*,)* with (?:^|,).

Do not keep (?:[^,]*,)*. It is a waster of resources.

This comment has been minimized.

Copy link
@serhiy-storchaka

serhiy-storchaka Jan 30, 2020

Member

Humm, options 1 and 2 are not equivalent if the field value contains more than one challenge. Option 2 is closer to the current behavior. But correct support of more than one challenge need rewriting the code.

https://tools.ietf.org/html/rfc7235#section-4.1

@serhiy-storchaka

This comment has been minimized.

Copy link
Member

serhiy-storchaka commented Jan 30, 2020

Would be nice to add a test.

@mschwager

This comment has been minimized.

Copy link

mschwager commented Jan 30, 2020

Just a heads up, CVE-2020-8492 has been created. I'm not sure how Python CVEs are generally tracked, but it may be useful to include the information on the bug tracker issue 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.