Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.
Sign upbpo-34155: Dont parse domains containing @ #13079
Conversation
This comment has been minimized.
This comment has been minimized.
the-knights-who-say-ni
commented
May 3, 2019
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. You can check yourself to see if the CLA has been received. Thanks again for your contribution, we look forward to reviewing it! |
41c6fe8
to
1b6269b
Thanks for the patch! I have some different ideas on how to do error recovery in this case, please see my comment on bpo-34155. |
@@ -1559,6 +1559,8 @@ def get_domain(value): | |||
token, value = get_dot_atom(value) | |||
except errors.HeaderParseError: | |||
token, value = get_atom(value) | |||
if value and value[0] == '@': | |||
raise errors.HeaderParseError('Multiple domains') |
This comment has been minimized.
This comment has been minimized.
@@ -0,0 +1 @@ | |||
Don't parse email domain containing an at, ie. a@malicious.org@important.com |
This comment has been minimized.
This comment has been minimized.
maxking
May 31, 2019
Contributor
I would suggest a slightly different wording for this:
Fix parsing of invalid email addresses with more than one ``@`` (e.g. a@b@c.com.) to not return the part before 2nd ``@`` as valid email address. Patch by jpic.
This comment has been minimized.
This comment has been minimized.
Pushed a couple of fixes in new commits in my branch, left a question for you. Thank you very much for your review @maxking ! You're max kind ;) |
This comment has been minimized.
This comment has been minimized.
As I mentioned in this comment on bpo I think that |
This comment has been minimized.
This comment has been minimized.
I totally agree on the fact that we should return tuple of two empty strings on "failed to parse", whether on not we decide to be opinionated about what we fail to parse. In this case security issue should return the same value. |
This comment has been minimized.
This comment has been minimized.
Understood, it was decided to deal with that later if it had to be dealt with at all anyway. I have fixed my own tests, but other tests fail since i changed the implementation and i'm currently digging into it... I lost track and started over from master which means that some commit history is lost. I could add maxking credit too in the news item and bpo commit message. |
84ca1e1
to
85ec1a8
This comment has been minimized.
This comment has been minimized.
Just one last question please, currently with this patch we get:
But when there is an invalid domain such as
Do you also want this patch to make Or is it out of the scope of this patch ? Or is Thanks a heap for your support ! Deeply appreciated ;) |
This comment has been minimized.
This comment has been minimized.
That seems to be a change of behaviour, even though it is going to align with what we say in docs. Perhaps for the the short-term, we should make the docs align a bit more with the reality. In the long term, we can decide if we want to return empty string where the email address is technically invalid. As it stands today, "failed to parse" means there is no way to read the value at all, so that includes weird encodings, random chars etc. I am not sure if we do RFC level validation for what is and isn't a valid address (maybe we do, haven't read those parts of the code too closely, but that is my current impression) so it may or may not be tricky to weed out all invalid addresses. That discussion can perhaps start on a new issue and follow the usual change-of-behavior process and should be fixed in a different patch. This being a security fix will go back to 3.6 as well, so I guess we shouldn't do that here.
It is an invalid address AFAIK, I can't quote the right RFC page to corroborate that right now, will need some more time for that. |
This comment has been minimized.
This comment has been minimized.
Thanks for your explanation. Please let me know if there's anything else you want me to do. If you want me to open the other issue I will start studying the corresponding contribution docs and try my best to match Python's standard expectations. Thanks for baring with me. |
This comment has been minimized.
This comment has been minimized.
@jpic I think it would be good to create a new issue to discuss if we want to change behaviour of |
This comment has been minimized.
This comment has been minimized.
There goes the new issue, turned out that domains ending with a dot are valid ! Good to know |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Do you also want PR against 3.6, 3.7, 3.8 since this has security implications ? |
This comment has been minimized.
This comment has been minimized.
I am not sure about the failing tests, maybe we can rebase this against master and hope that it is fixed there? :) I am currently travelling, so won't be able to help out with debugging much for this week. And no, you don't need to manually create backports, I added the comment for whoever merges this to add right "needs backport to xxx" labels, which would then trigger the bot to create backport PRs. |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Jul 17, 2019
GH-14824 is a backport of this pull request to the 3.8 branch. |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Jul 17, 2019
GH-14825 is a backport of this pull request to the 3.7 branch. |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Jul 17, 2019
GH-14826 is a backport of this pull request to the 3.6 branch. |
This comment has been minimized.
This comment has been minimized.
@warsaw, do you still want the backports of this PR to be merged? They are still awaiting review and merging. |
This comment has been minimized.
This comment has been minimized.
@ned-deily It would be good to merge the backports for 3.6, I can merge the ones for 3.7 and 3.8. |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Aug 15, 2019
Thanks @jpic for the PR |
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Aug 15, 2019
I'm having trouble backporting to |
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Aug 15, 2019
Thanks @jpic for the PR |
This comment has been minimized.
This comment has been minimized.
miss-islington
commented
Aug 15, 2019
Sorry, @jpic, I could not cleanly backport this to |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
https://bugs.python.org/issue34155 (cherry picked from commit 8cb65d1) Co-authored-by: jpic <jpic@users.noreply.github.com>
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155
This comment has been minimized.
This comment has been minimized.
bedevere-bot
commented
Sep 11, 2019
GH-16006 is a backport of this pull request to the 2.7 branch. |
Before: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='a', domain='malicious.org'),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@malicious.org') After: >>> email.message_from_string('From: a@malicious.org@important.com', policy=email.policy.default)['from'].addresses (Address(display_name='', username='', domain=''),) >>> parseaddr('a@malicious.org@important.com') ('', 'a@') https://bugs.python.org/issue34155
This change skips parsing of email addresses where domains include a "@" character, which can be maliciously used since the local part is returned as a complete address. (cherry picked from commit 8cb65d1) Excludes changes to Lib/email/_header_value_parser.py, which did not exist in 2.7. Co-authored-by: jpic <jpic@users.noreply.github.com> https://bugs.python.org/issue34155
jpic commentedMay 3, 2019
•
edited by miss-islington
https://bugs.python.org/issue34155
Automerge-Triggered-By: @warsaw