Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-29410: Change the default hash algorithm to SipHash13. #28752

Merged
merged 18 commits into from Oct 10, 2021

Conversation

methane
Copy link
Member

@methane methane commented Oct 6, 2021

Lib/test/test_hash.py Outdated Show resolved Hide resolved
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
configure.ac Outdated Show resolved Hide resolved
configure.ac Outdated Show resolved Hide resolved
Copy link
Member

@tiran tiran left a comment

The removal of SipHash-2-4 should go through regular deprecation cycle.
I recommend that you keep Py_HASH_SIPHASH24, add a new constant for Py_HASH_SIPHASH13, and add deprecation for Py_HASH_SIPHASH24.

@bedevere-bot
Copy link

bedevere-bot commented Oct 7, 2021

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@methane
Copy link
Member Author

methane commented Oct 7, 2021

add deprecation for Py_HASH_SIPHASH24.

How to add deprecation to undocumented macro? Comment in header is enough?

@tiran
Copy link
Member

tiran commented Oct 7, 2021

You could add #warning Py_HASH_SIPHASH24 is deprecated to Python/pyhash.c. Or we could just leave SipHash-2-4 in there. The algorithm is also used for _Py_KeyedHash in import system.

Include/pyhash.h Outdated Show resolved Hide resolved
Include/pyhash.h Outdated Show resolved Hide resolved
Lib/test/test_hash.py Outdated Show resolved Hide resolved
configure.ac Outdated Show resolved Hide resolved
configure.ac Outdated Show resolved Hide resolved
configure.ac Outdated Show resolved Hide resolved
@@ -206,6 +206,19 @@ class StringlikeHashRandomizationTests(HashRandomizationTests):
# seed 42, 'abc'
[-678966196, 573763426263223372, -820489388, -4282905804826039665],
],
'siphash13': [
Copy link
Member Author

@methane methane Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@methane methane marked this pull request as ready for review Oct 8, 2021
@methane methane changed the title bpo-29410: Use SipHash13 instead of SipHash24. bpo-29410: Change the default hash algorithm to SipHash13. Oct 8, 2021
@methane
Copy link
Member Author

methane commented Oct 8, 2021

I have made the requested changes; please review again.

@bedevere-bot
Copy link

bedevere-bot commented Oct 8, 2021

Thanks for making the requested changes!

@tiran: please review the changes made to this pull request.

@bedevere-bot bedevere-bot requested a review from tiran Oct 8, 2021
Copy link
Member

@tiran tiran left a comment

Excellent, almost there! I have two minor and one major suggestion for the PR.

Python/pyhash.c Show resolved Hide resolved
Include/pyhash.h Outdated Show resolved Hide resolved
@@ -175,6 +175,9 @@ Other CPython Implementation Changes
support :class:`typing.SupportsComplex` and :class:`typing.SupportsBytes` protocols.
(Contributed by Mark Dickinson and Dong-hee Na in :issue:`24234`.)

* ``siphash13`` is added as a new internal hash algorithm for strings, and the default
Copy link
Member

@tiran tiran Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit pick: Your change affects str, bytes, and a couple of other places like memoryview, regular expressions, and datetime objects.

Copy link
Member Author

@methane methane Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the word "strings" because I don't have a better word to represent it. Any suggestion?

Configure script and https://docs.python.org/3/using/configure.html#security-options uses: "hash algorithm for use in Python/pyhash.c".

Copy link
Member

@tiran tiran Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would mention why we made it the default and that str and bytes are affected. Perhaps something like

siphash13 is added as a new internal hashing algorithms. It's has similar security properties as siphash24 but it is slightly faster for long inputs. str, bytes, and some other types now use it as default algorithm for hash()

@bedevere-bot
Copy link

bedevere-bot commented Oct 8, 2021

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Doc/whatsnew/3.11.rst Outdated Show resolved Hide resolved
Co-authored-by: Christian Heimes <christian@python.org>
@methane methane merged commit ad970e8 into python:main Oct 10, 2021
12 checks passed
@methane methane deleted the siphash13 branch Oct 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants