-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-96472: Save a few bits in dicts #96473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
gh-96472: Save a few bits in dicts #96473
Conversation
Objects/dictobject.c
Outdated
ix = indices[i]; | ||
if (log2size <= 8) { | ||
const uint8_t *indices = (const uint8_t*)(keys->dk_indices); | ||
const uint8_t uix = indices[i] - DKIX_LOWEST_RESERVED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change DKIX_LOWEST_RESERVED
to DKIX_RESERVED_VALUES
(you choose the name)
such that it is a positive value. Adding a positive value makes the code easier to follow than subtracting a negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had similar considerations, but was one the fence about which way is better.
So I have no strong feelings either way. I'm happy to change it to what you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if we want to move this change into main, we would also want to edit quite a few explanatory comments.
I haven't done that so far, because I wanted to get feedback on the general idea first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I'm +0 for now. This change will make debuggers complex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update Tools/gdb/libpython.py
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
At the moment, dicts vary what int datatype they use for their indices depending on their size.
As a minor complication, indices are stored as signed integers, and thus they shift to the next bigger int size at
1<<7
,1<<15
and1<<31
.However, we can double all those boundaries, and thus save one, two or four bytes per index for sizes that fall between the old and revised boundaries.
See the issue #96472 for more background and benchmarks.
P.S. This PR does some bit-fiddling to preserve the original bit-for-bit representation of
dk_indices
. Alas, for some reason, I can't seem to make that work. I think there might be some other parts of the code that is relying on the exact bit-representation indk_indices
. My initial proposal has the benefit of not changing what any of these bits mean. Especially not the bit pattern for the special entries.If we can fix that approach, I think that just adding and subtracting would probably be easier and perhaps a tiny bit faster (because of no branching) than this PR here.