Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37000: Remove obsolete comment in _randbelow_with_getrandbits #95775

Merged

Conversation

matthiasgoergens
Copy link
Contributor

@matthiasgoergens matthiasgoergens commented Aug 8, 2022

In random._randbelow_with_getrandbits we are generating a random integer
in [0,n). We do that by repeatedly drawing a random number, r, in [0,2**k)
until we hit an r < n.

That's a great strategy. It's not only very simple, but even with worst
case input, we only need to draw 2 random numbers on average, before we
hit the jackpot.

However, the code is a bit overly cautious. When n is a power of 2, we
ask getrandbits to sample from [0,2*n]. The code justifies that
strange behaviour with a comment expressing fear of the special case n==1.

That fear is unfounded: the obvious code works just fine for n==1.

Fixing this not only makes the code simpler to understand, it also
guarantees that the important special case of n being a power of two now
has a guaranteed worst case runtime of only a single call to
getrandbits; instead of an expected runtime of two calls with no worst
case bound.

Existing tests already cover this code and the special cases of
interest. So we don't need any new tests.

This minor performance bug was introduced in
0515661 as far as I can tell. But I
can't really tell what the original author was thinking.

Since this is a fairly trivial change, I did not create an issue. (But I am happy to create one, if you think it's better this way.)

I discovered this problem while working on https://discuss.python.org/t/random-choice-from-a-dict/17834

Automerge-Triggered-By: GH:rhettinger

In random._randbelow_with_getrandbits we are generating a random integer
in [0,n).  We do that by repeatedly drawing a random number, r, in [0,2**k)
until we hit an r < n.

That's a great strategy.  It's not only very simple, but even with worst
case input, we only need to draw 2 random numbers on average, before we
hit the jackpot.

However, the code is a bit overly cautious.  When n is a power of 2, we
ask `getrandbits` to sample from [0,2*n].  The code justifies that
strange behaviour with a comment expressing fear of the special case n==1.

That fear is unfounded: the obvious code works just fine for n==1.

Fixing this not only makes the code simpler to understand, it also
guarantees that the important special case of n being a power of two now
has a guaranteed worst case runtime of only a single call to
getrandbits; instead of an expected runtime of two calls with no worst
case bound.

Existing tests already cover this code and the special cases of
interest.  So we don't need any new tests.

This minor performance bug was introduced in
0515661 as far as I can tell.  But I
can't really tell what the original author was thinking.
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Aug 8, 2022

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@cpython-cla-bot
Copy link

@cpython-cla-bot cpython-cla-bot bot commented Aug 8, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@tim-one
Copy link
Member

@tim-one tim-one commented Aug 8, 2022

Before Python 3.9, getrandbits() did not accept 0 as an argument. That's why using n-1 was a problem when n was 1.

But there's a different issue now: as the failing tests show, while getrandbits(0) works fine now, using n-1 can change results too, when n is a power of 2. We're generally loathe to change what random methods return across releases, without truly compelling reasons (reproducing results after explicitly setting a seed is a major use case).

@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Aug 8, 2022

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@matthiasgoergens
Copy link
Contributor Author

@matthiasgoergens matthiasgoergens commented Aug 8, 2022

Thanks for the quick review, @tim-one! You make a very good point.

If you are interested, I have restored the original behaviour and included your reasoning to explain this seemingly unusual behaviour to the next person reading this code.

If you think that might be useful, I can polish the description etc to make this merge-able?

@mdickinson
Copy link
Member

@mdickinson mdickinson commented Aug 8, 2022

See also previous discussion at #81181.

@rhettinger rhettinger changed the title Fix corner case in _randbelow_with_getrandbits Edit comment in _randbelow_with_getrandbits Aug 8, 2022
Lib/random.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Aug 8, 2022

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@matthiasgoergens matthiasgoergens changed the title Edit comment in _randbelow_with_getrandbits Remove misleading comment in _randbelow_with_getrandbits Aug 8, 2022
For efficiency, we'd like to use k = (n-1).bit_length() here, but
before Python 3.9, `getrandbits()` did not accept 0 as an argument.
That's why using `n-1` was a problem when `n` was 1.

Now we are stuck with this version, because we want to reproduce
results after explicitly setting a seed even between releases.
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Aug 8, 2022

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@matthiasgoergens matthiasgoergens changed the title Remove misleading comment in _randbelow_with_getrandbits Remove obsolete comment in _randbelow_with_getrandbits Aug 8, 2022
@matthiasgoergens matthiasgoergens changed the title Remove obsolete comment in _randbelow_with_getrandbits bpo-37000: Remove obsolete comment in _randbelow_with_getrandbits Aug 8, 2022
Comment changes don't warrant a comment.
@bedevere-bot
Copy link

@bedevere-bot bedevere-bot commented Aug 8, 2022

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@rhettinger rhettinger added the 🤖 automerge PR will be merged once it's been approved and all CI passed label Aug 8, 2022
@miss-islington
Copy link
Contributor

@miss-islington miss-islington commented Aug 8, 2022

@matthiasgoergens: Status check is done, and it's a failure .

@rhettinger rhettinger merged commit 8a55e2f into python:main Aug 8, 2022
11 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖 automerge PR will be merged once it's been approved and all CI passed skip issue skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants