-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-91247: improve performance of list and tuple repeat (with specialization for n=1) #91482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-91247: improve performance of list and tuple repeat (with specialization for n=1) #91482
Conversation
05e2b70
to
5445cfb
Compare
@sweeneyde Could you have a look at this ticket? The approach is similar to #31856 and #31999 which you reviewed. |
1491705
to
66cc8ae
Compare
@eendebakpt Sorry this slipped under my radar @vstinner, would you mind reviewing? I think the change is good in principle but I want another reviewer and to make sure the header file changes are appropriate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Misc/NEWS.d/next/Core and Builtins/2022-03-22-13-12-27.bpo-47091.tJcy-P.rst
Outdated
Show resolved
Hide resolved
…91.tJcy-P.rst Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor nits, but this is looking good. Thanks for your patience and persistence on this.
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
I verified a benchmark on a Windows machine since this might be a bit compiler-specific.
Looks great! |
🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 516f091 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
*dest++ = *src++; | ||
} | ||
|
||
_Py_memory_repeat((char *)np->ob_item, sizeof(PyObject *)*output_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: this cannot overflow because list_new_prealloc
uses PyMem_New, and so this multiplication already succeeded at some point.
Compared to current main branch there are 3 improvements
This PR keeps the specializations for repeats of lists and tuples with length 1. An alternative PR without specialization for n=1 (#32045) was rejected because the specializations might yield a speedup.
For performance comparisons to main and the version without specializations and more detailed discussion see issue #91247.
The inefficient pointer chasing can be seen in the following performance benchmark:
(results obtained with
runner.timeit(name=f"list({n}) repeat {r}", stmt=f"x=a*{r}", setup=f'a=list(range({n}))')
)