gh-96538: Move some type-checking out of bisect loops #96539
Conversation
if (res_obj == Py_NotImplemented) {
    Py_DECREF(res_obj);
    compare = NULL;
    res = PyObject_RichCompareBool(item, litem, Py_LT);
Remark: this could technically call __lt__, __lt__, __gt__ instead of the current behavior, calling __lt__, __gt__. This is similar to what unsafe_object_compare does, though.
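To make the remark concrete, here is a small Python sketch of the two dispatch sequences (the classes A and B and the calls list are hypothetical instrumentation, not part of the patch):

calls = []

class A:
    def __lt__(self, other):
        calls.append("A.__lt__")
        return NotImplemented      # decline, so the reflected method is tried

class B:
    def __gt__(self, other):
        calls.append("B.__gt__")
        return True                # the reflected comparison supplies the answer

item, litem = A(), B()

# Full protocol only (what PyObject_RichCompareBool alone does):
calls.clear()
res = item < litem
print(calls)        # ['A.__lt__', 'B.__gt__']

# Direct __lt__ slot call first, falling back to the full protocol on
# NotImplemented, which retries __lt__ before trying the reflected __gt__:
calls.clear()
if type(item).__lt__(item, litem) is NotImplemented:
    res = item < litem
print(calls)        # ['A.__lt__', 'A.__lt__', 'B.__gt__']

The extra __lt__ call only happens on the NotImplemented fallback path, which is the same trade-off the reviewer notes unsafe_object_compare makes.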
With sorting, which is potentially far more comparison-intensive, there was a better case for this kind of optimization. I recognize that there has been a shift in core developer attitudes towards complexity and that many things we would never have considered before are now fair game if some clock cycles can be saved. With the interpreter, there was a high payoff that compensated for fewer people being able to understand and maintain the code. There is less of a case to be made for complexifying code on the periphery. It would be nice if some simple things remained simple. At one time, the whole module was basically just this code:
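(The snippet that followed is not preserved here; below is a reconstruction of the classic pure-Python bisect code along those lines, from memory of the old Lib/bisect.py rather than the exact text of the comment.)

def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, keeping a sorted (assumes a is already sorted)."""
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    a.insert(lo, x)

def bisect_right(a, x, lo=0, hi=None):
    """Return the index where x should be inserted to keep a sorted."""
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo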
Tim, what are your thoughts?
The code seems pretty straightforward to me, and Dennis's timing results do show significant speedups for typical type-homogeneous cases, even for short lists. I think it would help if comments were added to point out which parts are logically necessary and which are just there to speed up typical cases.
Tim, thanks for looking at this. It is not always clear how far we should go.
No argument here! IMO, "too many" changes go in that add too much complexity for too little gain. The changes here are conceptually simple, though, and the "1.31x faster" overall result looks credible, well-motivated, and more robust than the kinds of changes that make seemingly random tweaks to C spellings, which yield seemingly random "1.04x faster" results on some platforms but also minor slowdowns on others.
Thanks for the reviews!
Here's a benchmark program:
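(The original program is not reproduced here; the following is only a rough sketch of the kind of microbenchmark that exercises the fast path, namely lookups in type-homogeneous lists of ints, using the stdlib timeit module rather than whatever harness was actually attached.)

import bisect
import random
import timeit

random.seed(0)

for size in (10, 100, 1_000, 10_000):
    # Sorted, type-homogeneous data so the cached-comparison fast path applies.
    data = sorted(random.randrange(size * 10) for _ in range(size))
    keys = [random.randrange(size * 10) for _ in range(1_000)]
    best = min(timeit.repeat(
        "for k in keys: bisect_right(data, k)",
        globals={"bisect_right": bisect.bisect_right, "data": data, "keys": keys},
        number=100,
        repeat=5,
    ))
    print(f"size={size:>6}: best of 5 = {best:.4f}s for 100 x 1,000 lookups")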
And the results: