New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Consistent cython types from _typedefs #25942
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @jeremiedbb.
I just have few comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for moving forward with the concrete implementation of this RFC. Here are some feedbacks from a first quick scan:
sklearn/utils/tests/test_typedefs.py
Outdated
@pytest.mark.parametrize( | ||
"dtype_t, value, expected_dtype", | ||
[ | ||
("bool_t", 1, np.uint8), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool_t
was probably a bad choice since there's no np.bool
. We probably should have named it uint8_t
. It's only used in hierarchical_fast so I can change it quickly.
@jjerphan, you mentionned that during the review of a previous PR where I changed the uses of int8
in _hierarchical_fast. With this new context, I changed my mind and now think that we should use ìnt8
consistently. What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not remember exactly, do you mean uint8
or do you mean int8
?
I think we tend to use uint8
in the code-base generally for booleans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current proposal for boolean on this line looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean uint8 or do you mean int8?
I meant uint8
sorry. I was wondering if we should define ctypedef unsigned char uint8_t
instead of bool_t
since numpy does not define a bool_t
type and to make the name match between to compile-time type and the runtime dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for a more precise uint8_t
type alias instead of bool_t
which can be misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking again, I have a slight preference for uint8_t
over bool_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once the last comments regarding uint8_t
and bool_t
are treated.
Thank you, @jeremiedbb.
sklearn/utils/tests/test_typedefs.py
Outdated
@pytest.mark.parametrize( | ||
"dtype_t, value, expected_dtype", | ||
[ | ||
("bool_t", 1, np.uint8), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking again, I have a slight preference for uint8_t
over bool_t
.
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the following is addressed, this PR looks good to me!
sklearn/utils/_typedefs.pyx
Outdated
@@ -1,20 +1,19 @@ | |||
#!python | |||
# _typedefs is a declaration only module | |||
# The functions implemented here are for testing purpose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should rename this _typedefs.pyx
file to _typedefs_testing.pyx
or _typedefs_test_helper.pyx
(and update the build setup and test accordingly) to avoid any confusion. In particular I wouldn't want contributors to be confused as to when to se the type_t
fused type (the answer should be "never").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you can't have a pxd only, it would mean we have an empty _typedefs.pyx and an additional _typedefs_testing.pyx. As discussed irl, I just made it more explicit that the definitions in the pyx are for testing purpose only.
Thanks! |
Towards #25572
Follow-up of #25810
This PR replaces the last occurences of old defined types from utils._typedefs.