Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] SimpleImputer skipped feature warning with feature names #21617

Conversation

chritter
Copy link
Contributor

@chritter chritter commented Nov 10, 2021

Reference Issues/PRs

Fixes #21616.

What does this implement/fix? Explain your changes.

If feature names are provided SimpleImputer will warn about skipped features with actual feature names. Previously only indices were provided. I introduced an additional test to ensure warning is raised correctly.

Any other comments?

#DataUmbrella Sprint
@reshamas

@chritter chritter changed the title Simple imputer transform deprecated improved message [WIP] Simple imputer transform deprecated improved message Nov 10, 2021
@chritter chritter changed the title [WIP] Simple imputer transform deprecated improved message [WIP] SimpleImputer skipped feature warning with feature names Nov 10, 2021
@chritter
Copy link
Contributor Author

@chritter chritter commented Nov 10, 2021

Please clarify if a changelog entry is required. Thanks!

@chritter chritter changed the title [WIP] SimpleImputer skipped feature warning with feature names [MRG] SimpleImputer skipped feature warning with feature names Nov 10, 2021
Copy link
Member

@ogrisel ogrisel left a comment

I think this enhancement deserves a changelog entry.

sklearn/impute/_base.py Outdated Show resolved Hide resolved
@jjerphan jjerphan self-requested a review Nov 21, 2021
Copy link
Member

@ogrisel ogrisel left a comment

This PR is a net usability improvement so +1 for merge as it is.

However by reviewing the test, I now realize that I would have expected to get such a warning (or maybe even a ValueError?) at fit time rather than just transform time.

What do you think @chritter and others? If you agree please feel free to create a new issue / follow-up PR once this is merged.

Copy link
Member

@jjerphan jjerphan left a comment

LGTM. Thank you, @chritter.

@ogrisel: I also agree that having a warning or a error raised at fit time would have been better. Yet, might this introduce unwanted and potentially harmful behavior breaking change?

I would be prudent/conservative by keeping the behavior identical but improving the warning explicitness as proposed by this PR.

sklearn/impute/_base.py Show resolved Hide resolved
doc/whats_new/v1.1.rst Outdated Show resolved Hide resolved
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@ogrisel ogrisel merged commit 08af527 into scikit-learn:main Dec 1, 2021
10 of 12 checks passed
@ogrisel
Copy link
Member

@ogrisel ogrisel commented Dec 1, 2021

Merged! Thanks @chritter. Feel free to open a follow-up PR to raise the warning at fit time instead of transform time.

thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this issue Dec 1, 2021
…t-learn#21617)


Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants