Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX adjust inliner criterion in RANSACRegressor #19499

Merged

Conversation

@gregorystrubel
Copy link
Contributor

@gregorystrubel gregorystrubel commented Feb 19, 2021

Reference Issues/PRs

Fixes #19497.
Fixes #16147.

What does this implement/fix? Explain your changes.

This fix implement solution discussed in #16147:
inlier_mask_subset = residuals_subset <= residual_threshold
that replace the current <

This make the code work also with perfect horizontal lines (see #19497).

Basically I suggest to change the definition of what is an inlier respectivly to residual_threshold. Exactly on the threshold, I suggest to say it is an inlier and not an outliner.

This make a couple of "normal cases" work: horizontal line #19497 (currently we say all the point on a line are outlier!) and other examples described in #16147

Unfortunately This patch make a test fail, but we can keep it with the following modification. The test construct an example where there is no inlier. It is based on the fact when residual_threshold =0 an exception is always thrown. To do this in a clean way we should construct here a real example where there is no inlier. Such an example is not easy to find.
I suggest to construct a more simple test as before : set the residual_threshold to an other corner value that make always the comparison fail : a nan

I also implement the test described in #19497 (horizontal lines)

Any other comments?

Gregory Strubel added 2 commits Feb 19, 2021
add new test for horizontal line
adapt no inliner test with a synonym
@ogrisel
Copy link
Member

@ogrisel ogrisel commented Feb 23, 2021

Thanks for the fix, the detailed explanation and the new test.

Can you please check if the definition of an inlier needs to be updated in the documentation? If not it might be worth making the doc a bit more specific (either in the user guide or the docstring or both).

Can you also please document your fix in the changelog (doc/what_new/v1.0.rst)?

Gregory Strubel added 2 commits Feb 27, 2021
@gregorystrubel
Copy link
Contributor Author

@gregorystrubel gregorystrubel commented Feb 27, 2021

Thanks a lot for your review. I have updated the documentation.

@gregorystrubel gregorystrubel requested a review from ogrisel Feb 27, 2021
@cmarmo cmarmo added this to the 1.0 milestone Mar 2, 2021
Gregory Strubel added 2 commits Mar 27, 2021
Copy link
Member

@jjerphan jjerphan left a comment

Thanks for this PR @gregorystrubel! Here are a few suggestions.

doc/modules/linear_model.rst Outdated Show resolved Hide resolved
doc/whats_new/v1.0.rst Outdated Show resolved Hide resolved
sklearn/linear_model/_ransac.py Outdated Show resolved Hide resolved
sklearn/linear_model/tests/test_ransac.py Show resolved Hide resolved
gregorystrubel and others added 6 commits Jun 10, 2021
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…strubel/scikit-learn into bugfix_ransac_residual_threshold
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@gregorystrubel
Copy link
Contributor Author

@gregorystrubel gregorystrubel commented Jun 10, 2021

Thanks @jjerphan, additionally to your suggestions, I precised "non-regression" in the comment of the non-regression test
(commit #ddacfe2)

@gregorystrubel gregorystrubel requested a review from jjerphan Jun 10, 2021
Copy link
Member

@jjerphan jjerphan left a comment

LGTM, thank you @gregorystrubel!

@glemaitre glemaitre changed the title Bugfix ransac residual threshold + horizontal line FIX adjust inliner criterion in RANSACRegressor Jul 30, 2021
doc/whats_new/v1.0.rst Outdated Show resolved Hide resolved
doc/whats_new/v1.0.rst Outdated Show resolved Hide resolved
doc/whats_new/v1.0.rst Outdated Show resolved Hide resolved
sklearn/linear_model/tests/test_ransac.py Outdated Show resolved Hide resolved
sklearn/linear_model/tests/test_ransac.py Outdated Show resolved Hide resolved
@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Jul 30, 2021

I merged my small suggestions that were only typo. We will check if the tests are passing and we will be able to merge.

@glemaitre glemaitre merged commit 0b45ac5 into scikit-learn:main Jul 30, 2021
28 checks passed
@gregorystrubel gregorystrubel deleted the bugfix_ransac_residual_threshold branch Aug 13, 2021
samronsin added a commit to samronsin/scikit-learn that referenced this issue Nov 30, 2021
Co-authored-by: Gregory Strubel <greg@Air-de-Ali.local>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

5 participants