
FIX improve error message when computing NDCG with a single document #25672

Merged


@lene lene commented Feb 23, 2023

Reference Issues/PRs

Fixes #21335
closes #24482

What does this implement/fix? Explain your changes.

Check that the input array to ndcg_score() has a length greater than 1, and raise a ValueError with a meaningful error message if it does not.
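A minimal sketch of the kind of check described here, in plain Python. The helper name `check_ndcg_input` and the exact wording of the message are assumptions made for illustration; this is not the code merged into sklearn/metrics/_ranking.py.

```python
def check_ndcg_input(y_true):
    """Hypothetical helper: reject samples that contain a single document.

    y_true is assumed to be a list of rows, one row of relevance
    labels per query (sample), one label per document.
    """
    for row in y_true:
        if len(row) <= 1:
            # Illustrative message; the merged wording may differ.
            raise ValueError(
                "Computing NDCG is only meaningful when there is more than "
                f"1 document. Got {len(row)} instead."
            )


# A single-document input is rejected with a readable message.
try:
    check_ndcg_input([[1]])
except ValueError as exc:
    message = str(exc)

# Two or more documents per query pass the check (returns None).
result = check_ndcg_input([[1, 0, 0]])
```

The point of the check is to fail early with an explanation, rather than letting a less informative error surface from deeper in the computation.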

Any other comments?

With the gracious help of @wcchu.


lene commented Feb 23, 2023

@adrinjalali @glemaitre If you'd give some feedback, it would be much appreciated!

@glemaitre glemaitre changed the title from "If input to ndcg_score() is degenerate (has length == 1), raise a meaningful error" to "FIX improve error message when computing NDCG with a single document" Feb 24, 2023

@glemaitre glemaitre left a comment


We will also need an entry in the changelog (file doc/whats_new/v1.3.rst).

It should be in the metrics section and we can consider this as a Fix.


@adrinjalali adrinjalali left a comment


LGTM. The changes around L1689-1691 are not related to this PR, but I personally don't mind either way.

Please add a changelog entry in this file https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/v1.3.rst

@jeremiedbb jeremiedbb mentioned this pull request Mar 2, 2023
@glemaitre glemaitre self-requested a review March 14, 2023 14:25
@glemaitre

@glemaitre What does "more than 1 document" mean? Or should it be "more than 1 dimension", and this is a typo?

NDCG is a metric for evaluating a ranking, such as the results of a search request. A document is just one of the items that you want to grade.
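To illustrate why a single document is a degenerate case, here is a small pure-Python sketch of NDCG with linear gains and a log2 discount. The function names `dcg` and `ndcg` are illustrative, ties are not handled, and this is an assumption-laden simplification rather than scikit-learn's implementation.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(y_true, y_score):
    """NDCG for one query: DCG of the predicted order over the ideal DCG."""
    # Rank documents by predicted score, highest first.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    ranked = [y_true[i] for i in order]
    ideal_dcg = dcg(sorted(y_true, reverse=True))
    if ideal_dcg == 0:
        # Convention assumed here: no relevant document gives a score of 0.
        return 0.0
    return dcg(ranked) / ideal_dcg


# With one document there is only one possible ordering, so the
# predicted score has no influence on the result.
single_low = ndcg([1], [0.1])
single_high = ndcg([1], [0.9])

# With two documents the ordering matters.
good_order = ndcg([1, 0], [0.9, 0.1])
bad_order = ndcg([1, 0], [0.1, 0.9])
```

In this sketch `single_low` and `single_high` are identical, whereas `bad_order` is strictly smaller than `good_order`, which is why a score computed on one document carries no information about ranking quality.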


@glemaitre glemaitre left a comment


We are missing an entry in the changelog to acknowledge the bug fix.
Could you add an entry in the file doc/whats_new/v1.3.rst?
It should be in the metrics section and be added as a Fix.

Then you can acknowledge whoever participated in solving this bug.

@glemaitre glemaitre merged commit 68f2023 into scikit-learn:main Mar 24, 2023
7 of 11 checks passed
@glemaitre

Thanks all. Merging. LGTM


Development

Successfully merging this pull request may close these issues.

NDCG score doesn't work with binary relevance and a list of 1 element
5 participants