
[MRG+1] Partial AUC #3840

Merged: 3 commits into scikit-learn:master from partial-AUC on Mar 1, 2018

Conversation

@Alexander-N (Contributor) commented Nov 7, 2014

This needs better tests; any ideas how to write them?

Issue #3273
Partial AUC: integrate the ROC curve up to a chosen maximal FPR and standardize the result to lie between 0.5 and 1.
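For reference, the computation described here can be sketched as follows. This is a minimal sketch built on roc_curve and auc from scikit-learn; the helper name is illustrative, and the standardization constants follow the McClish correction discussed later in this thread, not necessarily the merged code verbatim.

import numpy as np
from sklearn.metrics import roc_curve, auc

def partial_roc_auc(y_true, y_score, max_fpr):
    # Illustrative helper: standardized partial AUC over [0, max_fpr].
    fpr, tpr, _ = roc_curve(y_true, y_score)
    if max_fpr is None or max_fpr == 1:
        return auc(fpr, tpr)
    # Keep points up to max_fpr and add one interpolated point at max_fpr.
    stop = np.searchsorted(fpr, max_fpr, 'right')
    x = np.append(fpr[:stop], max_fpr)
    y = np.append(tpr[:stop], np.interp(max_fpr, fpr, tpr))
    partial_auc = auc(x, y)
    # McClish correction: map chance level to 0.5 and a perfect scorer to 1.
    min_area = 0.5 * max_fpr ** 2
    max_area = max_fpr
    return 0.5 * (1 + (partial_auc - min_area) / (max_area - min_area))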

@jnothman jnothman changed the title [WIP] adress Issue #3273 [WIP] Partial AUC Nov 8, 2014
@raghavrv raghavrv mentioned this pull request Feb 3, 2015
@lesshaste commented Sep 1, 2017

This seems like a useful addition. What happened to it?

@albertcthomas (Contributor) commented Sep 21, 2017

This could be useful for anomaly/outlier/novelty detection as well, since there we are especially interested in the performance of the scoring function (e.g. decision_function) in the regions corresponding to small FPR values.

@albertcthomas albertcthomas mentioned this pull request Sep 21, 2017
@Alexander-N (Contributor, Author) commented Sep 21, 2017

I stopped working on this since I wasn't sure how to test it properly. If anybody could give me some hints, I would pick it up again.

@@ -138,6 +138,12 @@ def test_roc_curve():
    assert_equal(fpr.shape, tpr.shape)
    assert_equal(fpr.shape, thresholds.shape)

    # test partial ROC
@jnothman (Member) Sep 24, 2017

This does not belong in test_roc_curve.

@jnothman (Member) Sep 24, 2017

Make a new function.

    y_true = np.array([0, 0, 1, 1])
    y_scores = np.array([0.1, 0, 0.1, 0.01])
    assert_equal(roc_auc_score(y_true, y_scores, max_fpr=0.3), 0.5)
    assert_equal(roc_auc_score(y_true, y_true, max_fpr=0.7), 1)
@jnothman (Member) Sep 24, 2017
This varies in both data and max_fpr from the previous line. A better test would consider minimal variations to the parameter and the data, and show that the changes affect the metric in the expected way. I'd demonstrate, for instance, that max_fpr=1 equates to leaving it out, then reduce it.

@serv-inc commented Jan 31, 2018

Works fine with 0.18.2. File is ranking.py.txt

How can this be brought to a finish?

@Alexander-N: would you prefer to edit some testing code into this, or are you no longer interested? @jnothman probably meant that you should add a new function to test_ranking.py with a few different test cases, for instance (see the sketch after this list):

  • the result for max_fpr=1 is the same as not setting the parameter
  • the results for max_fpr=0.9 as compared to max_fpr=1, etc.
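A sketch of such a test function, with a hypothetical name; the assertions follow the two bullets above, and the data is the small example already used in this PR:

import numpy as np
from sklearn.metrics import roc_auc_score

def test_partial_roc_auc():  # hypothetical name for test_ranking.py
    y_true = np.array([0, 0, 1, 1])
    y_score = np.array([0.1, 0.0, 0.1, 0.01])
    # max_fpr=1 should give the same result as not setting the parameter.
    assert roc_auc_score(y_true, y_score, max_fpr=1) == \
        roc_auc_score(y_true, y_score)
    # For this data, shrinking the integration range changes the value.
    assert roc_auc_score(y_true, y_score, max_fpr=0.9) != \
        roc_auc_score(y_true, y_score, max_fpr=1)
    # A perfect scorer stays at 1 for any admissible max_fpr.
    assert roc_auc_score(y_true, y_true, max_fpr=0.5) == 1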

@Alexander-N (Contributor, Author) commented Jan 31, 2018

Sorry for being unresponsive. I will look into it this weekend.

@Alexander-N Alexander-N changed the title [WIP] Partial AUC [MRG] Partial AUC Feb 4, 2018
@Alexander-N (Contributor, Author) commented Feb 22, 2018

I addressed the comments; this is ready for further review. Just wanted to make that clear to attract reviewers :-).

@jnothman (Member) left a comment

Can you add this as a metric for common testing in metrics/tests/test_common.py? This way you can check its basic properties and invariances.
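A sketch of what that registration might look like; THRESHOLDED_METRICS is one of the metric dictionaries in test_common.py, and the key and fixed max_fpr value here are illustrative:

from functools import partial
from sklearn.metrics import roc_auc_score

# In sklearn/metrics/tests/test_common.py, register a curried variant so the
# common property and invariance checks run with a fixed max_fpr:
THRESHOLDED_METRICS = {
    # ... existing entries ...
    "partial_roc_auc": partial(roc_auc_score, max_fpr=0.5),
}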

@@ -257,6 +258,9 @@ def roc_auc_score(y_true, y_score, average="macro", sample_weight=None):
sample_weight : array-like of shape = [n_samples], optional
Sample weights.
max_fpr : float, optional If not ``None``, the standardized partial AUC
over the range [0, max_fpr] is returned.
@jnothman (Member) Feb 22, 2018

Include a citation of the reference

@albertcthomas (Contributor) Feb 23, 2018

please insert a newline

max_fpr : float, optional
    If not ``None``, the standardized partial AUC over the range [0, max_fpr] is returned.

Also, maybe add here that it should be in [0, 1] and raise a ValueError in the code if it's not?

if max_fpr is None or max_fpr == 1:
    return auc(fpr, tpr)

idx = np.where(fpr <= max_fpr)[0]
@jnothman (Member) Feb 22, 2018
fpr is increasing, so it would be clearer if we just did something like:

stop = np.searchsorted(fpr, max_fpr, 'right')

and below used tpr[:stop] instead of tpr[idx] and avoided having both names idx_in and idx_out
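To illustrate the suggestion (plain numpy; the values are made up):

import numpy as np

fpr = np.array([0.0, 0.1, 0.4, 1.0])   # roc_curve returns fpr in increasing order
stop = np.searchsorted(fpr, 0.4, 'right')
fpr[:stop]  # array([0. , 0.1, 0.4]); 'right' keeps a point at exactly max_fpr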

    return auc(fpr, tpr)

idx = np.where(fpr <= max_fpr)[0]
# Linearly interpolate the ROC curve until max_fpr
@jnothman (Member) Feb 22, 2018
Perhaps this should be clearer and say that we're just adding a single point at max_fpr by interpolation.
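In other words, the comment could spell out something like the following sketch, with fpr, tpr, and stop as in the snippets above:

# Add one point at max_fpr, linearly interpolated between its neighbours,
# so the integration range ends exactly at max_fpr:
fpr_part = np.append(fpr[:stop], max_fpr)
tpr_part = np.append(tpr[:stop], np.interp(max_fpr, fpr, tpr))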

@albertcthomas (Contributor) left a comment

A partial review. For the tests, I was told very recently by @jnothman that we should use assert instead of assert_equal to check equality.


-    return auc(fpr, tpr)
+    if max_fpr is None or max_fpr == 1:
+        return auc(fpr, tpr)

@albertcthomas (Contributor) Feb 23, 2018

WDYT about adding

if max_fpr == 0:
    return 0

as the AUC is equal to 0 when max_fpr = 0.

@Alexander-N (Contributor, Author) Feb 25, 2018

Since setting max_fpr to zero makes no sense, I think it might be best to raise a ValueError.

for max_fpr in np.linspace(0, 1, 5):
    assert_almost_equal(
        roc_auc_score(y_true, y_pred, max_fpr=max_fpr),
        _partial_roc_auc_score(y_true, y_pred, max_fpr))
@albertcthomas (Contributor) Feb 23, 2018

Also test the behavior when max_fpr is not in [0, 1]. If we raise a ValueError, then test that the ValueError is raised with an informative message.
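Assuming the ValueError path is taken (as agreed later in this thread), such a test might look like the following sketch, with y_true and y_pred as in the surrounding test code and the match pattern illustrative:

import pytest

# max_fpr outside (0, 1] should be rejected with an informative message.
for bad_max_fpr in (-0.5, 0, 2):
    with pytest.raises(ValueError, match="max_fpr"):
        roc_auc_score(y_true, y_pred, max_fpr=bad_max_fpr)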

y_true = np.array([0, 0, 1, 1])
assert_equal(roc_auc_score(y_true, y_true, max_fpr=1), 1)
assert_equal(roc_auc_score(y_true, y_true, max_fpr=0.001), 1)
assert_equal(np.isnan(roc_auc_score(y_true, y_true, max_fpr=0)), True)
@albertcthomas (Contributor) Feb 23, 2018

Just use

assert np.isnan(roc_auc_score(y_true, y_true, max_fpr=0))

@albertcthomas (Contributor) Feb 23, 2018

But I would be in favor of returning 0 when max_fpr=0 (see my comment above). WDYT?

fpr = np.append(fpr[idx], max_fpr)
partial_auc = auc(fpr, tpr)

# McClish correction: standardize result to lie between 0.5 and 1
@albertcthomas (Contributor) Feb 23, 2018

It seems that the computation of min_area assumes that the estimator is always better than random guessing. In practice, for a very bad estimator (doing worse than random guessing), partial_auc could be less than min_area, so the result will not necessarily be in [0.5, 1]. It will always be lower than 1, but it can be less than 0.5, which would mean we are doing worse than a random-guess estimator. I think we should keep this standardization, but we should explain somewhere that the result is not necessarily in [0.5, 1].
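Concretely, the standardization under discussion is the following sketch, where min_area is the partial AUC of a chance-level scorer and max_area that of a perfect one:

min_area = 0.5 * max_fpr ** 2   # area under the chance diagonal up to max_fpr
max_area = max_fpr              # area under tpr = 1 up to max_fpr
standardized = 0.5 * (1 + (partial_auc - min_area) / (max_area - min_area))
# If partial_auc < min_area (worse than random guessing), standardized < 0.5.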

@Alexander-N (Contributor, Author) commented Feb 25, 2018

Great comments, thanks! Except for returning 0 when max_fpr is 0, everything is done as suggested.

@albertcthomas (Contributor) left a comment

I would have returned 0 when max_fpr is 0; even if it does not make sense in practice, that's what the output should be in theory. But LGTM. Thanks @Alexander-N!

@jnothman (Member) left a comment

Nice work. LGTM

@jnothman (Member) commented Feb 25, 2018

Please add an entry to the change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors, if applicable) with :user:.
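An entry along these lines would fit; the wording is illustrative, while the :issue: and :user: roles are the ones the change log already uses:

- Added the ``max_fpr`` parameter to :func:`metrics.roc_auc_score` to compute
  the standardized partial AUC over the range [0, max_fpr].
  :issue:`3840` by :user:`Alexander-N`.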

@jnothman (Member) commented Feb 25, 2018

max_fpr=0 might have theoretical value, but it is a nonsense setting for an evaluation metric. An error is good.

@jnothman jnothman changed the title [MRG] Partial AUC [MRG+1] Partial AUC Feb 25, 2018
@albertcthomas (Contributor) commented Feb 25, 2018

Fair enough.

@Alexander-N force-pushed the partial-AUC branch 2 times, most recently from 1d16f37 to 089f58d on Feb 26, 2018
@Alexander-N (Contributor, Author) commented Feb 26, 2018

Ok, done.

@jnothman (Member) commented Feb 26, 2018

Please add commits rather than amending, to make it easier to review the changes.

@Alexander-N (Contributor, Author) commented Feb 26, 2018

Ah, ok, sorry. I amended to correct the original commit message. The only change is the addition of the entry in doc/whats_new/v0.20.rst.

@jnothman (Member) left a comment

I was going to merge this, but I think it deserves a mention in doc/modules/model_evaluation.rst when discussing the AUC score. Try to help users understand why this may be a better/worse metric for their task.

partial AUC: Integrate the ROC curve up to a chosen maximal FPR and
standardize the result to be 0.5 if non-discriminant and 1 if maximal.

Closes scikit-learn#3273
@Alexander-N (Contributor, Author) commented Feb 28, 2018

I could add something along the lines of

In applications where a high false positive rate is not tolerable, the parameter ``max_fpr`` can be used to summarize the ROC curve up to the given limit.

I feel that this might not be very helpful. Do you have something specific in mind?

@jnothman (Member) commented Feb 28, 2018

@Alexander-N (Contributor, Author) commented Mar 1, 2018

Ok, done.

@jnothman (Member) commented on c6a38d9 Mar 1, 2018

I had been suggesting you put "one critique" first as a way to improve cohesion. Now it reads a bit strangely. (And I didn't really like my own wording here.)

Perhaps you should just drop my words and leave your own.

@Alexander-N (Contributor, Author) commented on c6a38d9 Mar 1, 2018

Ok, done :-)

@jnothman (Member) commented on c6a38d9 Mar 1, 2018

@jnothman (Member) left a comment

Thanks

@jnothman jnothman merged commit 02ddc70 into scikit-learn:master Mar 1, 2018
4 of 6 checks passed