
[MRG] Enhancement: Add MAPE as an evaluation metric #10711

Closed

wants to merge 55 commits into from

Conversation

@mohamed-ali
Contributor

@mohamed-ali mohamed-ali commented Feb 26, 2018

Reference Issues/PRs

Fixes #10708
Closes #6605

What does this implement/fix? Explain your changes.

  • Implements sklearn.metrics.mean_absolute_percentage_error (sketched below).
  • Implements the associated neg_mape scorer for regression problems.
  • Includes test configurations in tests/test_common.py and tests/test_score_objects.py.
  • Adds specific tests with y_true that does not include zeros in tests/test_common.py and tests/test_score_objects.py.
  • Adds docstring + example.
  • Updates the documentation at doc/modules/model_evaluation.rst and doc/modules/classes.rst.

Any other comments?

  • We have to reach a consensus on the name of the scorer: either neg_mean_absolute_percentage_error_scorer or neg_mape_scorer
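
For reference, a minimal sketch of the proposed metric (simplified: the actual implementation in this PR also validates inputs and rejects multioutput and zero-valued y_true):

    import numpy as np

    def mean_absolute_percentage_error(y_true, y_pred):
        # Simplified sketch: the PR's input validation (multioutput check,
        # zero-valued y_true check) is omitted here.
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

    print(mean_absolute_percentage_error([3, 5], [2.5, 5]))  # 8.33...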
Member

@jnothman jnothman left a comment

I think you should note the sensitivity to small y_true, and perhaps we need to validate that y_true is non-zero.

Thanks for the thoroughness!
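
To illustrate that sensitivity with hypothetical values:

    import numpy as np

    y_true = np.array([100.0, 0.001])
    y_pred = np.array([101.0, 0.1])
    # The near-zero target dominates: |0.001 - 0.1| / 0.001 = 99, i.e. 9900%,
    # even though its absolute error is tiny.
    print(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)  # 4950.5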

@@ -85,6 +85,7 @@ Scoring Function
**Regression**
'explained_variance' :func:`metrics.explained_variance_score`
'neg_mean_absolute_error' :func:`metrics.mean_absolute_error`
'neg_mean_absolute_percentage_error' :func:`metrics.mean_absolute_percentage_error`


@jnothman

jnothman Feb 26, 2018
Member

@amueller suggested neg_mape here. I'm not entirely sure which is better. But you'll need to widen the table if we go with this name!


@mohamed-ali

mohamed-ali Feb 26, 2018
Author Contributor

Yes, I am open to changing the name or keeping it as is. I'll make the necessary changes once we agree on a name ;)


@mohamed-ali

mohamed-ali Feb 27, 2018
Author Contributor

@jnothman @amueller I am going to switch the scorer name from neg_mean_absolute_percentage_error to neg_mape, since MAPE is already a well-known acronym in the community and it is also the only way to avoid pep8 problems.

@@ -104,7 +105,7 @@ Usage examples:
>>> model = svm.SVC()
>>> cross_val_score(model, X, y, scoring='wrong_choice')
Traceback (most recent call last):
ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'brier_score_loss', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score']
ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'brier_score_loss', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_absolute_percentage_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score']


@jnothman

jnothman Feb 26, 2018
Member

You have inspired #10712. Thanks!

    if y_type == 'continuous-multioutput':
        raise ValueError("Multioutput not supported "
                         "in mean_absolute_percentage_error")
    return np.average(np.abs((y_true - y_pred)/y_true))*100


@jnothman

jnothman Feb 26, 2018
Member

space around / and * please


@jnothman

jnothman Feb 26, 2018
Member

Call mean rather than average just to match the name
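
Applying both review suggestions, the return line would become:

    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100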

@sklearn-lgtm

@sklearn-lgtm sklearn-lgtm commented Feb 26, 2018

This pull request introduces 2 alerts when merging 7c516ef into 2e30df3 - view on lgtm.com

new alerts:

  • 1 for Explicit export is not defined
  • 1 for Implicit string concatenation in a list

Comment posted by lgtm.com

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Feb 27, 2018

@jnothman, @amueller, should I raise an error when y_true contains zeros, or do you suggest handling it differently? If so, how?

Also, if you have a convention for the division error message, please suggest it so I can use it; otherwise, I can define a generic one.

@jnothman
Member

@jnothman jnothman commented Feb 27, 2018

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Feb 27, 2018

@amueller, @jnothman, @qinhanmin2014

The test scenarios in:

  • sklearn/metrics/tests/test_score_objects.py
  • sklearn/metrics/tests/test_common.py

generate a y_true sample with zeros, which fails the following check in metrics.mean_absolute_percentage_error:

    if (y_true == 0).any():
        raise ValueError("mean_absolute_percentage_error requires"
                         " y_true to never be zero")

This explains the failures in Travis CI. Could you please suggest what to do?

@jnothman
Member

@jnothman jnothman commented Feb 27, 2018

Pep8 consistency is a very poor excuse for naming inconsistency. I'm unsure what to do with the name, but pep8 is the last of our concerns (stick # noqa at the end of the line and Travis won't fail)

@jnothman
Member

@jnothman jnothman commented Feb 27, 2018

You might need to add a new category in the common tests for regression metrics that require non-zero y, and update y in the tests accordingly.
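
A possible shape for that (the set name and sampling below are hypothetical, not from this PR):

    import numpy as np

    # Hypothetical new category in sklearn/metrics/tests/test_common.py:
    # regression metrics that are undefined when y_true contains zeros.
    METRICS_REQUIRING_NON_ZERO_Y = {"mean_absolute_percentage_error"}

    # For metrics in this category, draw targets bounded away from zero.
    rng = np.random.RandomState(0)
    y_true = rng.uniform(low=0.5, high=2.0, size=20)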

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Feb 27, 2018

@jnothman thanks for the tip about Travis (# noqa); that's helpful.
Regarding the name, I think it's easy to change once we agree on one, so I'll prioritize fixing the tests.

# matrix instead of a number. Testing of
# confusion_matrix with sample_weight is in
# test_classification.py
"confusion_matrix", # Left this one here because the tests in this file do


@jnothman

jnothman Feb 28, 2018
Member

Please do not change unrelated things. It makes your contribution harder to review and may introduce merge conflicts to other pull requests.


@mohamed-ali

mohamed-ali Feb 28, 2018
Author Contributor

Ok, my bad. I used autopep8 on the file.


@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Mar 27, 2018

@lesteve the Travis build finished successfully, so there is no regression with the new changes that you pushed. Thanks!

Member

@lesteve lesteve left a comment

A few comments.

Just curious, maybe for @amueller who opened the associated issue: the Wikipedia page does not seem very kind to MAPE, e.g. "Although the concept of MAPE sounds very simple and convincing, it has major drawbacks in practical application". Is it still used despite its drawbacks?

@@ -85,6 +85,7 @@ Scoring Function
**Regression**
'explained_variance' :func:`metrics.explained_variance_score`
'neg_mean_absolute_error' :func:`metrics.mean_absolute_error`
'neg_mape' :func:`metrics.mean_absolute_percentage_error`


@lesteve

lesteve Mar 28, 2018
Member

Why not spell it out fully here, like all the other metrics? i.e. neg_mean_absolute_percentage_error


@mohamed-ali

mohamed-ali Mar 28, 2018
Author Contributor

@lesteve I clarified in the PR description above that the name has to be chosen/voted on by all of us. Initially I used neg_mean_absolute_percentage_error, but since MAPE is already a well-known acronym, which also makes the name cleaner, I chose to switch to neg_mape. However, we can change back to the long version if most of us think that's the right thing to do.


@lesteve

lesteve Mar 29, 2018
Member

I would be in favour of the neg_mean_absolute_percentage_error version personally. It is more consistent with neg_mean_absolute_error and with the metric name (metrics.mean_absolute_percentage_error). Happy to hear what others think.


@ogrisel

ogrisel Feb 20, 2020
Member

I would also be in favor of using the explicit expanded name by default and introducing neg_mape as an alias, as we do for neg_mse.


@ogrisel

ogrisel Feb 27, 2020
Member

Actually we do not have neg_mse. I thought we had.
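
If an alias were added anyway, a rough sketch might look like this (the registration lines are illustrative, not this PR's code; mean_absolute_percentage_error is the function the PR introduces):

    from sklearn.metrics import make_scorer
    # SCORERS was the public scorer registry (sklearn/metrics/scorer.py)
    # at the time of this PR.
    from sklearn.metrics.scorer import SCORERS

    neg_mape_scorer = make_scorer(mean_absolute_percentage_error,
                                  greater_is_better=False)
    SCORERS['neg_mean_absolute_percentage_error'] = neg_mape_scorer
    SCORERS['neg_mape'] = neg_mape_scorer  # short alias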

@@ -91,6 +91,8 @@ Model evaluation
- Added the :func:`metrics.balanced_accuracy_score` metric and a corresponding
``'balanced_accuracy'`` scorer for binary classification.
:issue:`8066` by :user:`xyguo` and :user:`Aman Dalmia <dalmia>`.
- Added the :func:`metrics.mean_absolute_percentage_error` metric and the associated


@lesteve

lesteve Mar 28, 2018
Member

"Model evaluation" is not the best section for this I would say, in doc/whats_new/v0.19.0.rst there is a "Metrics" section. I think you can do the same here.


@mohamed-ali

mohamed-ali Mar 28, 2018
Author Contributor

Nice catch, I will do that.

@@ -487,6 +489,9 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
mean_absolute_error_scorer = make_scorer(mean_absolute_error,
                                         greater_is_better=False)
mean_absolute_error_scorer._deprecation_msg = deprecation_msg
neg_mape_scorer = make_scorer(mean_absolute_percentage_error,
                              greater_is_better=False)


@lesteve

lesteve Mar 28, 2018
Member

Maybe remove the new line here. When there is no clear rule, my advice would be to follow the same implicit convention as the code you are changing.

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Mar 28, 2018

@amueller can you approve if all the changes that you requested have been addressed? Thanks!

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Mar 29, 2018

@jnothman, @amueller, could you please cast your vote on whether to name the scorer neg_mean_absolute_percentage_error_scorer or neg_mape_scorer? So far @lesteve is for the long version, whereas I am slightly inclined towards the shorter one. What do you think?

@jnothman
Member

@jnothman jnothman commented Mar 29, 2018

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Apr 5, 2018

@lesteve @amueller, after @jnothman's comment I assume there is no need to change the name, so I guess my work is done on this issue. @amueller @lesteve, could you approve the PR?
Thanks.

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented May 16, 2019

@jnothman @amueller Is this PR still being considered for merging? If so, let me know so I can fix the conflicts. Thanks

@jnothman
Member

@jnothman jnothman commented May 21, 2019

I don't see any reason this shouldn't be considered for merge, @mohamed-ali.

I think the docs should emphasise a little more that the metric is not reported as a percentage here.
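
(As a concrete illustration, assuming the ×100 factor was dropped in a later commit on this branch: mean_absolute_percentage_error([100.0], [90.0]) would return 0.1, not 10.)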

@amueller
Member

@amueller amueller commented Aug 16, 2019

would you mind fixing the conflicts please?

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Aug 16, 2019

would you mind fixing the conflicts please?

Yes sure, I'll spend some time in the next two days to fix them. Thanks for letting me know.

@amueller
Member

@amueller amueller commented Aug 16, 2019

thanks and sorry about the delay in reviewing

@amueller
Member

@amueller amueller commented Aug 22, 2019

hm looks like there's a bunch of errors now :-/

@mohamed-ali
Contributor Author

@mohamed-ali mohamed-ali commented Aug 23, 2019

hm looks like there's a bunch of errors now :-/

Yes, many things have changed since this PR was created. I'll spend more time debugging; if that fails, I might recreate the PR from a new branch off master.

@ogrisel
Member

@ogrisel ogrisel commented Mar 4, 2020

Closing in favor of #15007.
