Increased speed by adding cv and n_jobs params plot_multi_metric_evaluation.py #21626

ghost · 2021-11-10T18:41:45Z

Increased speed significantly by adding the parameters cv and n_jobs. I set cv=3 and n_jobs=-1. Setting n_jobs=-1 uses all available CPU cores, so the cross-validation fits run in parallel.
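For readers following along, here is a minimal sketch of what passing `cv` and `n_jobs` to a multi-metric `GridSearchCV` looks like. It is not the example's actual code: the dataset size, the estimator settings, and the small parameter grid are assumptions chosen so the snippet runs in a few seconds.

```python
from sklearn.datasets import make_hastie_10_2
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small synthetic dataset so the sketch runs quickly.
X, y = make_hastie_10_2(n_samples=500, random_state=42)

# Two metrics evaluated in the same search.
scoring = {"AUC": "roc_auc", "Accuracy": make_scorer(accuracy_score)}

gs = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    # Assumption: a deliberately tiny grid (6 values) for illustration.
    param_grid={"min_samples_split": range(2, 103, 20)},
    scoring=scoring,
    refit="AUC",   # with several metrics, refit must name the one used to pick the best model
    cv=3,          # 3-fold cross-validation instead of the default
    n_jobs=-1,     # use all available CPU cores for the fits
)
gs.fit(X, y)

print(gs.best_params_)
print(gs.cv_results_["mean_test_AUC"])
```

Only `cv` and `n_jobs` differ from a default search here; `cv` changes how many fits are run and how the scores are estimated, while `n_jobs` only changes how the fits are scheduled.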

adrinjalali

Thanks @sveneschlbeck

adrinjalali · 2021-11-11T09:15:29Z

examples/model_selection/plot_multi_metric_evaluation.py

@@ -49,6 +49,8 @@
    param_grid={"min_samples_split": range(2, 403, 10)},
    scoring=scoring,
    refit="AUC",
+    cv=3,


I don't think we want to reduce CV to 3, especially since people tend to copy/paste code.

Did you check the effect of reducing the number of samples?

ghost · 2021-11-11

@adrinjalali Agreed, we want to keep it generic :)

Removed the cv param and decreased the number of samples from 8000 to 6000. The result is slightly worse but still 2x faster, so a good compromise, I'd say.

examples/model_selection/plot_multi_metric_evaluation.py

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

adrinjalali

LGTM. You may also want to check whether reducing the number of samples makes much of a difference; my guess is that n_jobs=2 is giving most of the speedup here.

ghost · 2021-11-23T18:18:36Z

@adrinjalali You are right, reducing the number of samples below 6000 makes the example plot worse. Reducing from 8000 to 6000, however, worked well. Indeed, the n_jobs param is doing the heavy lifting here.

adrinjalali · 2022-02-04T15:38:41Z

@ogrisel wanna have a second look at this one?

jeremiedbb

Instead of reducing the number of samples, I suggest reducing the number of min_samples_split values. That way the plot will be the same, with just slightly fewer points.
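To make the trade-off concrete, here is a small sketch that counts the cross-validation fits for the grid shown in the diff above and for a coarser one. The step of 20 and the fold count of 5 are assumptions for illustration (5 is the `GridSearchCV` default in recent scikit-learn versions when `cv` is not set); the thread does not state the values that were finally committed.

```python
def n_cv_fits(param_values, n_splits):
    """Number of fits run during cross-validation (the final refit is not counted)."""
    return len(param_values) * n_splits


original = range(2, 403, 10)  # grid from the diff hunk
coarser = range(2, 403, 20)   # hypothetical coarser grid, step chosen for illustration
n_splits = 5                  # assumed number of folds

print(len(original), n_cv_fits(original, n_splits))
print(len(coarser), n_cv_fits(coarser, n_splits))

# Both grids span the same interval, so the plotted curves cover the same x-range.
print(original[0], original[-1], coarser[0], coarser[-1])
```

Under these assumptions the coarser grid roughly halves the number of fits while keeping the same endpoints, which is why the plot keeps its shape and only loses some intermediate points.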

examples/model_selection/plot_multi_metric_evaluation.py

jeremiedbb

Time is now 7 s instead of 30 s. LGTM. Thanks @sveneschlbeck!

…uation.py (scikit-learn#21626) Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>

Increased speed by adding cv and n_jobs params

62b4978

adrinjalali reviewed Nov 11, 2021

Update plot_multi_metric_evaluation.py

1b28aaa

ogrisel reviewed Nov 12, 2021

adrinjalali changed the title ~~Increased speed by adding cv and n_jobs params~~ Increased speed by adding cv and n_jobs params plot_multi_metric_evaluation.py Nov 12, 2021

Update examples/model_selection/plot_multi_metric_evaluation.py

bc6e62a

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

adrinjalali mentioned this pull request Nov 12, 2021

Accelerate slow examples #21598

Closed

41 tasks

adrinjalali approved these changes Nov 23, 2021

cmarmo added Documentation Waiting for Reviewer labels Dec 15, 2021

Merge branch 'main' into speed_increased_example_multimetric

8cb1ed8

jeremiedbb reviewed Feb 23, 2022

jeremiedbb added 2 commits February 23, 2022 13:31

Update examples/model_selection/plot_multi_metric_evaluation.py

4db2255

Update examples/model_selection/plot_multi_metric_evaluation.py

0246915

jeremiedbb approved these changes Feb 23, 2022

jeremiedbb merged commit 5137abf into scikit-learn:main Feb 23, 2022
