Skip to content

ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by … #21671

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 20, 2021
Merged

ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by … #21671

merged 8 commits into from
Nov 20, 2021

Conversation

Iglesys347
Copy link
Contributor

…dividing by 2 parameters n_samples, n_features, rank

Reference Issues/PRs

References #21598

What does this implement/fix? Explain your changes.

Increase the execution speed of scikit-learn/examples/decomposition/plot_pca_vs_fa_model_selection.py by reducing some parameters.

Time taken before modification (time taken measured with the time unix command):

real 1m49,943s
user 4m6,328s
sys 2m52,386s

Time taken after modification:

real 0m14,672s
user 0m23,755s
sys 0m19,608s

Graphs before modification:

base_fig1
base_fig2

Graphs after modification:

new_fig1
new_fig2

Any other comments?

…dividing by 2 parameters n_samples, n_features, rank
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the speed improvement. The conclusions in the text still hold with this dataset.

In the text, I believe that the sentence "Under appropriate circumstances the low rank models are more likely than shrinkage models." is wrong. I should rather be written: "Under appropriate circumstances (choice of the number of components), the held-out data is more likely for low rank models than for shrinkage models.".

This is because the likelihood is a property of the data for a given model rather than a property of a model for given data. Maybe @agramfort or someone else can double check the suggested phrasing above.

@adrinjalali adrinjalali mentioned this pull request Nov 15, 2021
41 tasks
@agramfort
Copy link
Member

@ogrisel I agree with the phrasing you suggest

@Iglesys347
Copy link
Contributor Author

@ogrisel @agramfort Should I change this myself in this PR ?

@agramfort
Copy link
Member

agramfort commented Nov 17, 2021 via email

@Iglesys347
Copy link
Contributor Author

@ogrisel @agramfort I applied your suggestions and the pipeline seems good.

@agramfort agramfort merged commit 248f6cf into scikit-learn:main Nov 20, 2021
@agramfort
Copy link
Member

Thx @Iglesys347

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 22, 2021
scikit-learn#21671)

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021
scikit-learn#21671)

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
scikit-learn#21671)

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021
scikit-learn#21671)

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment
glemaitre pushed a commit that referenced this pull request Dec 25, 2021
#21671)

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment

* ENH Impoving execution speed of plot_pca_vs_fa_model_selection.py by dividing by 2 parameters n_samples, n_features, rank

* Update docstring according to @ogrisel 's comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants