Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT use the default CPU_COUNT=2 for the macOS builds #22919

Merged
merged 3 commits into from Mar 22, 2022

Conversation

ogrisel
Copy link
Member

@ogrisel ogrisel commented Mar 22, 2022

I just noticed that the macOS builds might be slow because we use 3 pytest-xdist workers workers with just 2 CPUs for no apparent reason.

Let's try to see if using the default of 2 pytest-xdist workers improves the build time.

@thomasjpfan
Copy link
Member

@thomasjpfan thomasjpfan commented Mar 22, 2022

According to Azure pipeline docs the Mac runners use 3 core CPUs. Looking at the threadpoolctl info, it looks like the docs could be wrong.

@jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Mar 22, 2022

Both are using mkl, which may not use all cpus (by default it does not use hyperthreads for instance). I think the most reliable would be to print joblib.cpu_count(only_physical_cores=True/False) to have a good overview of what we are given.

@ogrisel
Copy link
Member Author

@ogrisel ogrisel commented Mar 22, 2022

It seems slightly faster with CPU_COUNT=2:

image

Here is a previous build:

image

However there is a lot of variability between macOS builds so the improvement might not be significant though.

@ogrisel
Copy link
Member Author

@ogrisel ogrisel commented Mar 22, 2022

According to Azure pipeline docs the Mac runners use 3 core CPUs. Looking at the threadpoolctl info, it looks like the docs could be wrong.

Maybe it has changed and they did not keep the doc in sync.

@jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Mar 22, 2022

Number of cores: 3
Number of physical cores: 3

Hum actually the number of physical cores is probably incorrect since joblib will return the total number of cores anyway when the number of usable cores is limited. Better to just print cpu_count

@jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Mar 22, 2022

Lot of variability indeed.
ci

@ogrisel
Copy link
Member Author

@ogrisel ogrisel commented Mar 22, 2022

So indeed there are 3 cores according to joblib. But then the overall runtime does not seem to improve by using 3 pytest-xdist workers. Let's merge this PR as it is?

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Let's try, maybe will see a tendency with many runs

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Okay, lets give it a try.

@thomasjpfan thomasjpfan merged commit 9d7f5b9 into scikit-learn:main Mar 22, 2022
27 checks passed
@ogrisel ogrisel deleted the azure-macos-cpu-count branch Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants