Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-42825: Enable /OPT:REF #24098

Merged
merged 1 commit into from Feb 19, 2021
Merged

Conversation

Austin-Lamb
Copy link
Contributor

@Austin-Lamb Austin-Lamb commented Jan 4, 2021

This PR enables the /OPT:REF linker optimization on Windows, allowing the linker to discard uncalled code. This is analogous to --gc-sections for ld on linux, if you're familiar with that. Basically the linker can determine that a function is never called and not include it in the final binary.

Note that I had to explicitly keep /OPT:ICF disabled because enabling Identical COMDAT Folding (ICF) causes test failures. This is because Python310.dll depends on some identical things not folding together (such as wrap_binary_func and wrap_binary_func_l) as their addresses are compared to determine equality in some important places. There are even helpful tests that verify COMDAT folding is disabled which helped me catch this 😊.

This could be backported to previous versions if desired, it should be simple and safe to do to whichever versions are deployed at scale in the wild, but I'll leave that to the community and maintainers to decide, I didn't want to go through the backport process.

Note: I am not sure how to build for PGO locally, but I think it's good to have this on for Configuration == PGInstrument and/or PGUpdate, so I used a Condition of "!= Debug". This should be verified in the real build pipeline that does PGO, or let me know how I can do it locally.

In terms of the size savings, here's a few example binaries and then the total across all of them in the build output folder - all measurements were done for amd64 Release binaries. It's over a 10% savings in the size of everything, and some binaries are considerably more than that.

Binary Size before Size after Delta (%)
py.exe 1,024,000 735,744 -288,256 (-28.2%)
python.exe 93,696 91,136 -2,560 (-2.7%)
python_uwp.exe 225,792 140,288 -85,504 (-37.9%)
python310.dll 5,268,992 4,753,408 -515,584 (-9.8%)
sqlite3.dll 1,505,280 1,376,256 -129,024 (-8.6%)
Total of .exe/.dll/.pyd 21,632,408 19,278,744 -2,353,664 (-10.9%)

https://bugs.python.org/issue42825

…causes test failures as Python310.dll depends on some identical things not folding together (such as wrap_binary_func and wrap_binary_func_l).
@Austin-Lamb
Copy link
Contributor Author

Austin-Lamb commented Jan 19, 2021

@zooba - anything I should do to help this get reviewed?

Thanks!

@github-actions
Copy link

github-actions bot commented Feb 19, 2021

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Feb 19, 2021
@zooba zooba merged commit b4af629 into python:master Feb 19, 2021
3 checks passed
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
We explicitly disable /OPT:ICF as some manual optimisations depend on some functions still having distinct pointers (such as wrap_binary_func and wrap_binary_func_l).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip news stale Stale PR or inactive for long period of time.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants