This PR enables the /OPT:REF linker optimization on Windows, which lets the linker discard unreferenced code. It is analogous to --gc-sections for ld on Linux, if you're familiar with that: when the linker can determine that a function is never called, it leaves that function out of the final binary.
Note that I had to explicitly keep /OPT:ICF disabled, because enabling Identical COMDAT Folding (ICF) causes test failures. Python310.dll depends on some identical functions not being folded together (such as wrap_binary_func and wrap_binary_func_l), since their addresses are compared to determine equality in some important places. There are even helpful tests that verify COMDAT folding is disabled, which helped me catch this 😊.
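To illustrate why folding breaks address-based comparisons, here is a minimal sketch in C. The names `demo_wrap`, `demo_wrap_l`, and `is_left_variant` are hypothetical stand-ins, not the actual CPython functions; the only assumption is that two functions have identical bodies and that some caller tells them apart by address.

```c
/* Hypothetical sketch: two wrappers with identical bodies, told apart
 * only by their addresses. Not the real CPython code. */
typedef int (*binary_func)(int, int);

/* Both functions compile to the same machine code. */
int demo_wrap(int a, int b)   { return a + b; }
int demo_wrap_l(int a, int b) { return a + b; }

/* Decides which variant was passed by comparing function addresses.
 * The C standard guarantees distinct functions compare unequal, but a
 * linker that folds identical COMDATs can give both the same address,
 * in which case this would return 1 for demo_wrap as well. */
int is_left_variant(binary_func f)
{
    return f == demo_wrap_l;
}
```

Without folding, `is_left_variant(demo_wrap)` is 0 and `is_left_variant(demo_wrap_l)` is 1. If the two bodies were merged into a single address, both calls would return 1, which is the kind of failure the existing tests caught.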
This could be backported to previous versions if desired. It should be simple and safe to apply to whichever versions are deployed at scale in the wild, but I'll leave that decision to the community and maintainers; I didn't want to go through the backport process myself.
Note: I am not sure how to build for PGO locally, but I think it's good to have this on for Configuration == PGInstrument and/or PGUpdate, so I used a Condition of "!= Debug". This should be verified in the real build pipeline that does PGO, or let me know how I can do it locally.
In terms of size savings, I measured a few example binaries and the total across all binaries in the build output folder. All measurements were done on amd64 Release binaries. The total is over 10% smaller, and some individual binaries shrink considerably more than that.
https://bugs.python.org/issue42825