Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for simple SIMD features detection in autoconf (PoC) #125011

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Oct 5, 2024

In #124951, there has been some initial discussion on improving the performances of base64 and possibly {bytes,str}.translate using SIMD instructions.

More generally, if we want to use specific SIMD instructions, it'd be good if we at least know whether we can use them or not. This PR is a PoC (hence the draft and the skip news/issue for now just to make the CI green for those checks).

Note that the detection is essentially based on what was done in the blake2 module (though the flags being detected are different in this case). This PoC shows that we can easily add flags and compiler flags if we need them.

Note

The detection of a wider large family of CPUs (here we just assume Intel for simplicity and because we don't want to be overcomplicated for now) is still an ongoing work.

cc @gpshead

@picnixz picnixz added skip issue skip news build The build process and cross-build labels Oct 5, 2024
Python/cpuinfo.c Outdated Show resolved Hide resolved
Python/cpuinfo.c Outdated Show resolved Hide resolved
Include/internal/pycore_cpuinfo.h Outdated Show resolved Hide resolved
Include/internal/pycore_cpuinfo.h Outdated Show resolved Hide resolved
#include <stdbool.h>

typedef struct {
bool sse, sse2, sse3, sse41, sse42, avx, avx2, avx512vbmi;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's suggested that you define sse and and separately, which will be more intuitive.

bool sse, ...;
bool avx, ...;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and I'll add some comments.


#include <stdbool.h>

#if defined(__x86_64__) && defined(__GNUC__)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the unique header file of the GNU-collection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the only one containing __cpuid_count.

edx7 = info7[3];
#else
// use (void) expressions to avoid warnings
(void) eax1; (void) ebx1; (void) ecx1; (void) edx1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended to add an unused macro.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I don't want to add an unnecessary additional macr.

Comment on lines +69 to +99
flags->sse = (edx1 & EDX1_SSE) != 0;
#else
flags->sse = false;
#endif
#ifdef CAN_COMPILE_SIMD_SSE2_INSTRUCTIONS
flags->sse2 = (edx1 & EDX1_SSE2) != 0;
#else
flags->sse2 = false;
#endif
#ifdef CAN_COMPILE_SIMD_SSE3_INSTRUCTIONS
flags->sse3 = (ecx1 & ECX1_SSE3) != 0;
#else
#endif
flags->sse3 = false;
#ifdef CAN_COMPILE_SIMD_SSE4_1_INSTRUCTIONS
flags->sse41 = (ecx1 & ECX1_SSE4_1) != 0;
#else
flags->sse41 = false;
#endif
#ifdef CAN_COMPILE_SIMD_SSE4_2_INSTRUCTIONS
flags->sse42 = (ecx1 & ECX1_SSE4_2) != 0;
#else
flags->sse42 = false;
#endif
#ifdef CAN_COMPILE_SIMD_AVX_INSTRUCTIONS
flags->avx = (ecx1 & ECX1_AVX) != 0;
#else
flags->avx = false;
#endif
#ifdef CAN_COMPILE_SIMD_AVX2_INSTRUCTIONS
flags->avx2 = (ebx7 & EBX7_AVX2) != 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Bool is no different from int. In fact, this redundant can be removed !=.

Copy link
Contributor Author

@picnixz picnixz Oct 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be safely converted.

#include "Python.h"
#include "pycore_cpuinfo.h"

#include <stdbool.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that stdbool has been defined in Python.h Maybe this can be removed it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer not to include the entire Python if I don't need to. And I prefer explicit includes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build The build process and cross-build skip issue skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants