-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for simple SIMD features detection in autoconf (PoC) #125011
base: main
Are you sure you want to change the base?
Conversation
#include <stdbool.h> | ||
|
||
typedef struct { | ||
bool sse, sse2, sse3, sse41, sse42, avx, avx2, avx512vbmi; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's suggested that you define sse
and and
separately, which will be more intuitive.
bool sse, ...;
bool avx, ...;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and I'll add some comments.
|
||
#include <stdbool.h> | ||
|
||
#if defined(__x86_64__) && defined(__GNUC__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the unique header file of the GNU-collection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the only one containing __cpuid_count.
edx7 = info7[3]; | ||
#else | ||
// use (void) expressions to avoid warnings | ||
(void) eax1; (void) ebx1; (void) ecx1; (void) edx1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended to add an unused
macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I don't want to add an unnecessary additional macr.
flags->sse = (edx1 & EDX1_SSE) != 0; | ||
#else | ||
flags->sse = false; | ||
#endif | ||
#ifdef CAN_COMPILE_SIMD_SSE2_INSTRUCTIONS | ||
flags->sse2 = (edx1 & EDX1_SSE2) != 0; | ||
#else | ||
flags->sse2 = false; | ||
#endif | ||
#ifdef CAN_COMPILE_SIMD_SSE3_INSTRUCTIONS | ||
flags->sse3 = (ecx1 & ECX1_SSE3) != 0; | ||
#else | ||
#endif | ||
flags->sse3 = false; | ||
#ifdef CAN_COMPILE_SIMD_SSE4_1_INSTRUCTIONS | ||
flags->sse41 = (ecx1 & ECX1_SSE4_1) != 0; | ||
#else | ||
flags->sse41 = false; | ||
#endif | ||
#ifdef CAN_COMPILE_SIMD_SSE4_2_INSTRUCTIONS | ||
flags->sse42 = (ecx1 & ECX1_SSE4_2) != 0; | ||
#else | ||
flags->sse42 = false; | ||
#endif | ||
#ifdef CAN_COMPILE_SIMD_AVX_INSTRUCTIONS | ||
flags->avx = (ecx1 & ECX1_AVX) != 0; | ||
#else | ||
flags->avx = false; | ||
#endif | ||
#ifdef CAN_COMPILE_SIMD_AVX2_INSTRUCTIONS | ||
flags->avx2 = (ebx7 & EBX7_AVX2) != 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_Bool
is no different from int
. In fact, this redundant can be removed !=
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be safely converted.
#include "Python.h" | ||
#include "pycore_cpuinfo.h" | ||
|
||
#include <stdbool.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that stdbool
has been defined in Python.h
Maybe this can be removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not to include the entire Python if I don't need to. And I prefer explicit includes.
In #124951, there has been some initial discussion on improving the performances of base64 and possibly
{bytes,str}.translate
using SIMD instructions.More generally, if we want to use specific SIMD instructions, it'd be good if we at least know whether we can use them or not. This PR is a PoC (hence the draft and the skip news/issue for now just to make the CI green for those checks).
Note that the detection is essentially based on what was done in the blake2 module (though the flags being detected are different in this case). This PoC shows that we can easily add flags and compiler flags if we need them.
Note
The detection of a wider large family of CPUs (here we just assume Intel for simplicity and because we don't want to be overcomplicated for now) is still an ongoing work.
cc @gpshead