Fallback code paths with "SIMD Everywhere" #1091
Comments
They have been merged. However, a full merger is neither possible nor desirable. You do not want to have just one code path. You want distinct code for each kernel to account for the differences. It is not the case that one algorithm works best on all platforms all of the time. For example, SIMD instruction sets under x86_64 typically lack unsigned variants of operations such as integer comparison, while ARM NEON provides them.
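To make the point about per-architecture kernels concrete, here is a minimal sketch, not taken from simdjson. The function name `count_above` and the task (counting bytes greater than a limit, compared as unsigned) are hypothetical and chosen only for illustration. SSE2 has only a signed byte comparison (`_mm_cmpgt_epi8`), so the x86 path flips the top bit of both operands first; the AArch64 NEON path can call `vcgtq_u8` directly; any other target gets a scalar fallback.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
// x86 kernel (illustrative): SSE2 compares bytes as signed values only.
// XOR-ing both sides with 0x80 maps unsigned order onto signed order.
static std::size_t count_above(const std::uint8_t* p, std::size_t n, std::uint8_t limit) {
  const __m128i bias = _mm_set1_epi8(static_cast<char>(-128));  // 0x80 in every lane
  const __m128i lim = _mm_xor_si128(_mm_set1_epi8(static_cast<char>(limit)), bias);
  std::size_t count = 0, i = 0;
  for (; i + 16 <= n; i += 16) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p + i));
    __m128i gt = _mm_cmpgt_epi8(_mm_xor_si128(v, bias), lim);
    unsigned mask = static_cast<unsigned>(_mm_movemask_epi8(gt));
    for (; mask != 0; mask &= mask - 1) ++count;  // count the set bits
  }
  for (; i < n; ++i) count += (p[i] > limit) ? 1 : 0;  // scalar tail
  return count;
}
#elif defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
// NEON kernel (illustrative): an unsigned comparison exists natively.
static std::size_t count_above(const std::uint8_t* p, std::size_t n, std::uint8_t limit) {
  const uint8x16_t lim = vdupq_n_u8(limit);
  std::size_t count = 0, i = 0;
  for (; i + 16 <= n; i += 16) {
    uint8x16_t gt = vcgtq_u8(vld1q_u8(p + i), lim);  // 0xFF where p[i] > limit
    count += vaddvq_u8(vshrq_n_u8(gt, 7));           // 0xFF -> 1, then sum lanes
  }
  for (; i < n; ++i) count += (p[i] > limit) ? 1 : 0;  // scalar tail
  return count;
}
#else
// Portable fallback: plain scalar loop.
static std::size_t count_above(const std::uint8_t* p, std::size_t n, std::uint8_t limit) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) count += (p[i] > limit) ? 1 : 0;
  return count;
}
#endif
```

All three versions compute the same result, but the x86 one needs two extra XORs per vector that the NEON one does not, which is the kind of difference that makes a single shared code path unattractive.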
That might be interesting if we can improve the performance. Pull requests invited! I'll tweak the title of this issue.
FWIW, if there is anything missing from SIMDe (unlikely) I would be happy to add it. If nothing else I'd be interested in profiling data to look for optimization opportunities in SIMDe :)
I fully agree here; SIMDe isn't going to be as fast as a port written by someone who knows NEON. At best it could be useful to merge some stuff which happens to work well on both architectures, but AFAIK simdjson already has a good NEON port, so if it's not getting in the way I don't think you would gain much from merging code paths.
Odds are pretty good for that, which would be very nice for WASM and POWER (unless simdjson already has WASM SIMD128 and/or AltiVec/VSX implementations?). I know MMseqs2 did this and got a nice performance boost. Plus you could drop the separate portable implementation if you want, and deleting code is always fun. Most people are focused on the x86 support, but we do also have pretty extensive support for using NEON on other architectures. That might be a better route to go than using x86 functions, since other architectures tend to be more NEON-like than SSE-like (including supporting unsigned arithmetic). It could also make working on the NEON code much easier, since you can do it on your x86 machine without emulators, cross-compilers, remote debugging, etc.
I have retitled the issue. It is quite valid, I think. Note that the arch-specific part is quite tiny as it is, so it would not be massive work to add SIMDe.
https://github.com/simd-everywhere/simde has made their first release ( https://simd-everywhere.github.io/blog/announcements/release/2020/06/21/0.5.0-release.html ) and it seems like something potentially useful for merging code paths. The most likely candidate I imagine is using SIMD code to replace the non-SIMD fallback path, but maybe it'd be worth checking whether it can be used to merge the ARM and x86_64 paths or to support other architectures.