#high-performance-computing

Here are 516 public repositories matching this topic...

hipSYCL
illuhad
illuhad commented Sep 6, 2021

Bug summary
There is evidence that sub_group::get_group_id() does not return the same value as threadIdx.x / warpSize (assuming a 1D kernel), which is the value we would expect on CUDA. We should check the implementation of this function. It performs bit manipulation magic, so presumably the optimization went too far...
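
To make the expected value concrete, here is a minimal host-side sketch of the reference computation for a 1D kernel. Everything in it is an assumption made for illustration: the helper names (expected_sub_group_id, shifted_sub_group_id, ids_agree) are hypothetical and are not hipSYCL functions, and the shift-based variant is only one example of a bit trick, not necessarily the one the implementation uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference helper (not a hipSYCL API): the sub-group index a
// 1D CUDA kernel would compute as threadIdx.x / warpSize.
constexpr std::size_t expected_sub_group_id(std::size_t local_linear_id,
                                            std::size_t sub_group_size) {
  return local_linear_id / sub_group_size;
}

// Hypothetical shift-based variant, shown only as an example of a bit trick.
// It matches the division above only when the sub-group size is exactly
// 2^log2_sub_group_size.
constexpr std::size_t shifted_sub_group_id(std::size_t local_linear_id,
                                           unsigned log2_sub_group_size) {
  return local_linear_id >> log2_sub_group_size;
}

// Checks every local id in a 1D work-group and reports whether the two
// computations agree for the given sub-group size.
inline bool ids_agree(std::size_t work_group_size, std::size_t sub_group_size,
                      unsigned log2_sub_group_size) {
  for (std::size_t id = 0; id < work_group_size; ++id) {
    if (expected_sub_group_id(id, sub_group_size) !=
        shifted_sub_group_id(id, log2_sub_group_size)) {
      return false;
    }
  }
  return true;
}

// Compile-time sanity check: thread 70 with a warp size of 32 is in warp 2.
static_assert(expected_sub_group_id(70, 32) == 2, "70 / 32 should be 2");
```

A comparison of this kind inside a kernel, with the device-side value of sub_group::get_group_id() checked against the reference division for each work-item, would show which local ids disagree.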

To Reproduce
Compare sub_group{}.get_group_id() or `sub
