Ben Ashbaugh

@bashbaug@mastodon.gamedev.place
Interested in GPUs, parallel programming, music, coffee, and hiking. Opinions are my own.
Taming the energy hogs – Professor Pekka Jääskeläinen develops sustainable computing solutions | Tampere universities

Professor of Computing Sciences Pekka Jääskeläinen develops technologies that enhance computing performance while reducing the energy demands of computing systems. Another key focus of his research...

Tampere universities

The SYCL Working Group has announced the release of Revision 11 of the SYCL 2020 Specification, introducing eight powerful new extensions alongside numerous specification clarifications.

Learn more: https://www.khronos.org/blog/khronos-releases-sycl-2020-rev-11-specification-with-eight-new-extensions

If you're at Supercomputing 2025, be sure to see us at the SYCL Birds of a Feather Session:

November 18, 12:15pm-1:15pm
Location: Room 274
#SYCL

The second adds device timing histograms. This is useful to understand how a workload executes at a high level, across all kernels. It is especially useful to identify outliers if the execution time for some kernels varies based on kernel inputs or other factors.
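As a rough illustration of why a timing histogram helps spot outliers, here is a minimal Python sketch. It is not the OpenCL Intercept Layer's implementation or output format: the function names, the power-of-two bin layout, and the sample durations are all assumptions made up for this example.

```python
from collections import Counter


def timing_histogram(durations_ns):
    """Group durations into power-of-two bins, keyed by each bin's lower bound in ns."""
    bins = Counter()
    for d in durations_ns:
        if d < 1:
            raise ValueError("durations must be at least 1 ns")
        # Largest power of two that is <= d, e.g. 1100 -> 1024.
        lower = 1 << (int(d).bit_length() - 1)
        bins[lower] += 1
    return dict(sorted(bins.items()))


def format_histogram(hist):
    """Render one text row per bin: the bin range, a bar, and the count."""
    rows = []
    for lower, count in hist.items():
        rows.append(f"[{lower:>8} ns, {2 * lower:>8} ns): {'#' * count} {count}")
    return "\n".join(rows)


# Hypothetical device times for repeated launches of one kernel (made-up data).
sample_ns = [1100, 1200, 1300, 1250, 9000, 1150]
hist = timing_histogram(sample_ns)
print(format_histogram(hist))
```

With this made-up data, five launches land in the 1024 ns bin and one lands in the 8192 ns bin, so the slow launch stands out even though an average would hide it.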
The first adds conditional profiling. This is useful to restrict profiling to the specific regions of an application that you care about, while minimizing overhead and ignoring profiling data for unimportant regions. The conditional profiling is controlled by an environment variable so it can easily be used by many programming languages.
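The general idea of gating profiling on an environment variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable name `DEMO_PROFILING_ENABLED` and the `ConditionalProfiler` class are hypothetical, and the sketch does not show the Intercept Layer's actual control name or behavior.

```python
import os
import time

# Hypothetical variable name for this sketch, not a real Intercept Layer control.
ENV_VAR = "DEMO_PROFILING_ENABLED"


class ConditionalProfiler:
    """Records timings only while the environment variable is set to "1"."""

    def __init__(self):
        self.records = []

    def enabled(self):
        # Re-read the variable on every call so it can be toggled at runtime.
        return os.environ.get(ENV_VAR, "0") == "1"

    def timed(self, name, fn, *args):
        if not self.enabled():
            return fn(*args)  # Unprofiled region: no timing, nothing recorded.
        start = time.perf_counter_ns()
        result = fn(*args)
        self.records.append((name, time.perf_counter_ns() - start))
        return result


profiler = ConditionalProfiler()

os.environ[ENV_VAR] = "0"
profiler.timed("warmup", sum, range(1000))    # ignored

os.environ[ENV_VAR] = "1"
profiler.timed("hot_loop", sum, range(1000))  # recorded

os.environ[ENV_VAR] = "0"
profiler.timed("cleanup", sum, range(1000))   # ignored

print([name for name, _ in profiler.records])
```

Because nearly every language can set an environment variable, a switch like this needs no language-specific bindings, which matches the motivation given in the post.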

I need to get better about announcing improvements to the OpenCL Intercept Layer, so: I merged two new features this week that I think are pretty neat. As a reminder, the OpenCL Intercept Layer is an open source tool for debugging and profiling OpenCL applications. It works with most OpenCL implementations and requires no application modifications. #OpenCL

https://github.com/intel/opencl-intercept-layer

GitHub - intel/opencl-intercept-layer: Intercept Layer for Debugging and Analyzing OpenCL Applications
The IWOCL 2026 call for submissions is open: https://www.iwocl.org/call-for-submissions/ If you have OpenCL- or SYCL-related work that might interest others, please consider submitting a talk, paper, or poster!
IWOCL 2026 Call for Submissions

IWOCL has been attracting an international audience of leading academic and industrial experts since 2013 and is the premier forum for the OpenCL and SYCL community.

IWOCL
Hey #HPC people, in a week I'm going to #SC25 in St. Louis, USA! 🖖🤠🇺🇸
Meet me at the #Intel booth (2227). Looking forward to talking to you!
Here's a 𝑠𝑚𝑎𝑙𝑙 teaser for the bigger #GPU​s I'll be showing off 🖖😋🟦
SC25 website 👉 https://sc25.supercomputing.org/

AMD Contributes BFloat16 Support To LLVM's SPIR-V Target

AMD software engineers continue making interesting contributions to the LLVM compiler stack around SPIR-V as the IR used by Vulkan and other Khronos APIs...
https://www.phoronix.com/news/AMD-BF16-For-LLVM-SPIR-V

I got an Arc Pro B50 for testing, and it's really smoll. Same chip as the B580, but cut down from 2560 to 2048 cores, and the memory bus reduced from 192 to 128-bit. But VRAM is increased from 12 to 16GB. And with only 70W TDP it's super efficient and doesn't need external power. 🖖😋

Also tested 4x Arc Pro B60 24GB in multi-GPU. 🖖🤪

- 4x B60 in FluidX3D: https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#multi-gpu-benchmarks
- B50 in FluidX3D: https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#single-gpucpu-benchmarks
- B60 OpenCL: https://opencl.gpuinfo.org/displayreport.php?id=5863
- B50 OpenCL: https://opencl.gpuinfo.org/displayreport.php?id=5829