Johnny's Software Lab

We help development teams speed up their C/C++ software.
Performance-related blog: http://johnnysswlab.com
Direct help: http://johnnysswlab.com/consulting

Vectorization doesn't always speed things up because of SIMD math.

Sometimes it speeds things up because it forces you to overlap independent dependency chains.

New post:
https://johnnysswlab.com/exposing-more-parallelism-is-the-hidden-reason-why-some-vectorized-loops-are-faster-not-vectorization-per-se/

Exposing More Parallelism Is the Hidden Reason Why Some Vectorized Loops Are Faster - Not Vectorization per se - Johnny's Software Lab

I was preparing an article about Highway, a portable vectorization library by Google, so I ported a few examples from my vectorization workshop from AVX to Highway. One of the examples was vectorized binary search. I assume most readers are familiar with simple binary search. It looks something like this: We take a lookup…
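To make the "overlapping independent dependency chains" idea concrete, here is a minimal scalar sketch. It is not the code from the article; the function names and the choice of four interleaved lookups are assumptions made for illustration. The first function is a branchless lower-bound search for one key; the second runs four such searches in lockstep, so the four loads in each step do not depend on one another.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Branchless lower bound: index of the first element >= key, or data.size().
// Every step depends on the load from the previous step (one long chain).
std::size_t lower_bound_index(const std::vector<int>& data, int key) {
    if (data.empty()) return 0;
    std::size_t base = 0;
    std::size_t len = data.size();
    while (len > 1) {
        std::size_t half = len / 2;
        base += (data[base + half - 1] < key) ? half : 0;
        len -= half;
    }
    return base + ((data[base] < key) ? 1 : 0);
}

// Hypothetical interleaved variant: four independent searches advance
// together. All lanes share the same `len` sequence, so the loop is uniform,
// and the four loads per step belong to four separate dependency chains.
std::array<std::size_t, 4> lower_bound_x4(const std::vector<int>& data,
                                          const std::array<int, 4>& keys) {
    std::array<std::size_t, 4> base{};  // all lanes start at index 0
    if (data.empty()) return base;
    std::size_t len = data.size();
    while (len > 1) {
        std::size_t half = len / 2;
        for (std::size_t lane = 0; lane < 4; ++lane) {
            base[lane] += (data[base[lane] + half - 1] < keys[lane]) ? half : 0;
        }
        len -= half;
    }
    for (std::size_t lane = 0; lane < 4; ++lane) {
        base[lane] += (data[base[lane]] < keys[lane]) ? 1 : 0;
    }
    return base;
}
```

Both functions return the same indices for the same keys; the interleaved one only changes how much independent work is exposed to the CPU at once, which is the effect the post attributes to some vectorized loops.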


🚀 New on Johnny’s Software Lab: Handling floating-point errors in C++ without killing performance!
NaNs, infinities, sticky bits, and traps — what works and what’s a trap 🪤.

Read more: https://johnnysswlab.com/floating-point-error-handling-in-c-what-actuall

#Cpp #Performance #FloatingPoint #NaN #Infinity

Floating-Point Error Handling in C++: What Actually Works - Johnny's Software Lab

Floating-point errors are unavoidable, but how you detect and handle them can make the difference between clean, high-performance C++ code and a debugging nightmare. In this article, we explore the practical techniques for handling NaNs, infinities, and other FP errors — from manual checks to sticky bits and hardware traps — and reveal which approaches actually work without sabotaging performance.
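As a rough illustration of two of the approaches named above (manual checks and sticky bits), here is a small sketch using the standard `<cfenv>` and `<cmath>` facilities. The function names are hypothetical and this is not the article's code. Strictly speaking, the standard expects `#pragma STDC FENV_ACCESS ON` for flag tests to be reliable, and compiler support for that pragma varies, so treat the sticky-flag part as an assumption about your toolchain.

```cpp
#include <cfenv>
#include <cmath>
#include <cstddef>

// Manual check: inspect every value for NaN or infinity.
bool has_non_finite(const double* v, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (!std::isfinite(v[i])) return true;
    }
    return false;
}

// Sticky-bit check: clear the exception flags, run the whole computation,
// then test the flags once at the end instead of checking each result.
// Returns true if an invalid operation, division by zero, or overflow
// was recorded while computing the square roots.
bool sqrt_all_check_flags(const double* in, double* out, std::size_t n) {
    std::feclearexcept(FE_ALL_EXCEPT);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::sqrt(in[i]);
    }
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW) != 0;
}
```

The sticky-bit version keeps the inner loop free of checks; it tells you that something went wrong somewhere in the batch, not where.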


📬 Mailing list is live!
New articles + workshop dates (AVX, NEON) straight to your inbox.

👉 Go to johnnysswlab.com
➡️ Enter your email in the box on the right

Still thinking #Java is slow? A deep dive into Java vs C++ performance shows where its strengths and weaknesses lie.

https://johnnysswlab.com/deep-dive-in-java-vs-c-performance/

#java #javaperformance #garbagecollector #jvm

Deep Dive in Java vs C++ Performance - Johnny's Software Lab

For most of my career I lived in the world of C and C++, and I honestly believed that these languages were the pinnacle of software performance. But two months ago I started working at Azul, the maker of a low-latency Java compiler, and I had an opportunity to dive deep into Java performance. And it…

9 Things Every Fresh Graduate Should Know About Software Performance - Johnny's Software Lab

At Johnny’s Software Lab we’ve spent a lot of time deep-diving into advanced performance topics: vectorization, cache hierarchies, memory bandwidth, you name it. But not everyone is ready to jump straight into assembly listings and microarchitectural details. This post is for the beginners. For the fresh graduates and junior developers who are just starting…


We investigate vector functions, more specifically, how to make your vector function available to the compiler's autovectorizer!

#vectorfunctions #simd #openmp #omp

https://johnnysswlab.com/the-messy-reality-of-simd-vector-functions/

The messy reality of SIMD (vector) functions - Johnny's Software Lab

We’ve discussed SIMD and vectorization extensively on this blog, and it was only a matter of time before SIMD (or vector) functions came up. In this post, we explore what SIMD functions are, when they are useful, and how to declare and use them effectively. A SIMD function is a function that processes more than…
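Since the post is tagged with OpenMP, one way to declare such a function is the `#pragma omp declare simd` directive. The sketch below is an assumption-laden illustration, not the article's code: the function names are made up, and the pragmas only take effect when the compiler is invoked with OpenMP SIMD support enabled (for example `-fopenmp-simd` on GCC and Clang). Without it the pragmas are ignored and the code runs as ordinary scalar code.

```cpp
#include <cstddef>

// Ask the compiler to also generate vector variants of this function.
// `uniform(scale)` says the second argument is the same for all lanes;
// `notinbranch` says the function is never called under a condition.
#pragma omp declare simd uniform(scale) notinbranch
double scaled_square(double x, double scale) {
    return x * x * scale;
}

// A loop the autovectorizer can turn into calls to the vector variant.
void apply_scaled_square(const double* in, double* out, std::size_t n,
                         double scale) {
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = scaled_square(in[i], scale);
    }
}
```

Whether the compiler actually emits and calls the vector variant depends on the compiler, target, and flags, so it is worth checking the generated assembly.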


Does it matter if we are compiling with optimizations off (O0) or optimizations on (O3) if the problem is memory bound? Let’s find out…

#optimizations #performance #instructionlevelparallelism #ilp #compiler #gcc #memorybound

https://johnnysswlab.com/an-optimizing-compiler-doesnt-help-much-with-long-instruction-dependencies/

An optimizing compiler doesn't help much with long instruction dependencies - Johnny's Software Lab

Does it matter if we are compiling with optimizations off (O0) or optimizations on (O3) if the problem is memory bound? Let's find out...
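For readers unfamiliar with the term, a long instruction dependency chain typically looks like the hypothetical pointer-chasing kernel below. This is an illustrative sketch, not the benchmark from the article: each load needs the result of the previous one, so there is little independent work for the compiler to reorder, whatever the optimization level.

```cpp
#include <cstddef>
#include <vector>

// Follows the links in `next` for `steps` hops, starting at `start`.
// The index for each load comes from the previous load, so the hops
// form one serial chain and cannot be overlapped with each other.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t start,
                  std::size_t steps) {
    std::size_t i = start;
    for (std::size_t s = 0; s < steps; ++s) {
        i = next[i];
    }
    return i;
}
```

When `next` is much larger than the caches and the links are scattered, the time per hop is dominated by memory latency rather than by the instructions around the load.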


Last chance to register for AVX vectorization workshop!

More info: https://johnnysswlab.com/avx-neon-vectorization-workshop/
Register: [email protected]

AVX/NEON Vectorization Workshop - Johnny's Software Lab

UPCOMING VECTORIZATION WORKSHOPS
AVX Vectorization Workshop: 4 half days, May 26th to May 29th
11 AM – 3 PM (US East Coast), 8 AM – 12 PM (US West Coast), 5 PM – 9 PM CET (Europe)
NEON Vectorization Workshop: TBD, send e-mail to [email protected] to express interest
For software developers and companies who wish to learn…


A post on how to grow data buffers without memcpy:

https://johnnysswlab.com/growing-buffers-to-avoid-copying-data/

Your program doesn’t run fast enough? You need someone to talk to about your software’s performance? You or your team want to learn to write faster software? Whatever it is, we can help you. Check out the consulting page for more info.

https://johnnysswlab.com/consulting/

Growing Buffers to Avoid Copying Data - Johnny's Software Lab

Copying data can be expensive in some cases, especially since it doesn’t change the data, it just moves it. Therefore we, engineers interested in performance, want to avoid copying data as much as possible. We already talked about avoiding data copying in C++ earlier. In that post, we talked about what mechanism C++ has…
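One possible way to grow a buffer without copying is to reserve a large range of virtual addresses up front and make pages usable only as the buffer grows. The excerpt does not say which technique the post uses, so the sketch below is an assumption: `ReservedBuffer` is a hypothetical class, and it relies on the Linux/POSIX calls `mmap`, `mprotect`, and `munmap`.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Hypothetical helper: reserves `max_bytes` of address space once, then
// grows in place by making more of that range readable and writable.
// Existing contents never move, so no memcpy is needed on growth.
class ReservedBuffer {
public:
    explicit ReservedBuffer(std::size_t max_bytes) : capacity_(max_bytes) {
        void* p = mmap(nullptr, max_bytes, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        base_ = (p == MAP_FAILED) ? nullptr : static_cast<char*>(p);
    }
    ~ReservedBuffer() {
        if (base_) munmap(base_, capacity_);
    }
    ReservedBuffer(const ReservedBuffer&) = delete;
    ReservedBuffer& operator=(const ReservedBuffer&) = delete;

    // Makes the first `new_size` bytes usable. Returns false if the
    // reservation failed, the request exceeds it, or mprotect fails.
    bool grow(std::size_t new_size) {
        if (!base_ || new_size > capacity_) return false;
        if (new_size <= size_) return true;  // never shrinks in this sketch
        if (mprotect(base_, new_size, PROT_READ | PROT_WRITE) != 0) return false;
        size_ = new_size;
        return true;
    }

    char* data() { return base_; }
    std::size_t size() const { return size_; }

private:
    char* base_ = nullptr;
    std::size_t capacity_ = 0;
    std::size_t size_ = 0;
};
```

The trade-off in this sketch is that the maximum size has to be chosen in advance; in exchange, pointers into the buffer stay valid across every call to `grow`.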


Another AVX workshop, this time 4 half-days.

Registration or inquiry: [email protected]

More info: https://johnnysswlab.com/avx-neon-vectorization-workshop/