Mastodawn

Aras Pranckevičius Sep 15, 2024

Another rabbit hole, so now it is a blog post. "C++ vector math library codegen in Debug build". Long read, some conclusions at the end. https://aras-p.info/blog/2024/09/14/Vector-math-library-codegen-in-Debug/

Vector math library codegen in Debug · Aras' website

Josh Simmons Sep 15, 2024

@aras heh, when spelunking disassembly recently i noticed we were globally enabling JMC on *release* builds, due to some ancient issue with library versioning. glad i found that before shipping :') msvc sure does have some fun curveballs.

Josh Simmons Sep 15, 2024

@aras btw it's not mentioned in the blog, but one thing you can often do is enable force inlining on the hot functions, which can help a bit since for most compilers that applies on the no optimizations build as well.

Aras Pranckevičius Sep 15, 2024

@dotstdy can you actually do that on MSVC? How?

Stefan Reinalter Sep 15, 2024

@aras @dotstdy AFAIK, you can't. Even __forceinline is a hint to the compiler, albeit a very strong one.
It will, however, honour that hint in almost all cases.

Josh Simmons Sep 15, 2024

@molecularmusing @aras yeah it'll refuse in some cases still, but for the most part (and especially for small functions like in a vector library) it should end up inlining them regardless of the compilation mode if you use forceinline.

Aras Pranckevičius Sep 15, 2024

@dotstdy @molecularmusing that’s just not my experience. /Od by default inlines *nothing*, i.e. it implies /Ob0. The only next choice is /Ob1 which is “in-line almost everything”. There’s no setting that tells “only force inlined functions plz” https://learn.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-170

/Ob (Inline Function Expansion)

Anders Stenberg

@aras @dotstdy @molecularmusing Is "anything marked inline" really "almost everything"? :D

Aras Pranckevičius Sep 15, 2024

@SonnyBonds @dotstdy @molecularmusing that’s inline, force inline plus all members of all classes that are defined inside the class. A lot!
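To make the categories in this post concrete, here is a minimal sketch of which functions become inlining candidates under /Ob1 as described in the thread. The `Vec2` type, the `FORCE_INLINE` macro and the function names are hypothetical, chosen only for illustration; whether the compiler actually inlines each candidate is still up to the compiler.

```cpp
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#define FORCE_INLINE inline __attribute__((always_inline))
#endif

// Hypothetical type, for illustration only.
struct Vec2 {
    float x, y;

    // Defined inside the class: implicitly inline, so a candidate under /Ob1.
    float dot(const Vec2& o) const { return x * o.x + y * o.y; }

    float lengthSq() const;
};

// Explicitly marked inline: a candidate under /Ob1.
inline float Vec2::lengthSq() const { return dot(*this); }

// Force-inlined: also a candidate under /Ob1 (and still only a strong hint).
FORCE_INLINE Vec2 add(Vec2 a, Vec2 b) { return Vec2{a.x + b.x, a.y + b.y}; }

// No inline marking of any kind: not a candidate under /Ob1.
float sum(Vec2 v) { return v.x + v.y; }
```

In a typical C++ vector math header nearly every function falls into one of the first three groups, which is why /Ob1 ends up covering so much.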

Anders Stenberg Sep 15, 2024

@aras @dotstdy @molecularmusing Member functions defined in a class definition are implicitly inline (in the sense of the inline keyword) by the C++ standard so I think it makes sense. Granted, there could be a __forceinline vs inline distinction but I'd say that's a failing of the language conflating two concepts a bit.

If I may ask, what's the reason for _not_ wanting inlining (in the optimization sense) for functions like that? I'm thinking it often correlates quite well.

Anders Stenberg Sep 15, 2024

@aras @dotstdy @molecularmusing (To clarify, I agree an "inline only functions marked __forceinline" setting could make sense as an option in between /Ob0 "inline nothing" and /Ob1 "inline anything marked some kind of inline", but I kind of understand why the current /Ob1 is the way it is.)

Josh Simmons Sep 15, 2024

@SonnyBonds @aras @molecularmusing thanks msvc :'( https://gcc.godbolt.org/z/sWzETKoe7

Compiler Explorer - C++

#include <stddef.h>  // size_t

#ifdef _MSC_VER
#define INLINE __forceinline
#define NOINLINE __declspec(noinline)
#else
#define INLINE __attribute__((always_inline))
#define NOINLINE __attribute__((noinline))
#endif

// Tiny helper that should be inlined even without optimizations.
static inline INLINE float add(float x, float y) {
    return x + y;
}

// Kept out of line so the call to add() is easy to spot in the disassembly.
NOINLINE float accumulate(float *values, size_t num_values) {
    float acc = 0.0f;
    for (size_t i = 0; i < num_values; i++) {
        acc = add(acc, values[i]);
    }
    return acc;
}

Josh Simmons Sep 15, 2024

@SonnyBonds @aras @molecularmusing I do wish the compiler would avoid the unnecessary spills around the function call, for values which are already spilled. My gut feeling is that it is a contributor to a lot of the call heavy abstraction performance issues in debug mode too. but at least it's better than an actual function call for every tiny layer in the "zero cost abstraction" pile.

Anders Stenberg Sep 15, 2024

@dotstdy @aras @molecularmusing Yeah you need to add /Ob1 to get any inlining at all.
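As a rough sketch of where the thread lands, the snippet below pairs a force-inline macro with the build flags discussed above. The file name `vecmath.cpp`, the `VM_INLINE` macro, the `float3` type and `sum_of_dots` are all assumptions made for illustration, not anything taken from the blog post or a real library.

```cpp
// Build sketch (hypothetical file name):
//   MSVC:      cl /c /Od vecmath.cpp        /Od implies /Ob0, so nothing is inlined
//   MSVC:      cl /c /Od /Ob1 vecmath.cpp   marked functions become inlining candidates
//   GCC/Clang: g++ -c -O0 vecmath.cpp       always_inline applies without optimization
#include <cstddef>

#if defined(_MSC_VER)
#define VM_INLINE __forceinline
#else
#define VM_INLINE inline __attribute__((always_inline))
#endif

// Hypothetical minimal vector type.
struct float3 { float x, y, z; };

VM_INLINE float3 operator+(float3 a, float3 b) {
    return float3{a.x + b.x, a.y + b.y, a.z + b.z};
}

VM_INLINE float3 operator*(float3 a, float s) {
    return float3{a.x * s, a.y * s, a.z * s};
}

VM_INLINE float dot(float3 a, float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hot loop: each dot() call here is what the flags above decide to inline or not.
float sum_of_dots(const float3* v, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; i++) {
        acc += dot(v[i], v[i]);
    }
    return acc;
}
```

Comparing the disassembly of `sum_of_dots` between the first two MSVC command lines is one way to check whether the calls were actually inlined.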