Another rabbit hole, so now it is a blog post. "C++ vector math library codegen in Debug build". Long read, some conclusions at the end. https://aras-p.info/blog/2024/09/14/Vector-math-library-codegen-in-Debug/
@aras heh, when spelunking disassembly recently i noticed we were globally enabling JMC on *release* builds, due to some ancient issue with library versioning. glad i found that before shipping :') msvc sure does have some fun curveballs.
@aras btw it's not mentioned in the blog, but one thing you can often do is enable force inlining on the hot functions, which can help a bit, since for most compilers that applies to unoptimized builds as well.
@dotstdy can you actually do that on MSVC? How?
@aras @dotstdy AFAIK, you can't. Even __forceinline is a hint to the compiler, albeit a very strong one.
It will, however, honour that hint in almost all cases.
@molecularmusing @aras yeah it'll refuse in some cases still, but for the most part (and especially for small functions like in a vector library) it should end up inlining them regardless of the compilation mode if you use forceinline.
@dotstdy @molecularmusing that’s just not my experience. /Od by default inlines *nothing*, i.e. it implies /Ob0. The only next choice is /Ob1, which is “in-line almost everything”. There’s no setting that says “only force-inlined functions plz” https://learn.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-170
@aras @dotstdy @molecularmusing Is "anything marked inline" really "almost everything"? :D
@SonnyBonds @dotstdy @molecularmusing that’s inline, force inline plus all members of all classes that are defined inside the class. A lot!

@aras @dotstdy @molecularmusing Member functions defined in a class definition are implicitly inline (in the sense of the inline keyword) by the C++ standard so I think it makes sense. Granted, there could be a __forceinline vs inline distinction but I'd say that's a failing of the language conflating two concepts a bit.

If I may ask, what's the reason for _not_ wanting inlining (in the optimization sense) for functions like that? I'm thinking it often correlates quite well.

@aras @dotstdy @molecularmusing (To clarify, I agree an "inline only functions marked __forceinline" option could make sense in between /Ob0 "inline nothing" and /Ob1 "inline anything marked some kind of inline", but I kind of understand why the current /Ob1 is the way it is.)
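To make the /Ob1 categories discussed above concrete, here is a minimal sketch of which functions fall into "marked some kind of inline". The `Vec2` type, the `FORCE_INLINE` macro and all function names are hypothetical, made up for illustration; the comments reflect the /Ob documentation linked earlier in the thread, not measured codegen.

```cpp
#include <cstddef>

// Hypothetical portability macro (an assumption for this sketch).
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#define FORCE_INLINE inline __attribute__((always_inline))
#endif

struct Vec2 {
    float x, y;

    // Defined inside the class: implicitly `inline` per the C++ standard,
    // so it is documented as eligible for expansion under /Ob1.
    float dot(const Vec2& o) const { return x * o.x + y * o.y; }

    // Only declared here; the out-of-class definition below is not `inline`,
    // so it is outside the set /Ob1 is documented to consider.
    float lengthSq() const;
};

float Vec2::lengthSq() const { return dot(*this); }

// Explicit `inline` keyword: eligible under /Ob1.
inline Vec2 add(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// Force-inline attribute: also eligible under /Ob1, but under plain /Od
// (which implies /Ob0) MSVC still emits a call.
FORCE_INLINE Vec2 scale(const Vec2& v, float s) {
    return Vec2{v.x * s, v.y * s};
}

// Ordinary function that calls into all of the above.
float sumScaledLengths(const Vec2* values, std::size_t count, float s) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < count; i++) {
        acc += scale(add(values[i], values[i]), s).lengthSq();
    }
    return acc;
}
```

Compiling this with `/Od` versus `/Od /Ob1` and comparing the disassembly of `sumScaledLengths` should show the difference in which calls remain.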
Compiler Explorer - C++

#include <stddef.h> // size_t

#ifdef _MSC_VER
#define INLINE __forceinline
#define NOINLINE __declspec(noinline)
#else
#define INLINE __attribute__((always_inline))
#define NOINLINE __attribute__((noinline))
#endif

// Small helper that should be inlined even in unoptimized builds.
static inline INLINE float add(float x, float y)
{
    return x + y;
}

// Kept out of line so the call to add() is easy to find in the disassembly.
NOINLINE float accumulate(float *values, size_t num_values)
{
    float acc = 0.0f;
    for (size_t i = 0; i < num_values; i++) {
        acc = add(acc, values[i]);
    }
    return acc;
}

@SonnyBonds @aras @molecularmusing I do wish the compiler would avoid the unnecessary spills around the function call, for values which are already spilled. My gut feeling is that this contributes to a lot of the call-heavy abstraction performance issues in debug mode too. But at least it's better than an actual function call for every tiny layer in the "zero cost abstraction" pile.
@dotstdy @aras @molecularmusing Yeah you need to add /Ob1 to get any inlining at all.