There are few arguments in this world more heated than two C++ developers arguing about the correct way to do things.
This toot brought to you by the two day long argument I'm reading about `std::to_underlying` for C++ enum classes.
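(For anyone outside the argument, here's roughly what's at stake; the `Color` enum is just my own made-up example:)

```cpp
#include <type_traits>

enum class Color : unsigned char { Red = 1, Green = 2, Blue = 4 };

// What C++23's std::to_underlying boils down to. Before C++23 you
// spell the static_cast yourself, or wrap it in a helper like this:
template <class E>
constexpr std::underlying_type_t<E> to_underlying(E e) noexcept {
    return static_cast<std::underlying_type_t<E>>(e);
}

static_assert(to_underlying(Color::Blue) == 4);              // the named version
static_assert(static_cast<unsigned char>(Color::Blue) == 4); // the raw cast
```

The whole fight is over whether naming that one-line cast is worth it.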

Also C++ devs who took Abrash's "measure to know the performance of something" waaaay too seriously.

Like, I can know that certain practices hinder performance without measuring.

This is why SIMD APIs exist. Compilers are pretty magical, but they're not *that* magical.

This is something I find really, really interesting about a lot of C++ devs. The language is supposed to be all about performance, but a lot of the arguments are about best practice within the language, proper architecture, and avoiding footguns. And arguments specifically about high-performance architecture are so often scoffed at in favor of "proper OOP", or of assuming that you can't know something is high performance without measuring.

It's baffling.

As an aside, it reminds me of a discussion on some Swift forum about the performance of Array vs. ContiguousArray. Their testing showed no performance difference between the two (with Array actually being faster), and no one had any answers as to why you would use one over the other.

The answer is the cache: ContiguousArray can be more cache coherent in tight loops. But so few people know that, or can come up with real tests that show the difference, that everyone just uses Array.

And by the time they get to a point where ContiguousArray would be needed, switching would be a gigantic refactor, if they even THINK to make that change.
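(A rough C++ analogue of the same trade-off, since that's the language I keep arguing about; the `Particle` type is just illustrative:)

```cpp
#include <memory>
#include <vector>

struct Particle { float x, y, z; };

// Values stored inline: one contiguous block, friendly to the cache
// and to the hardware prefetcher.
float sum_contiguous(const std::vector<Particle>& ps) {
    float s = 0.0f;
    for (const auto& p : ps) s += p.x;
    return s;
}

// Reference types mean a pointer chase per element: each access can
// land on a different cache line anywhere in the heap.
float sum_boxed(const std::vector<std::unique_ptr<Particle>>& ps) {
    float s = 0.0f;
    for (const auto& p : ps) s += p->x;
    return s;
}
```

Both loops compute the same thing; in a hot loop over millions of elements, only the first one keeps the cache happy.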
@fuzzybinary premature optimisation, no?

@belvo that is a very long and involved discussion.

The short(-ish) answer for me is that that statement really refers to algorithms, not design. Internal algorithms are easy to change. Design is not. And some designs are just straight-up worse for performance on a modern cache-based CPU.

The flip side of that statement is "Don't prematurely pessimize." And we unwittingly do it all the time, trapping ourselves into well-intentioned, complex OOP architectures that can't actually push performance.

@belvo people who believe in OOP a lot will try to talk about programmer time and the cost of a programmer, which is true. But they ignore the fact that the principles of good design are not limited to OOP. You can design a usable, decoupled system without it, and not be stuck in a mindset that requires things like data hiding, interfaces for everything, and dynamic allocations for everything.
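A tiny sketch of what I mean: plain data plus free functions, no interfaces or hidden state (the names here are made up):

```cpp
#include <vector>

// Plain data, out in the open. No getters, no base class.
struct Transform { float x = 0.0f; float y = 0.0f; };

// Behavior as a free function over a batch of data: still decoupled
// and testable, but with no per-object virtual dispatch and no
// per-object heap allocation.
void translate_all(std::vector<Transform>& ts, float dx, float dy) {
    for (auto& t : ts) { t.x += dx; t.y += dy; }
}
```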

@belvo for example using the statement before: imagine you write Swift and you process a large amount of data enclosed in classes. You put those classes in an Array and process them in a thread. This runs slow but you have optimized every instruction.

Now, changing to ContiguousArray might get you better cache performance, but it requires your class to be a struct, a huge breaking change that could have massive impacts everywhere that class is used, depending on how it's used.

@fuzzybinary I agree with all that! Probably it’s worth measuring performance at various intervals to properly assess which implementation is better. Tiring and probably a bit slow, but science/experiment-driven.
Time constraints tend to put a wrench in all that

@belvo So, this was the general response from a co-worker. But we *know* cache misses are orders of magnitude slower than any other instruction issued by a CPU -- this is documented overhead. But a cache miss vs. a hit has so many prerequisites that it's hard to even come up with artificial tests that will cause cache misses.

So data design becomes one of the few ways to ensure you're using your processor efficiently, and avoiding cache misses.
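Here's the shape of the artificial test I mean, assuming a buffer much bigger than cache; the arithmetic is identical both ways, only the access pattern changes:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Sum every element two ways: stride 1 streams through the cache;
// a large stride can miss on nearly every access.
long sum_with_stride(const std::vector<int>& v, std::size_t stride) {
    long sum = 0;
    for (std::size_t start = 0; start < stride; ++start)
        for (std::size_t i = start; i < v.size(); i += stride)
            sum += v[i];
    return sum;
}

// Same answer either way; on a 64 MiB buffer the strided walk is
// typically several times slower purely from cache misses.
long timed_sum(const std::vector<int>& v, std::size_t stride) {
    auto t0 = std::chrono::steady_clock::now();
    long s = sum_with_stride(v, stride);
    auto t1 = std::chrono::steady_clock::now();
    std::printf("stride %zu: %lld us\n", stride,
        (long long)std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count());
    return s;
}
```

And even this is fragile: make the buffer small enough to fit in L2 and the difference mostly evaporates, which is exactly why it's hard to "just measure" cache behavior in isolation.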

@belvo Mike is much better at talking about this btw: https://www.youtube.com/watch?v=rX0ItVEVjHc
CppCon 2014: Mike Acton "Data-Oriented Design and C++"

@belvo This slide in particular sticks in my head: https://youtu.be/rX0ItVEVjHc?t=1831

@fuzzybinary I think the trick is that it’s different things to different people. I’ve worked with embedded engineers who were afraid of C++ because it was too high level, game devs who’ve loved it but had strict rules against OOP/inheritance/the virtual keyword, and app devs who considered it some sort of ancient artifact beyond comprehension “don’t touch the .so, it’s what makes our app fast!” but otherwise needed the oop abstractions. And all of them will duke it out on forums 😆
@pux0r3 Yeah. Actually, something I read in Effective C++ which I'd NEVER thought about (and that book is how old!?) is that C++ is really a federation of several languages, and its practices are different based on which of those languages you're using.
@pux0r3 The particular interaction I'm a bit miffed about was talking about Data Oriented Design in C++ and its perf benefits because of cache coherency. The other dev (who is very smart and works in very high-perf environments, don't get me wrong) was just kinda like "You can't know that, you have to measure."

@fuzzybinary Definitely makes sense in the context of the ContiguousArray post sibling to this one too.

It also leads me into my mixed feelings around the statement that “premature optimization is the root of all evil.” On the one hand, min/maxing performance before an app is “done” is almost counter-productive; on the other hand, lived experience usually says “I’ll probably have to do this eventually, it will look a bit ugly and not measure much now, but it’s saved my butt before!”

@fuzzybinary But also, I got someone very annoyed when I instinctively did `>> 1` or `* 0.5` instead of `/2` because they were once faster on an old compiler for an old architecture that doesn’t apply now. And I totally get it 😆
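For the record, the shift and the division really aren't interchangeable for signed values, which is half of why that instinct annoys people:

```cpp
// For unsigned or known-non-negative values a modern compiler emits
// the cheap shift for /2 on its own anyway. For negative signed
// values they are genuinely different operations: /2 truncates
// toward zero, >>1 floors toward negative infinity. (Right-shifting
// a negative signed value is only guaranteed to be an arithmetic
// shift since C++20; before that it was implementation-defined.)
constexpr int div2(int x)   { return x / 2; }
constexpr int shift2(int x) { return x >> 1; }

static_assert(div2(7) == 3 && shift2(7) == 3);  // agree for positives
static_assert(div2(-7) == -3);                  // truncates toward 0
static_assert(shift2(-7) == -4);                // floors
```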

@pux0r3 I love one of the first best practices in C++ Best Practices: "Don't prematurely pessimize".

And you're right that some things are now dumb because the compiler can optimize them better than you can, but specifically in C++ it *can't* rearrange your data. It can't skip constructors or destructors in certain circumstances. And it usually can't optimize away the v-table jump.
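The data-rearranging point is easy to demonstrate: same three fields, different declaration order, different size (the exact numbers in the comment assume a typical 64-bit ABI):

```cpp
#include <cstdint>

// The compiler must lay members out in declaration order, so the
// first version pays for padding around the 8-byte-aligned member.
struct Padded { std::uint8_t a; std::uint64_t b; std::uint8_t c; };
struct Packed { std::uint64_t b; std::uint8_t a; std::uint8_t c; };

// On a typical 64-bit ABI: sizeof(Padded) == 24, sizeof(Packed) == 16.
static_assert(sizeof(Packed) < sizeof(Padded));
```

That's a third of your cache line gone to padding, and no optimization flag will fix the declaration order for you.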

@fuzzybinary @pux0r3 I can definitely see that, the C++ I write (and I've seen a lot of other embedded and game devs write) I've seen kinda "jokingly" referred to as "C+", no exceptions, no rtti, single inheritance only as makes sense, generally little/no stdlib use. I think that's what makes it so versatile though, c++ has TONS of features, but in practice you just use whatever small subset and design patterns make sense for your own use case.

@raptor85 @fuzzybinary Ha, `-fno-exceptions` and `-fno-rtti` are even better examples of what I was thinking of too. I’ve worked in code bases that worked better on both sides of those flags 😆

I tend to play with different languages for fun, and I always end up back at C++ because it’s not _that_ broken and I can usually slurp in whatever I thought was cool from my latest round of new language spelunking. Just don’t tell the Rust devs that I’m not (yet) totally sold on the borrow checker.

@pux0r3 @raptor85 randomly relevant article Google surfaced to me not 10 seconds ago https://16bpp.net/blog/post/noexcept-can-sometimes-help-or-hurt-performance/
16BPP.net: Blog / `noexcept` Can (Sometimes) Help (or Hurt) Performance

@fuzzybinary @raptor85 the initial argument against exceptions at my first job was a vague "it's slower usually" mixed with "look at all this extra assembly this generates, do you want to read this?"

Now, even if it were slower (which I love this article for 😆), I've been radicalized into thinking it can be clearer to read and manage than piles of "if error" checks or the infamous `goto error` - and therefore faster in the "I can better reason about my code structure" way.
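A sketch of the comparison I mean, with hypothetical parsing steps standing in for real work:

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a fallible parsing step.
std::optional<int> parse_step(const std::string& s) {
    if (s.empty()) return std::nullopt;
    return static_cast<int>(s.size());
}

// Error-code style: the happy path is buried under "if error" checks,
// and every new step adds another branch to forget.
std::optional<int> total_checked(const std::string& a, const std::string& b) {
    auto x = parse_step(a);
    if (!x) return std::nullopt;
    auto y = parse_step(b);
    if (!y) return std::nullopt;
    return *x + *y;
}

// Exception style: the happy path reads straight through, and the
// error handling lives wherever it can actually be handled.
int parse_or_throw(const std::string& s) {
    if (s.empty()) throw std::runtime_error("empty input");
    return static_cast<int>(s.size());
}
int total_throwing(const std::string& a, const std::string& b) {
    return parse_or_throw(a) + parse_or_throw(b);
}
```

Same behavior, but in the second version the control flow I have to reason about is just the happy path.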