https://www.youtube.com/watch?v=UBhT7nbWpMg


So if you're writing Python, you do need to understand what will become a dictionary, and roughly how duck-typing works.
If you're writing a JITted language, you should roughly understand how the JITter works.
If you're writing C# or another managed language, be aware of how the GC works, and you should be able to write the equivalent C++ code without too much trouble.
Writing C++ code? You should be able to mentally translate it into C. Where are the constructors/destructors being called, and what do they turn into? Is that function call virtual? How does that work in practice, and what are the perf implications?
If you're writing C, you really should be able to read assembly, and know what instructions are available, how flow control turns into branches, what a cache line is, what happens when you run out of registers.
And if you're writing assembly, you should absolutely know the principles of branch prediction, micro-ops, how atomic instructions work, and M(O)ESI(F).
But there's no need for a Python coder to know the whole stack. Even if they do (as I happen to), trying to think about every layer at once will make their head explode!
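
To make the "mentally translate C++ into C" step concrete, here is a rough sketch of a virtual call next to a hand-written C-style equivalent. The names (`Shape`, `Rect`, `CShape`, `crect_make` and so on) are made up for illustration, and the layout shown is only the usual single-inheritance scheme, not what any particular compiler is guaranteed to emit.

```cpp
// The C++ version: one virtual function.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

struct Rect final : Shape {
    int w, h;
    Rect(int w_, int h_) : w(w_), h(h_) {}
    int area() const override { return w * h; }
};

// Roughly the same thing written C-style (hypothetical names):
// the hidden vtable pointer becomes an explicit struct member.
struct CShape;
struct CShapeVTable {
    int (*area)(const CShape*);
};
struct CShape {
    const CShapeVTable* vtable;
};
struct CRect {
    CShape base;  // first member, so a CRect* and its CShape* share an address
    int w, h;
};

int crect_area(const CShape* self) {
    // Valid because CRect is standard-layout and base is its first member.
    const CRect* r = reinterpret_cast<const CRect*>(self);
    return r->w * r->h;
}

const CShapeVTable crect_vtable = { crect_area };

// Stands in for the constructor: it has to set the vtable pointer itself.
CRect crect_make(int w, int h) {
    return CRect{ { &crect_vtable }, w, h };
}

// The virtual call...
int call_area_virtual(const Shape& s) { return s.area(); }

// ...and its manual form: load the vtable pointer, load the function
// pointer, then make an indirect call.
int call_area_manual(const CShape* s) { return s->vtable->area(s); }
```

The perf question is visible in `call_area_manual`: two dependent loads and an indirect call that the compiler usually cannot inline unless it can prove the dynamic type.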
@Doomed_Daniel It's a rule of thumb. In general, I find that most C++ perf problems are because you're doing too many alloc/frees, and too many virtual calls. Those are solved by thinking about C.
Yes, if you then discover you're thrashing caches or causing too many branch mispredictions, then that's actually a good problem to have - it means your code is faster than almost all other C++ code. Well done.
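
As a rough illustration of the "too many alloc/frees" point, the sketch below computes the same result twice: once with every object boxed on the heap, once with a single contiguous buffer. `Particle`, `advance_boxed` and `advance_flat` are invented names, and the allocation counts in the comments assume a typical standard library where `reserve` performs one allocation.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Particle {
    float x;
    float vx;
};

// Boxed version: one heap allocation per particle, plus one for the
// array of pointers. Each one is later matched by a free.
float advance_boxed(std::size_t n, float dt) {
    std::vector<std::unique_ptr<Particle>> ps;
    ps.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        ps.push_back(std::make_unique<Particle>(
            Particle{static_cast<float>(i), 1.0f}));
    float sum = 0.0f;
    for (auto& p : ps) {
        p->x += p->vx * dt;  // pointer chase on every element
        sum += p->x;
    }
    return sum;
}

// Flat version: the particles live in one contiguous buffer, which is
// what you would naturally write in C with a single malloc.
float advance_flat(std::size_t n, float dt) {
    std::vector<Particle> ps;
    ps.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        ps.push_back(Particle{static_cast<float>(i), 1.0f});
    float sum = 0.0f;
    for (auto& p : ps) {
        p.x += p.vx * dt;
        sum += p.x;
    }
    return sum;
}
```

Both functions return the same value; the difference is only in how many times the allocator is hit and how the data is laid out in memory.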
@TomF re translating C++ to C: this would have been pretty much trivial to do back in the days of Cfront.
I think it may still even be possible with compilers that use the EDG front end...? (Or at least, if you get the front end directly from EDG, it has a C-generating back-end just like Cfront did.)
But if you're using, say, Clang, you probably want to think in terms of LLVM IR, or possibly the new ClangIR (and its translation to LLVM IR).
@JamesWidman @TomF
other bad news: other compilers are still relevant :-p
MSVC is still dominant on Windows, and so is GCC on Linux