It is spooky how much @rygorous and I agree on so many many things. So in lieu of me making a video of my ramblings, watch his instead.
https://www.youtube.com/watch?v=UBhT7nbWpMg

Breadth vs Depth in Programming

Tom Forsyth Apr 24

Fabian talked some about how languages are fundamentally tools, and they're all flawed, and no one language will solve all problems. And I absolutely agree. Some people spend a lot of time creating new languages, and I'm just not sure it's productive.

Tom Forsyth Apr 24

I can't find it in the video, but there was a question about how far down the stack you need to know. I mean Fabian and I like to know all the way down to the metal. But that's because we're strange. Most people do not - it's absolutely not necessary to get stuff done.

Tom Forsyth Apr 24

But these two things are related - language choice, and how far down you need to know. Because I do think that whatever language you use, you should know the "next level down". When you're writing code, you should be able to fairly simply translate in your head to the language below it.

Tom Forsyth

So if you're writing Python, you do need to understand what will become a dictionary, and roughly how duck-typing works.

If you're writing a JITted language, you should roughly understand how the JITter works.

If you're writing C# or other managed language, be aware of how the GC works, and you should be able to write the equivalent C++ code without too much trouble.
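
For the Python case, a minimal sketch of what "one level down" can look like, assuming CPython and ordinary classes without `__slots__`. The class and function names (`Duck`, `Robot`, `greet`) are hypothetical and only for illustration. It shows that instance attributes end up as dictionary entries, and that duck typing is a by-name lookup at run time rather than a type check.

```python
class Duck:
    def __init__(self, name):
        # This assignment becomes an entry in the instance's dictionary.
        self.name = name

    def speak(self):
        return f"{self.name} says quack"


class Robot:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} says beep"


def greet(thing):
    # No type check happens here: "speak" is looked up by name at run time,
    # first on the instance, then on its class.
    return thing.speak()


duck = Duck("Donald")

# Snapshot of the per-instance attribute dictionary.
instance_attrs = dict(duck.__dict__)

# Methods live in the class namespace, which is also a mapping.
method_is_in_class_dict = "speak" in Duck.__dict__

# Writing through the dictionary has the same effect as setting the attribute.
duck.__dict__["name"] = "Daffy"

# Two unrelated classes work with greet() because both have a speak() method.
results = [greet(duck), greet(Robot("R2"))]
```

Every `thing.speak()` call implies at least one dictionary-style lookup, which is the kind of cost the "next level down" view makes visible.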

Tom Forsyth Apr 24

Writing C++ code? You should be able to mentally translate this into C. Where are the constructors/destructors happening, and what do they turn into? Is that function call virtual? How does that work in practice, and what are the perf implications?

If you're writing C, you really should be able to read assembly, and know what instructions are available, how flow control turns into branches, what a cache line is, what happens when you run out of registers.

Tom Forsyth Apr 24

And if you're writing assembly, you should absolutely know the principles of branch prediction, micro-ops, how atomic instructions work, and M(O)ESI(F).

But there's no need for a Python coder to know the whole stack. Even if they do (like I happen to) - if they try to think about them all at once, their head will explode!

Tom Forsyth Apr 24

So a top-notch coder should always be *thinking* one level below what they're writing. If you can't easily tell what's happening in that lower level, then your code is probably too complex and/or meta, and you're creating "write only code" - a debugging or performance headache for your future self.

David J. Atkinson Apr 24

@TomF Absolutely. 💯 If nothing else, you can clobber yourself if you don’t know how data structures are implemented one level down. For example, I once had to use strings for implementing dynamic strided arrays. Lots of appending, prepending, insertion, deletion, etc. There was a quadratic (x^2) penalty in memory and computation if you appended versus prepended! This was only evident at run-time. Luckily, someone else mentioned this online so I was able to make the optimization without suffering. You will be much happier if you know how your source language is implemented (compiled or interpreted).
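
The post does not say which language or string implementation was involved, so the following is only a toy model of the general effect, not a reconstruction of that case. It assumes a contiguous buffer that doubles its capacity when full, and counts element copies: growing at one end is cheap on average, while inserting at the other end shifts every existing element each time. The function names are hypothetical.

```python
from collections import deque


def moves_for_appends(n):
    """Element copies when n items are appended to a doubling buffer."""
    capacity, size, moves = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            moves += size      # grow: copy existing elements to a bigger block
            capacity *= 2
        size += 1
    return moves


def moves_for_prepends(n):
    """Element copies when n items are inserted at index 0 of a contiguous buffer."""
    size, moves = 0, 0
    for _ in range(n):
        moves += size          # every existing element shifts one slot over
        size += 1
    return moves


cheap = moves_for_appends(1000)       # grows roughly linearly with n
expensive = moves_for_prepends(1000)  # grows roughly with n squared

# When both ends are needed, a double-ended queue avoids the shifting.
window = deque([2, 3])
window.appendleft(1)
window.append(4)
```

In this model, 1000 appends cost 1023 copies while 1000 prepends cost 499500, which is the sort of gap that only shows up at run time unless you know the layout underneath.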

mattpd Apr 25

@TomF On that note, also like John Ousterhout's "Always measure one level deeper", https://al.radbox.org/doi/10.1145/3213770.
> "If you want to understand the performance of a system at a particular level, you must measure not just that level but also the next level deeper. That is, measure the underlying factors that contribute to the performance at the higher level."
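
A minimal sketch of that idea in Python, under stated assumptions: the helper `timed_phases` and the three-phase workload are hypothetical, invented here for illustration. Instead of reporting only the total time, it also records the time spent in each underlying phase, plus whatever the phases do not account for.

```python
import time


def timed_phases(phases):
    """Run (name, callable) pairs; return total and per-phase times in ns."""
    per_phase = {}
    start = time.perf_counter_ns()
    for name, work in phases:
        t0 = time.perf_counter_ns()
        work()
        per_phase[name] = time.perf_counter_ns() - t0
    total = time.perf_counter_ns() - start
    return total, per_phase


# Hypothetical workload: build a reversed list, sort it, format it as text.
data = []
phases = [
    ("generate", lambda: data.extend(range(200_000, 0, -1))),
    ("sort", lambda: data.sort()),
    ("format", lambda: ",".join(map(str, data))),
]

total_ns, per_phase_ns = timed_phases(phases)

# Time inside the total that no phase explains (loop and timer overhead).
unaccounted_ns = total_ns - sum(per_phase_ns.values())
```

The top-level number says how slow the whole thing is; the per-phase numbers say where to look next.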

Robert Klaschka Apr 25

@TomF this reminds me of a point made in my early architectural design (buildings) career that you need to understand a design at the scale below the scale you were drawing at.

Michael Eggers 🇺🇦🇪🇺 Apr 25

@TomF Yes! Yes! Yes!!!

Tom Forsyth Apr 24

Side note - this also applies to shader languages. You need to know how code is translated to predicated SIMD, and roughly how flow control and atomic operations work on the hardware. It is hugely helpful to e.g. look at AMD's GCN/RDNA assembly and see how your code turned into real instructions.

Daniel Gibson Apr 24

@TomF
Doesn't that propagate down?
Like, if I have a performance problem in C++, how does it help me to know what the equivalent C-code is if I need to know assembly to figure out why that code would be slow?

Tom Forsyth Apr 24

@Doomed_Daniel It's a rule of thumb. In general, I find that most C++ perf problems are because you're doing too many alloc/frees, and too many virtual calls. Those are solved by thinking about C.

Yes, if you then discover you're thrashing caches or causing too many branch mispredictions, then that's actually a good problem to have - it means your code is faster than almost all other C++ code. Well done.

James Widman Apr 24

@TomF re translating C++ to C: this would have been pretty much trivial to do back in the days of Cfront.

I think it may still even be possible with compilers that use the EDG front end...? (Or at least, if you get the front end directly from EDG, it has a C-generating back-end just like Cfront did.)

But if you're using, say, Clang, you probably want to think in terms of LLVM IR, or possibly the new ClangIR (and its translation to LLVM IR).

Daniel Gibson Apr 24

@JamesWidman @TomF
IMO having a mental model of what vtables are (a pointer to a list of function pointers), where implicit allocation happens (and that heap allocations have real cost), that inlining functions is a thing and when it can (not) or is (un)likely to happen etc already gets you pretty far (see also https://mastodon.gamedev.place/@TomF/116461364931250271 )
even if you don't know how exactly the specific compiler might optimize further with its IR

James Widman Apr 24

@Doomed_Daniel @TomF well... yes, but my point was: if the compiler doesn't actually translate to C, then you're not going to be able to check your expectations. So ideally, you want to think in terms of the stuff that it actually generates (so that you can use the compiler's output to correct yourself when you inevitably hit cases where your mental model turns out to be incorrect).

James Widman Apr 24

@Doomed_Daniel @TomF The bad news is that it kinda leads you to spend more time understanding LLVM IR and the optimization pipeline than you probably expected; the good news is that this will be transferable to other source languages.

Daniel Gibson Apr 24

@JamesWidman @TomF
other bad news: other compilers are still relevant :-p

MSVC still is dominant on Windows and so is GCC on Linux

Tom Forsyth Apr 24

@JamesWidman @Doomed_Daniel I strongly disagree. I think this is not a productive skill to learn, if you're not already into writing compilers.

Tom Forsyth Apr 24

@JamesWidman I'm not referring to "how does a compiler work". I'm describing how a programmer should think about the code they're writing. Very few people know how LLVM IR works, and having looked at plenty myself I still find it extremely hard to parse. Asking people to think about phi nodes in their daily coding does not seem useful.

synlogic4242 Apr 24

@TomF I started in 6502 assembly and only later learned C. makes one appreciate it more too.

Tom Forsyth Apr 24

@synlogic4242 I only recently learned the details of how the 68000 microcode works, and it's fascinating because inside a 68000 is effectively a 16-bit 6502, and the 68k register file is its page 0. So the "levels" concept works there as well!

synlogic4242 Apr 24

@TomF nice!