The best production compiler to study is LLVM
Look, I love LLVM! In many ways, it’s pretty educational. But it’s a monstrosity! That’s because LLVM is applied to so many use cases: it’s used everywhere from embedded systems to Google’s data centers, from JIT compilers to aggressively optimized high-performance code, and from Zig to Rust to <name the next new language>.
This seems like a good time to point out that even Zig has run into problems with LLVM. Its limitations were significant contributors to the removal of async from the language.
Nice article. For the optimization-related ones, there’s a good rule of thumb: it’s not an optimization if you don’t measure an improvement.
This, so much. I often see this at work: theories about this and that being slow, or how something new “should” be better. As the resident code elder, I often get to reply “why don’t you measure and find out?”…
For some reason it seems nobody uses sampling profilers anymore. A sampling profiler tells you exactly where the time is spent; you don’t have to guess from a trace. Optimizing code is kind of a lost art.
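To make the point concrete, here is a minimal sketch of what a sampling profiler does, written as a self-contained (POSIX-only) Python program; the function names `hot` and `cold` are illustrative, not from the thread. A timer interrupts the program at regular CPU-time intervals and records which function was executing, so the time-dominant function simply accumulates the most samples:

```python
import collections
import signal

samples = collections.Counter()

def _sample(signum, frame):
    # On each timer tick, record the function at the top of the stack.
    samples[frame.f_code.co_name] += 1

def hot():
    # Burns most of the CPU time.
    total = 0
    for i in range(200_000):
        total += i * i
    return total

def cold():
    # Comparatively cheap.
    return sum(range(1_000))

signal.signal(signal.SIGPROF, _sample)
signal.setitimer(signal.ITIMER_PROF, 0.001, 0.001)  # sample every ~1 ms of CPU time
for _ in range(50):
    hot()
    cold()
signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop sampling

# 'hot' should dominate the sample counts.
print(samples.most_common(3))
```

Real sampling profilers (perf, py-spy, and friends) work on the same principle, just with whole stack traces and no code changes required.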
Sometimes I wish compilers were better at data cache optimization. I work with C++ all the time, but all of it needs to be done by hand (for example, choosing SoA vs. AoS…). On the upside, I kind of have good job security, at least until I retire.
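For readers unfamiliar with the SoA-vs-AoS distinction the commenter mentions, here is a language-neutral sketch of the two layouts (illustrative particle fields, not from the thread; the cache-line effects the commenter cares about show up in compiled languages like C++, where each record’s fields really are adjacent in memory):

```python
N = 4

# Array-of-Structs (AoS): each particle's fields are stored together.
particles_aos = [{"x": float(i), "y": 0.0, "vx": 1.0, "vy": 0.0} for i in range(N)]

# Struct-of-Arrays (SoA): each field is stored contiguously across all particles.
particles_soa = {
    "x": [float(i) for i in range(N)],
    "y": [0.0] * N,
    "vx": [1.0] * N,
    "vy": [0.0] * N,
}

def advance_aos(ps, dt):
    # Walks every record even though only x and vx are needed; in a
    # compiled language, y/vy would occupy the same cache lines and
    # waste memory bandwidth.
    for p in ps:
        p["x"] += p["vx"] * dt

def advance_soa(ps, dt):
    # Streams through two dense arrays; the unused fields stay cold.
    xs, vxs = ps["x"], ps["vx"]
    for i in range(len(xs)):
        xs[i] += vxs[i] * dt

advance_aos(particles_aos, 0.5)
advance_soa(particles_soa, 0.5)
print(particles_aos[2]["x"], particles_soa["x"][2])  # both layouts agree: 2.5 2.5
```

The two layouts compute the same thing; the difference is purely which bytes end up next to each other, which is exactly the kind of transformation compilers generally won’t do for you.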
I hate it when coworkers tell me we should do something in a particular way because it’s “better”. We try the thing and there’s no measurable difference. Well, it was a good idea in their mind, so it must be an improvement. Therefore, they insist it should be kept, even if it makes the code extra convoluted for no reason at all.
And yes, profiling is great. Often it is a surprise where most of the time is spent. Today there are few excuses not to profile, since most IDEs have a good enough profiler included.
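The “measure and find out” advice running through these comments can be sketched with Python’s standard-library `timeit`; the two functions are hypothetical stand-ins for any pair of allegedly “better” and “worse” implementations:

```python
import timeit

# Two equivalent ways to build a list of squares; intuition alone
# can't tell you which is faster on your machine -- measure it.
def with_loop(n):
    out = []
    for i in range(n):
        out.append(i * i)
    return out

def with_comprehension(n):
    return [i * i for i in range(n)]

assert with_loop(1000) == with_comprehension(1000)  # same result either way

t_loop = timeit.timeit(lambda: with_loop(1000), number=2000)
t_comp = timeit.timeit(lambda: with_comprehension(1000), number=2000)
print(f"loop: {t_loop:.3f}s  comprehension: {t_comp:.3f}s")
```

If the measured difference is within noise, the rule of thumb says it wasn’t an optimization, and the simpler version should win.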
Now, because this article got a little long, as per a friend’s suggestion, here’s a table of contents:
Optimization gives us optimal programs
Branch weights and the CPU’s branch predictor
-O3 produces much faster code than -O2
JavaScript interpreters JIT at runtime because they don’t know which paths are hot ahead of time
If you have a compiler, you don’t need an interpreter
The middle-end is target/platform-independent
The compiler optimizes for data locality
-O0 gives you fast compilation
Templates are slow to compile
Separate compilation is always worth it
Why does link-time optimization (LTO) happen at link-time?
Inlining is useful primarily because it eliminates a call instruction
The role of the inline keyword
The best production compiler to study is LLVM
Undefined behavior only enables optimizations
The compiler can “simply” define undefined behavior
A 99% correctness rate is ok
- The compiler hates you
- The compiler sees nothing wrong with your code and is just giving error messages to look busy
- The compiler was written by a maniac with a semicolon fixation
- The compiler could optimize your code, but it’s just not feeling it today
- The compiler wonders why you can’t match braces right like your brother does
- The compiler had a torrid affair with a Perl interpreter in 2006
- The compiler likes big butts and cannot lie
- The compiler wants to grow up to be an IDE but hasn’t told its parents they need to send it to GUI school yet
- The compiler reads Nazis on Twitter but only to make fun of them
- The compiler works for the Servants of Cthulhu