Having fun poking at the IR peephole optimizer. As of this afternoon's changes, the peephole optimizer and dead instruction elimination together cut the IR instruction count in md5_step() by 29%.
Here's a basic block from that function before and after this stage of optimization. Right now the optimizations are mostly helping flatten array/struct address computation.
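
For anyone who hasn't played with this kind of pass, here is a rough sketch of how flattening an address computation and then sweeping up the leftovers can work. It is not the actual optimizer: the `Instr` tuple, the op names, and both helper functions are made up for illustration, and it assumes each virtual register is assigned only once within the block.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical toy IR instruction: dest = op(args...)."""
    dest: Optional[str]   # virtual register written, or None (e.g. for a store)
    op: str
    args: tuple           # register names (str) or integer constants (int)


def fold_add_chains(block):
    """Peephole rule: rewrite t2 = add(t1, c2), where t1 = add(x, c1),
    into t2 = add(x, c1 + c2). Assumes single assignment per register."""
    defs = {}
    out = []
    for ins in block:
        if ins.op == "add" and isinstance(ins.args[1], int):
            src = defs.get(ins.args[0])
            if src is not None and src.op == "add" and isinstance(src.args[1], int):
                ins = Instr(ins.dest, "add", (src.args[0], src.args[1] + ins.args[1]))
        if ins.dest is not None:
            defs[ins.dest] = ins
        out.append(ins)
    return out


SIDE_EFFECTS = {"store"}  # ops that must be kept even if their result is unused


def eliminate_dead(block, live_out):
    """Walk the block backwards, keeping only instructions whose result is
    still needed or that have side effects."""
    live = set(live_out)
    kept = []
    for ins in reversed(block):
        if ins.op in SIDE_EFFECTS or ins.dest in live:
            live.discard(ins.dest)
            live.update(a for a in ins.args if isinstance(a, str))
            kept.append(ins)
    kept.reverse()
    return kept


# Loading a field at offset 8 inside an array element at offset 16 from p.
block = [
    Instr("t1", "add", ("p", 16)),
    Instr("t2", "add", ("t1", 8)),
    Instr("t3", "load", ("t2",)),
]
optimized = eliminate_dead(fold_add_chains(block), live_out={"t3"})
for ins in optimized:
    print(ins)
```

In this toy case the two adds collapse into a single `add p, 24`, the first add is left with no users, and dead instruction elimination drops it, so three instructions become two. The peephole rule on its own doesn't shrink anything; the win only shows up once the dead code is swept.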
