Contrarian: build a dumb, rule‑based baseline for any feature before you call an LLM. If generated code doesn’t measurably beat that baseline on correctness, perf, or size, it’s not progress — it’s vendor noise. Measure wins, don’t assume them.