My experience with using #codex as a code audit tool is mixed.

1. In high / max reasoning mode, it can find logic / math / design flaws, but

2. it cannot assess a design as a whole. Its suggestions will break existing logic.

I can imagine the future of code design will be more atomic, with many small functions and a human connecting them into an acyclic flow graph.

That makes the code more auditable, and you can reuse nodes / atomic functions rather than writing custom functions per implementation.
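
A minimal sketch of what such a design could look like in Python, under my own assumptions: the function names (`parse`, `total`, `count`, `mean`), the `GRAPH` layout, and the `run` helper are all hypothetical and only illustrate the idea. Each node is a small function that can be audited in isolation, and the human-written part is just the wiring. The standard library's `graphlib.TopologicalSorter` (Python 3.9+) orders the nodes and rejects a graph that contains a cycle.

```python
from graphlib import TopologicalSorter

# Hypothetical atomic functions: small, pure, and reviewable one at a time.
def parse(raw: str) -> list[int]:
    return [int(x) for x in raw.split(",")]

def total(values: list[int]) -> int:
    return sum(values)

def count(values: list[int]) -> int:
    return len(values)

def mean(sum_: int, n: int) -> float:
    return sum_ / n

# The human-authored design: node name -> (function, upstream node names).
# "raw" is an external input, so it has no entry of its own.
GRAPH = {
    "parse": (parse, ["raw"]),
    "total": (total, ["parse"]),
    "count": (count, ["parse"]),
    "mean": (mean, ["total", "count"]),
}

def run(graph, inputs):
    """Evaluate every node once, in dependency order."""
    deps = {name: set(upstream) for name, (_, upstream) in graph.items()}
    results = dict(inputs)
    # static_order() raises graphlib.CycleError if the wiring is not acyclic.
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            continue  # external input, already provided
        fn, upstream = graph[name]
        results[name] = fn(*(results[u] for u in upstream))
    return results

results = run(GRAPH, {"raw": "2,4,6"})
```

In this sketch an audit only has to cover each small function plus the `GRAPH` table, and a node such as `total` can be reused in another graph without modification.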