My experience with using #codex as a code audit tool is mixed.
1. In high / max reasoning, it can find logic / math / design flaws, but
2. it cannot assess a design as a whole. Its suggestions will break existing logic.
I can imagine the future of code design will be more atomic, with many small functions and a human connecting them into an acyclic flow graph.
That makes it more auditable, and you can reuse nodes / atomic functions rather than writing custom functions per implementation.
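
A rough sketch of what I mean, assuming Python and the standard library's `graphlib` (3.9+). The node functions, `FLOW`, and `run_flow` are all made-up names for illustration, not an existing framework:

```python
from graphlib import TopologicalSorter

# Hypothetical atomic functions: each does one small, auditable thing.
# Every node receives a dict of its upstream results, keyed by node name.
def load(_inputs):
    return [3, 1, 2]

def sort_values(inputs):
    return sorted(inputs["load"])

def total(inputs):
    return sum(inputs["load"])

def report(inputs):
    return {"sorted": inputs["sort_values"], "total": inputs["total"]}

# The human-authored design: node name -> (function, upstream node names).
FLOW = {
    "load": (load, []),
    "sort_values": (sort_values, ["load"]),
    "total": (total, ["load"]),
    "report": (report, ["sort_values", "total"]),
}

def run_flow(flow):
    """Run every node in dependency order.

    Raises graphlib.CycleError if the wiring is not acyclic.
    """
    graph = {name: deps for name, (_, deps) in flow.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = flow[name]
        results[name] = func({dep: results[dep] for dep in deps})
    return results

results = run_flow(FLOW)
print(results["report"])  # {'sorted': [1, 2, 3], 'total': 6}
```

The wiring lives in one small table a human can review, and each node can be audited or reused on its own, e.g. `total` could feed a different `report` in another flow without being rewritten.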