I had a flaky test in a PR and asked Copilot for advice. It suggested a change that made the test "pass" by silently ignoring an exception, without fixing the root cause.
I then dug into it myself for about 10 minutes and found a nasty race condition in my code. That was the root cause, and it was easy to fix properly.
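
A hypothetical Python sketch of the difference (not the actual code from the PR; `start_worker`, `check_masked`, and `check_fixed` are made-up names, and the real race may have looked quite different):

```python
import threading
import time


def start_worker(results):
    """Hypothetical background job that publishes its result a bit later."""
    def work():
        time.sleep(0.05)           # simulate slow work
        results["answer"] = 42

    thread = threading.Thread(target=work)
    thread.start()
    return thread


def check_masked():
    """The 'make it pass' style: swallow the exception the race causes."""
    results = {}
    thread = start_worker(results)
    try:
        value = results["answer"]  # races with the worker thread
    except KeyError:
        value = None               # silently ignored, so nothing is verified
    thread.join()
    return value


def check_fixed():
    """The root-cause fix: wait for the worker before reading its result."""
    results = {}
    thread = start_worker(results)
    thread.join()                  # synchronize, so the read cannot race
    return results["answer"]
```

Both versions stop the flakiness, but only `check_fixed` actually checks anything: `check_masked` never raises, whether or not the worker finished in time.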
It worries me when people use these tools without *understanding* ("but I'm looking over the AI's shoulder!" -> yeah, right).