The quality of code in general is so-so, and that's what LLMs are trained on, so it's hardly a surprise that before long the output is a few thousand lines of refactoring for you to deal with.
Which I guess makes sense: writing 10x the code means reviewing 10x the code.