@[email protected] @[email protected] I have been critical of the KeePassXC team's decision to accept LLM-assisted submissions for a while now, and after long consideration I opted to move away from using it, security being one of my primary concerns. My view is that you're setting up a bit of a strawman in this post, so I thought I'd elaborate on my rationale in case it helps anyone else who's weighing this decision. The tl;dr is that code review should be your last line of defense, not your primary one, and that LLM use threatens to erode existing lines of defense while introducing new categories of risk. That is the opposite of what you want when developing security-oriented software.
Here's a lot more words:
To my way of thinking, the question isn't whether any given pull request is problematic. One of the KeePassXC maintainers I've interacted with seemed to suggest as much: human beings sometimes submit poor-quality pull requests too, so what's the difference, especially if the review process catches them? The important question, and the difference, lies in the culture of the development team and its process. Experience shows that security-oriented software in particular benefits greatly from a team of people dedicated to both transparency and the relentless pursuit of excellent implementations. My belief is that the use of LLMs in coding threatens both of those, degrading them over time. Transparency, because no one can know exactly what an LLM is going to produce and why, and an LLM cannot tell you anything about its output; excellent implementations, because (a) come on, have you ever looked at LLM output, especially for larger chunks of code; and (b) the only way we've ever found to produce excellent implementations of anything is to develop a well-functioning team of people and set them loose on the problem.
Peter Naur famously argued that programming is theory building, and that theories draw their power from their existence in the heads of the people who construct them. I am convinced by his argument by the simple fact that over the course of my career I've worked with large codebases written by other people, and have experienced firsthand that the only way to really understand the code is to talk to other people who understand it. No one can look at a large codebase and understand how it works, not even with the best documentation in the world--not in a reasonable amount of time, anyway. Anyone who believes otherwise hasn't picked up a non-trivial APL program and tried to figure out what it does. Anyone who believes otherwise is mistaken about the practice of software development and engineering, and probably also believes in the myth of the 10x engineer, or that women can't code as well as men.
LLMs are not people. They do not understand code. They cannot describe their thought process to you. They cannot point you to the most important functions, procedures, methods, or objects. They cannot give you hints about pitfalls you might fall into while working with their code. Any understanding like this that arises about LLM-generated code arises because human beings developed that understanding of the code and then communicated it.
LLMs are trained on masses of mediocre code. Their output has been found to include significantly more bugs and security issues than comparable human-written code. Their use has been observed to result simultaneously in reduced productivity and a belief that productivity has increased, suggesting they may induce other blind spots in one's self-awareness as well. Their use has been observed to result in de-skilling: people become less able to do things they used to do without leaning on the tool. Given all that, I do not believe for a moment that an LLM can produce an excellent implementation, nor foster a culture in which excellent implementations arise; and I believe that any excellent implementation produced by a person using an LLM is the result of the person compensating for the weaknesses and traps of LLM use, all while the tool potentially degrades their future ability to produce excellent implementations and fools them into believing they wrote better code faster when they did not.
A good review process does not compensate for any of the issues I raise here. More importantly, actual security is about layers of protection. Code review is one of the last layers. There should be many, many others, which to me include a culture that does not succumb to the temptation to put a stochastic black-box deskilling machine into the software development process. You wouldn't build a fortress with an open road leading into the center just because you had guards you could post along it ("it lets us get in and out faster, that portcullis is so slow!"). You'd have the guards, and you'd have several layers of thick walls, and you'd have a moat, and you'd have archers, and... You certainly wouldn't voluntarily pull a giant wooden horse that could contain anything into your fortress!
I suspect that a project adopting more and more LLM-assisted submissions will not obviously suffer in the near term, but over the medium to long term is likely to develop issues, originating in one or more of my above observations, that eventually lead to problems in the software. As I said to someone about KeePassXC, I am not inclined to hitch my wagon to that train. Not when it comes to a piece of software like a password manager.
And that's not even opening up the moral and ethical issues of LLMs, which are substantial. Not to mention the dangers of becoming dependent on a technology and tools that might go away or become significantly more expensive when the asset bubble currently necessary for their continued existence finally deflates.
Other people might land in a different place, but for me this is more than enough reason to switch password managers.