Mastodawn

Researchers from Stanford University have found that programmers who accept help from AI tools like GitHub Copilot produce less secure code than those who fly solo.

I’m sure you’d find the same thing of developers who blindly cut & paste from Stack Overflow.

The root issue is the same: people who don’t understand the code they got from someone else write less secure code. It doesn’t matter whether it came from an AI or a Stack Overflow poster.

https://www.theregister.com/2022/12/21/ai_assistants_bad_code/

Study finds AI assistants help developers produce code that's more likely to be buggy

At the same time, tools like Github Copilot and Facebook InCoder make developers believe their code is sound

The Register

Serge Egelman Dec 26, 2022

@carnage4life Such a study was done on developers copying from Stack Overflow! (And they found exactly what you would expect.)
https://ieeexplore.ieee.org/document/7958574

Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security

Online programming discussion platforms such as Stack Overflow serve as a rich source of information for software developers. Available information include vibrant discussions and oftentimes ready-to-use code snippets. Previous research identified Stack Overflow as one of the most important information sources developers rely on. Anecdotes report that software developers copy and paste code snippets from those information sources for convenience reasons. Such behavior results in a constant flow of community-provided code snippets into production software. To date, the impact of this behaviour on code security is unknown. We answer this highly important question by quantifying the proliferation of security-related code snippets from Stack Overflow in Android applications available on Google Play. Access to the rich source of information available on Stack Overflow including ready-to-use code snippets provides huge benefits for software developers. However, when it comes to code security there are some caveats to bear in mind: Due to the complex nature of code security, it is very difficult to provide ready-to-use and secure solutions for every problem. Hence, integrating a security-related code snippet from Stack Overflow into production software requires caution and expertise. Unsurprisingly, we observed insecure code snippets being copied into Android applications millions of users install from Google Play every day. To quantitatively evaluate the extent of this observation, we scanned Stack Overflow for code snippets and evaluated their security score using a stochastic gradient descent classifier. In order to identify code reuse in Android applications, we applied state-of-the-art static analysis. Our results are alarming: 15.4% of the 1.3 million Android applications we analyzed, contained security-related code snippets from Stack Overflow. Out of these 97.9% contain at least one insecure code snippet.

Lindsay Holmwood Dec 26, 2022

@v0max @carnage4life good news from 12 months ago:

This study found 1 out of the top 3 Google results pointed to insecure Stack Overflow answers 22% of the time

BUT

Applying custom boosts based on semantic analysis of comments (the ones calling accepted answers secure/insecure) led devs to more secure results a statistically significant percentage of the time

https://dl.acm.org/doi/pdf/10.1145/3460120.3484763

(((JaneinNJ))) Dec 26, 2022

@carnage4life Total non-tech person here, but that sounds logical. I would imagine that letting AI write your code is a lot like copying/pasting someone else’s language. You might sort of understand it, but it’s not organic to your way of thinking.

Tanya Janca | SheHacksPurple Dec 26, 2022

@carnage4life 💯

RRB Dec 26, 2022

@carnage4life Alternatively, and I think this is the real problem with AI, it is mirroring its training data. Most code is crap, so you get a crap code generator.

Similar to Microsoft's Tay chatbot becoming a Nazi misogynist from Internet training data.

At best, you may be able to create tools that work as well as the bottom 50% of the workforce.

Morgan Dec 26, 2022

@rrb @carnage4life I think these are both true. AI code is crap because most code is crap, but also a lot of code is crap because people were relying on other people and not actually learning to code themselves
(I'm not someone who never asks for help or is against it ftr, I mean what OP said about using other people's code without inspecting it)

Morgan Dec 26, 2022

@carnage4life I think you're right, but it's probably worse for the first group *if* evaluating the work they got help with. People who cut and paste from SO w/o examination would do much worse on their own than people who generally code on their own, and similar or the same as those who use AI tools, but if we include the code they're actually *using,* that they got help with, the AI stuff is known to be wrong a lot. Whereas people are less likely to answer if they don't know what they're doing

spherulitic Dec 26, 2022

@carnage4life ChatGPT, your hallucinating coding buddy, has entered the room.

Joshka Dec 26, 2022

@carnage4life I was thinking about this the other day. This is just a 'today' type problem IMO. The thing that helps is to train Copilot etc. on changesets (commits etc.) that fix CVEs, with the hope that code that fixes security holes is generally more secure. Train it on refactorings that lead to more readable code and feed it into R# and other tools. Feed it commit messages and LSP knowledge and hook it up directly to your AST for some feature development props. That's exciting

Matthias Dec 26, 2022

@carnage4life The linked article has that nice example of SQL injection.
A very basic thing a developer should be aware of (along with its variations and consequences).
Because of the way the worldwide IT industry works today, with team members spread out all over, my experience is that some so-called experts no longer know how to develop correctly with frameworks: they simply use string concatenations they googled somewhere, without applying even basic test-driven thinking.
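
For readers who have not seen the pattern, here is a minimal sketch of the string-concatenation problem described above, next to the parameterized alternative. It uses Python's built-in sqlite3 module; the `students` table, the helper names, and the payload are hypothetical and are not taken from the linked article or the study.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table, used only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [("alice", 20), ("bob", 22)],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text,
    # so quotes in the input can change the structure of the query.
    query = "SELECT name, age FROM students WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL,
    # so the input is always treated as data, never as SQL syntax.
    query = "SELECT name, age FROM students WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"
    print(find_unsafe(conn, payload))  # the WHERE clause becomes always true
    print(find_safe(conn, payload))    # no student has this literal name
```

With the payload above, the concatenated version returns every row in the table, while the parameterized version returns nothing. A test that feeds a quote character into the lookup is usually enough to catch the first variant.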

WGAvanDijk Dec 26, 2022

@carnage4life
That may be true now, but maybe in 10 years it will be totally different. AIs are being developed to be better and better.

As for humans, my partner, @lizmat , often said (and I am not sure if it is her own text or that she got it from somebody else) that you should not code any program when you are at peak performance.

Because a week later, when you are at average performance, you will no longer understand your own code, it will be waaaay above your level.

Document your code!

Profoundly Nerdy Dec 27, 2022

@carnage4life This is the correct answer.