Hey we solved software development, no need to learn programming anymore (Claude's source code leak)

https://lemmy.world/post/45282644

So the innovation in Claude was to write 95% of the prompt for the user and make you use like 10k tokens

The problem is that words don’t have meaning in the genAI field. Everything is an agent now. So it’s difficult and confusing to compare strategies and performance.

Claude Code is a pretty solid harness. And a harness is indeed just prompts and tools.

✨agent✨

Sort of like how everything is an “app” now.

Just write good code. It’s as simple as that, right?
“make no mistakes”
I’ve literally seen someone include “Don’t hallucinate” in an agent’s instructions.
Asking Claude to not hallucinate is like telling a person to not breathe. It’s gonna happen, and happen consistently.
I think the important bit to understand here is that LLMs are never not hallucinating. They just sometimes happen to hallucinate something correct.
This fact of how LLMs work is not at all widespread enough IMO.
“Include no bugs”
That’s a what if, just in case it gains sentience. Gotta make sure we get good code even as it enslaves or extinguishes us.
They are spending thousands of dollars in tokens and writing the most complicated prompts in order to avoid writing good specifications.

That may actually work a little?

I mean, it scraped the entirety of StackOverflow. If someone answered with insecure code, it’s statistically likely people mentioned it in the replies meaning the token “This is insecure” (or similar) should be close to (known!!) insecure code.

I was part of that OWASP Application Security Verification Standard compliance at my work. At a high level, you choose a compliance level that is suitable for the environment you expect your app to be deployed in, and then there’s a hundred pages of ‘boxes to tick’.

Some of them are literal ‘boxes to tick’ - do you do logging in the prescribed way? - but a lot of it is:

  • do you follow the standard industry protocols for doing this thing?
  • can you prove that you do so, and have protocols in place to keep it that way?

Not many of them are difficult, but there’s a lot of them. I’d say that’s typical of security hardening; the difficulty is in the number of things to keep track of, not really any individual thing.

As regards the ‘have you used this thing in the correct, secure way?’, I’d point my finger at something like Bouncy Castle as a troublemaker, although it’s far from alone. It’s the de facto standard Java crypto library, so you’d think there would be a lot of examples showing the correct way to use it, making sure that you’re aware of any gotchas? Hah hah, fat chance. Stack Overflow has a lot of examples; a lot of them are bad, and a lot of them might have been okay once but are very outdated now. I would prefer one absolutely correct example to a hundred examples argued over by people who don’t necessarily know any better. It’s easy to be ‘convincing but wrong’, and LLMs are especially bad in that case. So ‘ticking the box’ to say that you’re using it correctly is extremely difficult.

I see the Claude prompt is ‘OWASP top 10’, not ‘the full OWASP compliance doc’, which would probably set all your tokens on fire. But it’s what’s needed - the most slender crack in security can be enough to render everything useless.

Writing all these prompts almost seems like a more time-consuming thing than actually programming the software.
Absolutely true, but executives kind of understand prompts whereas they don’t understand programming at all.
I would wager quite a lot that less than one out of every ten executives could properly explain what an SQL injection is, or even know the term at all.
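For anyone in that nine-out-of-ten: the classic version is a single line of string formatting. A minimal sqlite3 sketch (the table and login functions are made up purely for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Vulnerable: user input is pasted straight into the SQL text,
# so input can rewrite the query itself.
def login_vulnerable(name: str, password: str) -> bool:
    query = f"SELECT 1 FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(query).fetchone() is not None

# Safe: placeholders make the driver treat input as data, never as SQL.
def login_safe(name: str, password: str) -> bool:
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None

print(login_vulnerable("alice", "' OR '1'='1"))  # True: logged in with no password
print(login_safe("alice", "' OR '1'='1"))        # False: treated as a literal string
```

The injected string turns the vulnerable query into `... password = '' OR '1'='1'`, which is true for every row; the parameterized version just checks for a user whose password literally is `' OR '1'='1`.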

Relevant XKCD: www.xkcd.com/424/

Security Holes

xkcd
Programming is the use of logic and reasoning. There will always be a use for that. Even without tech.
“Claude, add to this prompt all the instructions necessary to stop you from making mistakes or writing insecure code”

So I don’t know if all the other replies are pretending to be stupid, but the shown prompt is not stupid.

If you include stuff like that section in your prompt, then it has been shown that the AI will be more likely to output secure code. Hence of course the section should be included in the prompt.

If it looks stupid but it works, then it is not stupid.

Firstly, it can work and still be stupid.
Secondly, since the chat bot is more likely but not certain to write secure, bug-free code, it does not in fact work and is therefore, by your own reasoning, stupid.
But so is asking a chat bot for code to begin with, so there wasn’t ever really a way around that.

since the chat bot is more likely but not certain to write secure, bug-free code, it does not in fact work

Humans are not certain to write secure, bug-free code. So human code is useless, by the very same metric?

What kind of “logic” is that?

Humans understand the concepts of “writing code” and “bug fixing”. Chat bots do not understand, period.

“Don’t put in any of the Top 10 vulnerabilities. But if you put any from the 11th place and down, that’s okay, I don’t even know what those are.”

(Also, getting flashbacks from Shadiversity plugging “ugly art” and “bad anatomy” in the negative prompt as he was no doubt silently wondering why it didn’t work)

“In other news, popularity of attacks against OWASP vulnerabilities #11-20 rose sharply.”
I sort of get the need to do this, but it’s so silly to me. Reminds me of how giving Stable Diffusion negative prompts for “bad” and “low quality” would give you better results.