Mastodawn

Daniel Kulenkamp May 9

Francehelder 🇧🇷 May 6

#meme #IA #Claude

Daniel Kulenkamp Apr 19

Brandon Jones Apr 19

Parts of my job that I enjoy:
- Designing systems
- Writing elegant code
- Learning new things
- Solving interesting problems
- Sharing knowledge

Parts of my job I don't enjoy:
- Reviewing other people's code
- Debugging systems I'm not familiar with
- Blindly using poorly documented interfaces
- Dependency hell
- Being forced into the latest fad

AI enthusiasts: "What if I told you I've solved one of those lists for you?" 😏

Daniel Kulenkamp Apr 17

David Chisnall (*Now with 50% more sarcasm!*) Apr 17

A few notes about the massive hype surrounding Claude Mythos:

The old hype strategy of 'we made a thing and it's too dangerous to release' has been done since GPT-2. Anyone who still falls for it should not be trusted to have sensible opinions on any subject.

Even their public (cherry picked to look impressive) numbers for the cost per vulnerability are high. The problem with static analysis of any kind is that the false positive rates are high. Dynamic analysis can be sound but not complete, static analysis can be complete but not sound. That's the tradeoff. Coverity is free for open source projects and finds large numbers of things that might be bugs, including a lot that really are. Very few projects have the resources to triage all of these. If the money spent on Mythos had been invested in triaging the reports from existing tools, it would have done a lot more good for the ecosystem.

I recently received a 'comprehensive code audit' on one of my projects from an Anthropic user. Of the top ten bugs it reported, only one was important to fix (and should have been caught in code review, but was 15-year-old code from back when I was the only contributor and so there was no code review). Of the rest, a small number were technically bugs but were almost impossible to trigger (even deliberately). Half were false positives and two were not bugs and came with proposed 'fixes' that would have introduced performance regressions on performance-critical paths. But all of them looked plausible. And, unless you understood the environment in which the code runs and the things for which it's optimised very well, I can well imagine you'd just deploy those 'fixes' and wonder why performance was worse. Possibly Mythos is orders of magnitude better, but I doubt it.

This mirrors what we've seen with the public Mythos disclosures. One, for example, was complaining about a missing bounds check, yet every caller of the function did the bounds check and so introducing it just cost performance and didn't fix a bug. And, once again, remember that this is from the cherry-picked list that Anthropic chose to make their tool look good.

I don't doubt that LLMs can find some bugs other tools don't find, but that isn't new in the industry. Coverity, when it launched, found a lot of bugs nothing else found. When fuzzing became cheap and easy, it found a load of bugs. Valgrind and address sanitiser both caused spikes in bug discovery when they were released and deployed for the first time.

The one thing where Mythos is better than existing static analysers is that it can (if you burn enough money) generate test cases that trigger the bug. This is possible and cheaper with guided fuzzing but no one does it because burning 10% of the money that Mythos would cost is too expensive for most projects.

The source code for Claude Code was leaked a couple of weeks ago. It is staggeringly bad. I have never seen such low-quality code in production before. It contained things I'd have failed a first-year undergrad for writing. And, apparently, most of this is written with Claude Code itself.

But the most relevant part is that it contained three critical command-injection vulnerabilities.

These are the kind of things that static analysis should be catching. And, apparently at least one of the following is true:

- Mythos didn't catch them.
- Mythos doesn't work well enough for Anthropic to bother using it on their own code.
- Mythos did catch them but the false-positive rate is so high that no one was able to find the important bugs in the flood of useless ones.

TL;DR: If you're willing to spend half as much money as Mythos costs to operate, you can probably do a lot better with existing tools.

Anthropic Claude Code Leak Reveals Critical Command Injection Vulnerabilities

Anthropic's Claude Code CLI contains three critical command injection vulnerabilities that allow attackers to execute arbitrary code and exfiltrate cloud credentials via environment variables, file paths, and authentication helpers. These flaws bypass the tool's internal sandbox and are particularly dangerous in CI/CD environments where trust dialogs are disabled.

BeyondMachines
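For readers unfamiliar with the bug class named in the summary above, here is a minimal sketch of command injection in general. It is not taken from Claude Code: the function names and the `hostile` value are hypothetical, and it assumes a POSIX system with `echo` available. The vulnerable pattern splices an untrusted value (for example, one read from an environment variable or a file path) into a shell command line; the safer pattern skips the shell and passes the value as a single argument.

```python
import subprocess


def run_unsafe(value: str) -> str:
    # Vulnerable pattern: the value is spliced into a shell command line,
    # so metacharacters such as ';' are interpreted by the shell.
    result = subprocess.run(
        "echo " + value, shell=True, capture_output=True, text=True
    )
    return result.stdout


def run_safe(value: str) -> str:
    # Safer pattern: no shell is involved; the value is one argv entry
    # and is never parsed as shell syntax.
    result = subprocess.run(["echo", value], capture_output=True, text=True)
    return result.stdout


# Hypothetical attacker-controlled input.
hostile = "hello; echo INJECTED"

unsafe_out = run_unsafe(hostile)  # the shell runs a second command
safe_out = run_safe(hostile)      # the whole string is echoed literally
```

In the unsafe call the shell treats everything after the `;` as a second command, which is the mechanism that lets an attacker run arbitrary code. In the safe call the same string is only ever data.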

Daniel Kulenkamp Apr 13

Ben Zanin Apr 12

If all you do in your tech career is:

1. When something is slow, you look carefully at the output of a profiler or a query plan & make measured suggestions about what to improve;

2. When something breaks badly, you gently but insistently ask what & why until you truly know, then the next time similar work is needed you bring up how to avoid doing what broke last time; and

3. When someone lacks info, you make them feel good for learning instead of bad for not knowing;

You will do good work.

Daniel Kulenkamp Apr 13

shac ron ₪ Apr 10

Good intentions: Forcing code into libraries to cleanly separate layers.
Reality: Making libraries circularly-dependent because layers are difficult.

Daniel Kulenkamp Jul 31, 2023

Manuel Correia Jul 6, 2023

"Piracy can't be stealing if paying for it isn't owning"

This is increasingly how it feels, when things you "own" digitally become inaccessible, and when shows which are locked in streaming services can get delisted on a whim, disappearing forever.

Thank you @Illuminatus for this quote.

[Edit: In case this wasn't clear, this is about entertainment and digital ownership, not physical goods]

Daniel Kulenkamp Jul 30, 2023

Thom Holwerda Jun 27, 2023

When you listen to The Verge's podcast and they're legitimately saying Mastodon is a "no girls allowed" club? And we don't want Facebook here because Instagram will bring women here?!

What the fuck is wrong with these American tech pundits? Mastodon is the gayest, transest, most feminine social network I've ever seen. There are so many more outspoken, smart, odd, and downright weird women here than I've ever seen on any other social network. It's fucking great.

Daniel Kulenkamp Jul 29, 2023

I love when I go to put a new book on my to read list, and it’s already there 🥰

It usually means I should read that book next!

#bookstodon #books

Daniel Kulenkamp Jul 29, 2023


Zhi Zhu 🕸️ Jan 11, 2023

The Paradox of Tolerance disappears if you look at tolerance, not as a moral standard, but as a social contract.

If someone does not abide by the terms of the contract, then they are not covered by it.

In other words: The intolerant are not following the rules of the social contract of mutual tolerance.

Since they have broken the terms of the contract, they are no longer covered by the contract, and their intolerance should NOT be tolerated.

[inspired by “Tolerance is not a moral precept”]


Daniel Kulenkamp Jul 29, 2023

Haunted Mansion was enjoyable enough, but nothing special in my opinion. I thought the ride tie-ins were clever, but the jokes didn’t always land, and the plot was mostly predictable.

I know it’s a Disney movie but I would have upped the scariness and tried to do something more unexpected with the plot.

That being said, I’ll take Tiffany Haddish and Jamie Lee Curtis as psychics any day!

#HauntedMansion #MiniMovieReview