A few notes about the massive hype surrounding Claude Mythos:

The old hype strategy of 'we made a thing and it's too dangerous to release' has been done since GPT-2. Anyone who still falls for it should not be trusted to have sensible opinions on any subject.

Even their public (cherry-picked to look impressive) numbers for the cost per vulnerability are high. The problem with static analysis of any kind is that the false-positive rates are high. Dynamic analysis can be sound but not complete; static analysis can be complete but not sound. That's the tradeoff. Coverity is free for open source projects and finds large numbers of things that might be bugs, including a lot that really are. Very few projects have the resources to triage all of these. If the money spent on Mythos had been invested in triaging the reports from existing tools, it would have done a lot more good for the ecosystem.

I recently received a 'comprehensive code audit' on one of my projects from an Anthropic user. Of the top ten bugs it reported, only one was important to fix (and should have been caught in code review, but was 15-year-old code from back when I was the only contributor and so there was no code review). Of the rest, a small number were technically bugs but were almost impossible to trigger (even deliberately). Half were false positives and two were not bugs and came with proposed 'fixes' that would have introduced performance regressions on performance-critical paths. But all of them looked plausible. And, unless you understood the environment in which the code runs and the things for which it's optimised very well, I can well imagine you'd just deploy those 'fixes' and wonder why performance was worse. Possibly Mythos is orders of magnitude better, but I doubt it.

This mirrors what we've seen with the public Mythos disclosures. One, for example, was complaining about a missing bounds check, yet every caller of the function did the bounds check and so introducing it just cost performance and didn't fix a bug. And, once again, remember that this is from the cherry-picked list that Anthropic chose to make their tool look good.
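As a rough illustration of that pattern (a hypothetical sketch, not the code from the disclosure; `read_u8_unchecked` and `parse_header` are made-up names, and Python stands in for whatever language the real code uses):

```python
from typing import Tuple


def read_u8_unchecked(buf: bytes, offset: int) -> int:
    # Hot-path helper (hypothetical): deliberately has no length check of
    # its own and relies on the caller having validated the offset.
    return buf[offset]


def parse_header(buf: bytes) -> Tuple[int, int]:
    # The caller validates the length once, up front.
    if len(buf) < 2:
        raise ValueError("header too short")
    # Both reads below are covered by the check above, so a second check
    # inside read_u8_unchecked would only repeat it on every call.
    return read_u8_unchecked(buf, 0), read_u8_unchecked(buf, 1)
```

As long as every caller validates first, a tool that looks at `read_u8_unchecked` in isolation sees a missing check, while the program as a whole has no reachable out-of-bounds read.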

I don't doubt that LLMs can find some bugs other tools don't find, but that isn't new in the industry. Coverity, when it launched, found a lot of bugs nothing else found. When fuzzing became cheap and easy, it found a load of bugs. Valgrind and address sanitiser both caused spikes in bug discovery when they were released and deployed for the first time.

The one thing where Mythos is better than existing static analysers is that it can (if you burn enough money) generate test cases that trigger the bug. This is possible and cheaper with guided fuzzing but no one does it because burning 10% of the money that Mythos would cost is too expensive for most projects.

The source code for Claude Code was leaked a couple of weeks ago. It is staggeringly bad. I have never seen such low-quality code in production before. It contained things I'd have failed a first-year undergrad for writing. And, apparently, most of this is written with Claude Code itself.

But the most relevant part is that it contained three critical command-injection vulnerabilities.
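For readers unfamiliar with the bug class, here is a minimal, generic sketch of command injection in Python. It is not taken from the leaked code; the function names and the payload are hypothetical, and it assumes a POSIX system where `echo` is available.

```python
import subprocess


def echo_unsafe(path: str) -> str:
    # VULNERABLE: untrusted text is pasted into a shell command line,
    # so shell metacharacters such as ';' start a second command.
    result = subprocess.run(
        f"echo {path}", shell=True, capture_output=True, text=True
    )
    return result.stdout


def echo_safe(path: str) -> str:
    # Safer: pass an argument vector and skip the shell entirely,
    # so the whole string reaches echo as one literal argument.
    result = subprocess.run(["echo", path], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical attacker-controlled value (file name, env var, etc.).
    payload = "notes.txt; echo INJECTED"
    print(echo_unsafe(payload))
    print(echo_safe(payload))
```

With this payload the unsafe version runs two commands, while the safe version prints the payload back as plain text.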

These are the kind of things that static analysis should be catching. And, apparently at least one of the following is true:

  • Mythos didn't catch them.
  • Mythos doesn't work well enough for Anthropic to bother using it on their own code.
  • Mythos did catch them but the false-positive rate is so high that no one was able to find the important bugs in the flood of useless ones.

TL;DR: If you're willing to spend half as much money as Mythos costs to operate, you can probably do a lot better with existing tools.

Anthropic Claude Code Leak Reveals Critical Command Injection Vulnerabilities

Anthropic's Claude Code CLI contains three critical command injection vulnerabilities that allow attackers to execute arbitrary code and exfiltrate cloud credentials via environment variables, file paths, and authentication helpers. These flaws bypass the tool's internal sandbox and are particularly dangerous in CI/CD environments where trust dialogs are disabled.

BeyondMachines

When you read about Bans of Social Media for Teens and Age Verification, you must remember what it truly means:

• Official identification of every adult using social media.

• Deanonymization of every account, endangering groups that often rely on pseudonymity for safety, such as victims of domestic violence, victims of stalkers, people of color, and LGBTQ+ people.

• Putting every adult at great danger of exploitation, fraud, and identity theft by forcing them to share their official ID with a for-profit third-party company with no incentive to protect it. Breaches have already happened.

• Constructing a system of mass surveillance to attach every comment on social media to a legal identity. Effectively allowing authoritarian governments to silence their critics and opposition.

• Potential for dystopian censorship and cutting off means of organization for groups resisting oppressive regimes and organizations.

• Endangering children online by putting a clear identification beacon over every child or family with children online.

• Endangering the data of children who will inevitably try to pass as adults, and have their information collected by the third-party for-profit company.

• Diminishing the value of official identification due to the inevitable data breaches, eventually pushing the system to require even more intrusive identification techniques, such as iris scans and fingerprints.

• Installing a system of mass surveillance capable of attaching even more information to everyone's legal identity, with the potential to build lists of people in certain groups and scale up state censorship and discrimination in unprecedented ways.

• The list goes on and on.

This isn't about protecting the children.
It never was.

Do not be duped by this excuse used to convince you to let go of your human rights. They are only trying to manipulate people lacking information.

Stay informed on the issues related to Age Verification, and push back for your rights to privacy and democracy.

The future depends on us.

#AgeVerification #Privacy #HumanRights #MassSurveillance #Authoritarianism

Breakglass is producing so many many many detailed daily threat intel reports... just wow

Are those AI-written, or did they forget that their threat intel portal is exposed online for free? 🤣

https://intel.breakglass.tech/

Breakglass Intelligence

Malware analysis, APT campaigns, detections, and IOCs from Breakglass

The Trap of "Vibe Coding" and the Rise of Engineering as a Service. The Ultimate Vendor Lock-In. https://hackernoon.com/the-trap-of-vibe-coding-and-the-rise-of-engineering-as-a-service #ai

Small models also found the vulnerabilities that Mythos found

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier

#ai

AI Cybersecurity After Mythos: The Jagged Frontier

Why the moat is the system, not the model

AISLE

If you're a security professional thinking about building something you have no business building: do it. Just bring your engineering brain along for the ride 🙂

https://accidental-ciso.alevsk.dev/

Happy hacking 💻 🏴‍☠️

Calling all Astro.js developers.

My rewilding charity @ProtectEarthUK has been trying to replace a horrible Squarespace site with an API-powered data-centric website.

We urgently need a few volunteers to get this over the line. Grab an issue if you can so we can reach enough parity to kill the old site, then we can work on amazing things.

Get your boss to let you do this as a charity day instead of coming out and planting trees with me.

https://github.com/protect-earth/website

Call for Testing: Laptop Integration Testing Project

We’re expanding the Laptop Support and Usability Project and inviting the community to help test FreeBSD on real hardware.

-Which laptop works best with FreeBSD?
-Will my current laptop support the features I need?
-What configuration tweaks might be required?

Testing is automated, anonymized, and straightforward, and your feedback helps improve FreeBSD for everyone.
Learn how to participate:
https://freebsdfoundation.org/blog/call-for-testing-introducing-the-laptop-integration-testing-project/
#FreeBSD #OpenSource