Ornithologist at VulnCheck
| Pronouns | He/him |
| Locale | Portland, OR |
| Timezone | PST |
This photo is going to be in the history books eventually.
#USpol
“The Sergeant at Arms blocks Representative Justin Pearson from Memphis from entering a committee meeting about redrawing the map specifically for the district he represents.
A white officer with a badge, blocking a black congressional member, from joining meeting that specifically targets the black district meant to give representation to people of color.
The segregation is alive and well in the south.” - Brian Garlick
Is it possible people are not looking at Glasswing as a project in a broader sweep of the development of the cyber reasoning system (CRS)?
I think it is tempting to have this kind of myopia because Anthropic is one developer among many. Additionally, and more interestingly, they peddle a non-public model, when in reality their major deliverables have involved harnesses that look a lot like a formal CRS.
When you contextualize the model as one product among a diverse set of fuzzers, symbolic analyzers, and generalizable harnesses, the product seems less revolutionary. But by that token it is much easier to track the progress of the state of the art and acknowledge projects like AIxCC and OSS-CRS as increasingly standardizing what was originally a box of toys.
The "how" problem is one of the most aggravating things to me with this bubble.
Since when do venture capitalists care how engineers solve problems? Since when does management? Are they going to start caring about what color our cats are too?
This is a classic failing: partly because of their skill mix, but largely because of their limited vantage point and specialization, these personnel are NOT SUITED for doing due diligence!
But Jonathan, you may protest: if they cannot discern methods, isn't this how we get fraud, as in the case of Delve?
I think that is always a risk, but it shows what the real role is for people who are gatekeepers. They need to take verification seriously. Not in the sense of formal verification, but in the sense of taking the position of diverse customers and making sure (as tradition) the use case makes sense and isn't going to sink the boat, and (as heterodox) making sure that results, when socialized into the community broadly, hold up to basic scrutiny.
Peer review is just the starting hurdle, and in an ideal world, we would be moving towards more formal methods. It's stupid that well-capitalized organizations break down at that first hurdle.
quick STS 2 thoughts:
I have a deep personal relationship with my Osty.
I don't understand why the Crabby Rathbun matplotlib thing is a story.
There is zero proof for the premise that the issue or blog post is synthetic, besides people's read of the work and its "tells" of LLM authorship.
We have no idea what operations came from the GitHub account. We know neither its runtime nor its input and output. This is not resolvable by open source alone, nor by forensic logs. It's turtles all the way down.
Likewise, there are troubling implications if we do not hold Rathbun's author liable for any of its actions (not that any of them is a crime or even a faux pas).
I do not understand the breathless attitude of tech journalists to this story. Can we please be skeptical of claims, for just one moment?