Marcel Böhme Aug 27, 2024

Recent paper by MPI-SoftSec PhD student Niklas Risse. Feedback welcome!
📝 https://arxiv.org/abs/2408.12986
🧑‍💻 https://github.com/niklasrisse/TopScoreWrongExam

According to our survey of the machine learning for vulnerability detection (ML4VD) literature published in the top Software Engineering conferences, every paper in the past 5 years defines ML4VD as a binary classification problem:

Given a function, does it contain a security flaw?

In this paper, we ask whether this decision can really be made without further context, and we study both vulnerable and non-vulnerable functions in the most popular ML4VD datasets. A function is vulnerable if it was involved in a patch of an actual security flaw and confirmed to cause the vulnerability. It is non-vulnerable otherwise. We find that in almost all cases this decision cannot be made without further context. Vulnerable functions are often vulnerable only because a corresponding vulnerability-inducing calling context exists, while non-vulnerable functions would often be vulnerable if a corresponding context existed.
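To make the context dependence concrete, here is a small hypothetical illustration in Python. It is not taken from the paper or its datasets, and every name in it (`resolve_path`, `safe_caller`, `unsafe_caller`, `escapes_base`, `BASE_DIR`) is made up for this sketch. The function `resolve_path` is identical in both scenarios; whether it takes part in a path traversal depends only on who calls it.

```python
import posixpath

# Hypothetical upload directory, assumed for illustration only.
BASE_DIR = "/srv/app/uploads"


def resolve_path(name: str) -> str:
    """Join a file name onto BASE_DIR. Performs no validation itself."""
    return posixpath.normpath(posixpath.join(BASE_DIR, name))


def safe_caller(name: str) -> str:
    """Calling context that rejects separators and parent references first."""
    if "/" in name or name in ("", ".", ".."):
        raise ValueError("invalid file name")
    return resolve_path(name)


def unsafe_caller(user_input: str) -> str:
    """Calling context that forwards attacker-controlled input unchanged."""
    return resolve_path(user_input)


def escapes_base(path: str) -> bool:
    """True if the resolved path lies outside BASE_DIR."""
    return not (path == BASE_DIR or path.startswith(BASE_DIR + "/"))
```

Looking at `resolve_path` alone, a classifier has no way to tell which of the two callers exists in the program, so a function-level label of "vulnerable" or "non-vulnerable" is arguably underdetermined in a case like this.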

But why do ML4VD techniques perform so well even though there is demonstrably not enough information in these samples? Spurious correlations: We find that high accuracy can be achieved even when only word counts are available. This shows that these datasets can be exploited to achieve high accuracy without actually detecting any security vulnerabilities.
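As a rough sketch of how a word-count model can score well without reasoning about vulnerabilities, consider the toy example below. It is not the experiment or the classifier from the paper: the four training samples are synthetic, the multinomial naive Bayes model was picked only because it needs no external libraries, and `AVPacket` stands in for any project-specific token that happens to co-occur with the "vulnerable" label.

```python
import math
import re
from collections import Counter


def word_counts(code: str) -> Counter:
    """Bag of words: identifier-like tokens only; order and structure are discarded."""
    return Counter(re.findall(r"[A-Za-z_]\w*", code))


class WordCountNB:
    """Multinomial naive Bayes over token counts with add-one smoothing."""

    def fit(self, samples, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.token_totals = {c: Counter() for c in self.classes}
        for code, label in zip(samples, labels):
            self.token_totals[label].update(word_counts(code))
        self.vocab = set().union(*self.token_totals.values())
        return self

    def predict(self, code):
        counts = word_counts(code)
        n = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.token_totals[c].values())
            score = math.log(self.class_counts[c] / n)
            for tok, k in counts.items():
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                p = (self.token_totals[c][tok] + 1) / (total + len(self.vocab))
                score += k * math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


# Synthetic toy data (label 1 = "vulnerable", 0 = "non-vulnerable").
TRAIN = [
    ("void decode(AVPacket *pkt, char *buf) { memcpy(buf, pkt->data, pkt->size); }", 1),
    ("int parse(AVPacket *pkt) { return pkt->data[pkt->size]; }", 1),
    ("int add(int a, int b) { return a + b; }", 0),
    ("void reset(state *s) { s->ready = 0; }", 0),
]

model = WordCountNB().fit([code for code, _ in TRAIN], [label for _, label in TRAIN])

# Harmless function that merely mentions the project-specific tokens.
harmless = "int packet_id(AVPacket *pkt) { return 7; }"
# Write that is out of bounds whenever the caller passes n equal to the array length.
risky = "int fill(int *a, int n) { a[n] = 0; return n; }"

print(model.predict(harmless), model.predict(risky))
```

On this toy data the model flags the harmless function and passes the risky one, because it has learned only which tokens accompany each label. Real datasets are far larger, but the same kind of shortcut is what the word-count result above points to.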

We conclude that the current problem statement of ML4VD is ill-defined and call into question the internal validity of this growing body of work. Constructively, we call for more effective benchmarking methodologies to evaluate the true capabilities of ML4VD, propose alternative problem statements, and examine broader implications for the evaluation of machine learning and program analysis research.

Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection

According to our survey of machine learning for vulnerability detection (ML4VD), 9 in every 10 papers published in the past five years define ML4VD as a function-level binary classification problem: Given a function, does it contain a security flaw? From our experience as security researchers, faced with deciding whether a given function makes the program vulnerable to attacks, we would often first want to understand the context in which this function is called. In this paper, we study how often this decision can really be made without further context and study both vulnerable and non-vulnerable functions in the most popular ML4VD datasets. We call a function "vulnerable" if it was involved in a patch of an actual security flaw and confirmed to cause the program's vulnerability. It is "non-vulnerable" otherwise. We find that in almost all cases this decision cannot be made without further context. Vulnerable functions are often vulnerable only because a corresponding vulnerability-inducing calling context exists, while non-vulnerable functions would often be vulnerable if a corresponding context existed. But why do ML4VD techniques achieve high scores even though there is demonstrably not enough information in these samples? Spurious correlations: We find that high scores can be achieved even when only word counts are available. This shows that these datasets can be exploited to achieve high scores without actually detecting any security vulnerabilities. We conclude that the prevailing problem statement of ML4VD is ill-defined and call into question the internal validity of this growing body of work. Constructively, we call for more effective benchmarking methodologies to evaluate the true capabilities of ML4VD, propose alternative problem statements, and examine broader implications for the evaluation of machine learning and program analysis research.

arXiv.org

ᴮᵉⁿ ᴿᵒʸᶜᵉ VOTE IN THE PRIMARIES Nov 23

As an American I'd like to imagine an alternative reality

Where I lived in a large #democracy in the Western hemisphere

That suffered a lying douchebag seizing power via a cult of bigots and ignorants

Who tried to claim #election hoax, tried to commit violence

But was convicted

Instead of seizing power again because of the fecklessness of his country's institutions

Oh wait

That's #Brazil #Brasil

"#Bolsonaro arrested to prevent ‘attempted escape,’ court says"

https://edition.cnn.com/2025/11/22/americas/brazil-jair-bolsonaro-arrested-intl

Brazil’s ex-president Bolsonaro arrested to prevent ‘attempted escape,’ court says

Former Brazilian President Jair Bolsonaro was arrested on Saturday by police to prevent a possible “attempted escape,” according to Brazil’s Supreme Court, days before he was due to begin a prison sentence for leading a coup attempt.

CNN