🚀 49,000 Patches!
We’ve updated our dataset (https://huggingface.co/datasets/CIRCL/vulnerability-cwe-patch) of real-world vulnerabilities, now enriched with #CWE identifiers and #patches collected from platforms like GitHub, GitLab, and Bitbucket.
This dataset is designed to support the development of tools for vulnerability classification. Each entry includes:
- #CVE / #GHSA ID
- Title of the #vulnerability
- Vulnerability description
- Patches (URL, Commit message, and Base64-encoded unified diff)
- CWE categorization
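
Since the diffs are Base64-encoded, they need decoding before use. Here is a minimal sketch of that step using only the Python standard library. The record below is made up for illustration, and the field names (`patches`, `url`, `commit_message`, `patch_text_b64`) are assumptions, so check the dataset card for the actual schema.

```python
import base64


def decode_patch(encoded_diff: str) -> str:
    """Decode a Base64-encoded unified diff into text."""
    return base64.b64decode(encoded_diff).decode("utf-8", errors="replace")


def changed_lines(diff_text: str) -> tuple[int, int]:
    """Count added and removed lines in a unified diff.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a changed line whose content begins with '--' or '++'
    would be skipped.
    """
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed


# Made-up diff standing in for a real patch.
sample_diff = (
    "--- a/app.py\n"
    "+++ b/app.py\n"
    "@@ -1,2 +1,2 @@\n"
    " import os\n"
    "-os.system(cmd)\n"
    "+subprocess.run(cmd_args, check=True)\n"
)

# Hypothetical record; all keys and values here are illustrative assumptions.
record = {
    "id": "CVE-0000-0000",
    "patches": [
        {
            "url": "https://example.com/commit/abc",
            "commit_message": "Fix command injection",
            "patch_text_b64": base64.b64encode(
                sample_diff.encode("utf-8")
            ).decode("ascii"),
        }
    ],
}

for patch in record["patches"]:
    diff_text = decode_patch(patch["patch_text_b64"])
    added, removed = changed_lines(diff_text)
    print(f"{patch['url']}: +{added} / -{removed}")
```

Decoding with `errors="replace"` keeps the loop from failing on patches that touch non-UTF-8 files, at the cost of altering those bytes.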
