Steve Springett

Jan 9

Jordan Harband Jan 9

I made a new thing! like the semver package, but for PURLs: https://www.npmjs.com/package/purl

`npx purl $specifier` or `npx purl $purl` will validate, normalize, and provide parse info.

add `--check` & it'll contact the relevant registry & verify the package and version exist.

(you can import it too)
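
For readers unfamiliar with the format, here is a rough sketch of what "parse info" for a PURL might contain. It is not the API of the `purl` npm package; `parse_purl` is a hypothetical helper, written in Python, that follows the general `pkg:type/namespace/name@version?qualifiers#subpath` layout and skips the per-type normalization rules a real implementation would apply.

```python
from urllib.parse import unquote


def parse_purl(purl: str) -> dict:
    """Split a Package URL into components (simplified, hypothetical helper)."""
    # Subpath and qualifiers are split off from the right first.
    rest, subpath = purl.rsplit("#", 1) if "#" in purl else (purl, "")
    rest, query = rest.rsplit("?", 1) if "?" in rest else (rest, "")

    scheme, sep, rest = rest.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"missing 'pkg:' scheme: {purl!r}")

    # Slashes right after the scheme carry no meaning, so drop them.
    pkg_type, sep, rest = rest.strip("/").partition("/")
    if not pkg_type or not rest:
        raise ValueError(f"missing type or name: {purl!r}")

    # Version is whatever follows the last '@'; name is the last path segment.
    rest, version = rest.rsplit("@", 1) if "@" in rest else (rest, "")
    namespace, name = rest.rsplit("/", 1) if "/" in rest else ("", rest)
    if not name:
        raise ValueError(f"missing name: {purl!r}")

    # Qualifiers are key=value pairs; keys are lowercased, empty values dropped.
    qualifiers = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        if value:
            qualifiers[key.lower()] = unquote(value)

    return {
        "type": pkg_type.lower(),
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": unquote(version) or None,
        "qualifiers": qualifiers,
        "subpath": unquote(subpath).strip("/") or None,
    }


if __name__ == "__main__":
    print(parse_purl("pkg:npm/%40angular/core@16.2.0"))
```

The sketch assumes an `@` in a namespace is percent-encoded (as in `%40angular`), so the last literal `@` always introduces the version.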

Steve Springett

Apr 16, 2024

Fiona Krakenbürger Apr 16, 2024

@joshbressers @ecosystems

And lastly, consider Slide 38 and its conclusion:

"There’s nothing wrong with open source, this is how it works.

There’s something wrong with what we expect from open source."

We have a lot to think about.

Steve Springett

Mar 8, 2024

Chris Wysopal Mar 6, 2024

My recommendation is to move towards continuous code scanning. https://www.forbes.com/sites/forbestechcouncil/2024/03/06/software-companies-steps-to-take-now-to-prepare-for-the-ncsip/

Software Companies: Steps To Take Now To Prepare For The NCSIP

Reviewing the full details of the NCSIP and plotting a path forward will be among the most significant tasks for software companies in the coming months.

Forbes

Steve Springett

Mar 7, 2024

OWASP Dependency-Track Mar 7, 2024

Couldn’t attend this week’s Dependency-Track community meeting? No worries, we’ve got the recording.

@nscur0 leads us through the project roadmap. We also have special guests from the @CycloneDX #cryptography working group presenting #CBOM. Don’t miss it.

https://www.youtube.com/watch?v=0WPvVCRyLjw

Dependency-Track Community Meeting (2024-03-06)

YouTube

Steve Springett

Mar 4, 2024

Gary McGraw Mar 4, 2024

NEW BIML Bibliography entry

https://arxiv.org/abs/2307.01850

Self-Consuming Generative Models Go MAD

Alemohammad et al. (2023)

Clear results with a nice framework to describe fresh, synthetic, and fixed data in a feedback loop. Focuses on diversity versus over-precision. Recursive pollution example.

#MLsec

https://berryvilleiml.com/references/

Self-Consuming Generative Models Go MAD

Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previous generation models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease. We term this condition Model Autophagy Disorder (MAD), making analogy to mad cow disease.

arXiv.org

Steve Springett

Mar 4, 2024

Gary McGraw Mar 4, 2024

NEW BIML Bibliography entry

https://arxiv.org/abs/2311.17035

Scalable Extraction of Training Data from (Production) Language Models

Carlini et al. (2023)

Another excellent, clear piece of work from Carlini. "We can pull the tokens right out." Extractable memorization (which we believe can be used to fingerprint datasets) should be useful in copyright legal cases involving LLM training data.

#MLsec

https://berryvilleiml.com/references/

Scalable Extraction of Training Data from (Production) Language Models

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.

arXiv.org

Steve Springett

Feb 22, 2024

Ecma TC54 will be working towards standardizing Package URL, specifically purl, vers, and purl types. TC54 will be working out the details over the next few weeks. We invite everyone to learn more about TC54 and contribute to the advancement of Package URL.

Additionally, a #purl channel is now available on the CycloneDX Slack workspace where nearly 2K people are discussing SBOM and related technologies. Slack information and invite located on the TC54 website.

https://tc54.org/

#PackageURL #SBOM #OWASP #CycloneDX

Ecma TC54 | Software and System Transparency

Ecma Technical Committee 54 is chartered to standardize the OWASP CycloneDX Bill of Materials specification, standards and algorithms that advance transparency and identity, and the sharing of transparency information across the supply chain.

Steve Springett

Jan 24, 2024

OWASP Dependency-Track Jan 24, 2024

Join us for our next community meeting on January 31st at 3PM UTC (10:00am U.S. Eastern)!

Agenda:
- Project / Release Update (~15 min)
- How IBM CISO uses Dependency-Track (~10 min)
- Q&A (~30 min)

Calendar invite:
https://calendar.google.com/calendar/event?eid=MHE1ZXVpczAyZnAyOG9wcGluOTZiczI2NDEgY184ODRkZWNkZTVhMTUyOTAyYmI1MWE2MmY4OTU1MGQwZjM3NDg0ODQ1MzRmMDhjNjM3OTJmMmU2NTRmMmE3ZWJjQGc&ctz=UTC

#OWASP #SBOM #CycloneDX #EO14028

Dependency-Track Community Meeting

Steve Springett

Jan 23, 2024

Gary McGraw Jan 23, 2024

Find, Threat Model, Control. Three critical aspects of #MLsec

This story features: Legit Security

And mentions Irius Risk and Calypso AI. Am I proud? Yes indeedy.

https://www.darkreading.com/application-security/first-step-in-ai-ml-security-is-finding-them

First Step in AI/ML Security is Finding Them All

Security teams need to start including AI tools and machine learning models when thinking about the software supply chain: they can't protect what they don't know they have.

Steve Springett

Nov 10, 2023

OWASP Dependency-Track Nov 10, 2023

Join us for our very first community meeting on November 15th, 7PM UTC! Get to know the team, see what's up next, ask questions, and get a sneak peek into what we've been building with Project Hyades! Meeting details here:

https://calendar.google.com/calendar/event?eid=MjFiY2xvdW9wcXRxM203YTJkczNrdmFkczUgY184ODRkZWNkZTVhMTUyOTAyYmI1MWE2MmY4OTU1MGQwZjM3NDg0ODQ1MzRmMDhjNjM3OTJmMmU2NTRmMmE3ZWJjQGc&ctz=UTC

#OWASP #SBOM #EO14028

Bluesky	https://bsky.app/profile/stevespringett.bsky.social
LinkedIn	https://www.linkedin.com/in/stevespringett
GitHub	https://github.com/stevespringett