Steve Springett 

Technologist, husband, father, tequila aficionado. Creator of @DependencyTrack. Co-author of OWASP SCVS. Chair of @CycloneDX Core Working Group. Ecma TC54 Chair. OWASP Global Board of Directors. #AppSec leader at ServiceNow. Chicago
Bluesky: https://bsky.app/profile/stevespringett.bsky.social
LinkedIn: https://www.linkedin.com/in/stevespringett
GitHub: https://github.com/stevespringett

I made a new thing! like the semver package, but for PURLs: https://www.npmjs.com/package/purl

`npx purl $specifier` or `npx purl $purl` will validate, normalize, and provide parse info.

add `--check` & it'll contact the relevant registry & verify the package and version exist.

(you can import it too)

@joshbressers @ecosystems
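
For readers new to PURLs, here is a rough sketch of the "parse info" idea in the post above. It is not the `purl` npm package's code or API. It is a simplified reading of the general Package URL layout (`pkg:type/namespace/name@version?qualifiers#subpath`), and the names `parse_purl` and `_split_right` are made up for illustration. It skips the per-type normalization rules and does not contact any registry.

```python
from urllib.parse import unquote


def _split_right(text: str, sep: str) -> tuple[str, str]:
    """Split once from the right; return (text, "") when sep is absent."""
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, "")


def parse_purl(purl: str) -> dict:
    """Hypothetical, simplified PURL parser (illustration only)."""
    rest, subpath = _split_right(purl, "#")   # optional subpath
    rest, query = _split_right(rest, "?")     # optional qualifiers

    scheme, sep, rest = rest.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError("not a purl: missing 'pkg:' scheme")

    pkg_type, sep, rest = rest.strip("/").partition("/")
    if not sep or not pkg_type or not rest:
        raise ValueError("a purl needs both a type and a name")

    rest, version = _split_right(rest, "@")   # optional version
    namespace, _, name = rest.rpartition("/")  # namespace may be empty
    if not name:
        raise ValueError("a purl needs a name")

    qualifiers = {}
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key and value:                      # drop empty qualifiers
            qualifiers[key.lower()] = unquote(value)

    return {
        "type": pkg_type.lower(),
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": unquote(version) or None,
        "qualifiers": qualifiers,
        "subpath": unquote(subpath).strip("/") or None,
    }
```

For example, `parse_purl("pkg:npm/%40angular/core@16.2.0")` yields the type `npm`, the namespace `@angular`, the name `core`, and the version `16.2.0`. For validation, normalization, and registry checks, use the real tool.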

And lastly, consider Slide 38 and its conclusion:

"There’s nothing wrong with open source, this is how it works.

There’s something wrong with what we expect from open source."

We have a lot to think about.

@ariadne Fantastic. Would love to catch up with you. Let me see if I can make it up there this month and will let you know in a day or two.

@ariadne this thread sure sparked a lot of dialog, which is great. IMO, there should be more inclusive conferences, especially for open source.

For security conferences, I’ve always found OWASP regional and global appsec conferences to be super inclusive and affordable. Additionally, OWASP chapter meetings are free and open to anyone to attend.

BSides are also great to attend. You get some super talented individuals, usually on the offensive or research side, attending them. Again, super affordable. Local security cons are also great. I'm in Chicago and attend THOTCON whenever I get the chance.

For open source, I'd love to go to FOSDEM, but have never had the opportunity. It’s difficult to justify the expense of travel, etc., when my primary day job is not open source. Now, if there were a North American FOSDEM or equivalent, I'd be able to justify that much more easily.

But I also try to avoid the big conferences when possible. I only attend RSA every 5-6 years because I try to avoid the circus, though I will be there this year. It's the same reason I've never attended an LF conference: they're too big. I love conferences that are small enough that you can sit down with speakers and other SMEs and just talk. I remember my first time going to OWASP AppSec Cali (since renamed) and it was an amazing experience. You had some of the brightest minds in application security at this small, regional conference and you could just sit and talk with people without all the corporate things getting in the way.

Software Companies: Steps To Take Now To Prepare For The NCSIP

Reviewing the full details of the NCSIP and plotting a path forward will be among the most significant tasks for software companies in the coming months.

Forbes

Couldn’t attend this week’s Dependency-Track community meeting? No worries, we’ve got the recording.

@nscur0 leads us through the project roadmap. We also have special guests from the @CycloneDX #cryptography working group presenting #CBOM. Don’t miss it.

https://www.youtube.com/watch?v=0WPvVCRyLjw

Dependency-Track Community Meeting (2024-03-06)

YouTube

NEW BIML Bibliography entry

https://arxiv.org/abs/2307.01850

Self-Consuming Generative Models Go MAD

Alemohammad et al. (2023)

Clear results with a nice framework to describe fresh, synthetic, and fixed data in a feedback loop. Focuses on diversity versus over-precision. Recursive pollution example.

#MLsec

https://berryvilleiml.com/references/

Self-Consuming Generative Models Go MAD

Seismic advances in generative AI algorithms for imagery, text, and other data types has led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previous generation models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease. We term this condition Model Autophagy Disorder (MAD), making analogy to mad cow disease.

arXiv.org

NEW BIML Bibliography entry

https://arxiv.org/abs/2311.17035

Scalable Extraction of Training Data from (Production) Language Models

Carlini et al. (2023)

Another excellent, clear piece of work from Carlini. "We can pull the tokens right out." Extractable memorization (which we believe can be used to fingerprint datasets) should be useful in copyright legal cases involving LLM training data.

#MLsec

https://berryvilleiml.com/references/

Scalable Extraction of Training Data from (Production) Language Models

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.

arXiv.org

Ecma TC54 will be working towards standardizing Package URL, specifically purl, vers, and purl types. TC54 will be working out the details over the next few weeks. We invite everyone to learn more about TC54 and contribute to the advancement of Package URL.

Additionally, a #purl channel is now available on the CycloneDX Slack workspace, where nearly 2K people are discussing SBOM and related technologies. Slack information and an invite are available on the TC54 website.

https://tc54.org/

#PackageURL #SBOM #OWASP #CycloneDX

Ecma TC54 | Software and System Transparency

Ecma Technical Committee 54 is chartered to standardize the OWASP CycloneDX Bill of Materials specification, standards and algorithms that advance transparency and identity, and the sharing of transparency information across the supply chain.

Join us for our next community meeting on January 31st at 3PM UTC (10:00am U.S. Eastern)!

Agenda:
- Project / Release Update (~15 min)
- How IBM CISO uses Dependency-Track (~10 min)
- Q&A (~30 min)

Calendar invite:
https://calendar.google.com/calendar/event?eid=MHE1ZXVpczAyZnAyOG9wcGluOTZiczI2NDEgY184ODRkZWNkZTVhMTUyOTAyYmI1MWE2MmY4OTU1MGQwZjM3NDg0ODQ1MzRmMDhjNjM3OTJmMmU2NTRmMmE3ZWJjQGc&ctz=UTC

#OWASP #SBOM #CycloneDX #EO14028

Dependency-Track Community Meeting