RE: https://social.opensource.org/@osi/116676168830405610

Working on this has been an incredible experience! I wanted to share a bit of what I've been doing (and some behind the scenes stuff)! ⬇️

#OSI #Diplomacy #G7 #AI #OpenSource

In February of this year, we got an interesting call from the French Presidency of the G7: they had noticed there was a lack of clarity around AI openness, and they wanted to do something about it.

Usually, they would rely on an institutional knowledge partner for this, such as the amazing #OECD, but they recognised Open Source was a community-led movement, and wanted a Community organisation as part of negotiations. They chose the #OSI. ⬇️

This sort of thing can often be quite a difficult call: it involves coming to the negotiating table with no veto, and only limited control over the outcome, but the French Presidency was very keen, and willing to support us.

In an initial discussion, we went over which outcomes we could support. We laid out our red lines (Open Washing) and found a lot of areas where we agreed. ⬇️

In advance of the first in-person meeting in Paris, we immediately got to work supporting the French Presidency with drafting, but were also conscious that while the #OSAID had garnered support, it was not unanimous, and so it was important to bring the voices of all #OpenSource communities to the table. ⬇️

As I am based in #Brussels, and Paris is a short train journey away, I was sent as #OSI's representative in negotiations. (I'm also trilingual (🇬🇧 🇫🇷 🇩🇪), which helps!)

Negotiations took place at the Ministry of Economy, not far from where I used to live in Paris.

Unrelated trivia: The building also has an interesting feature: it has pipes that transport liquid soap to all bathrooms. ⬇️

For each of the three in-person sessions in Paris, negotiations spanned two days (although the final round spanned three and went on late into the night). These were augmented by multiple online sessions.

My European Parliament experience made this easier.

The French Presidency were extremely gracious hosts, and worked closely with us. We were lucky to work with a Presidency that cared so deeply about Community views.

They also organised lovely dinners for the negotiating teams. ⬇️

I'd also add that, while we had some disagreements with some delegations, I'm delighted to report that all delegations took our concerns seriously, regularly taking them back to their capitals.

On a personal level, everyone was also really nice (especially the French Presidency).

At a negotiating table where we had no formal power, we were treated as genuine partners, which was lovely.

I'll get on to the outcome in a second, but there is one more nice surprise to get into: as we got towards the end of the process, the French Presidency asked me if the OSI's Executive Director would be willing to address the #G7 Ministers, outlining the work we've been doing and the benefits of Openness.

We naturally accepted, and #OSI's new #ED found himself giving a speech in a suit around the table with Government Ministers of G7 Members!

Negotiations wrapped up with a lovely dinner organised by the French Presidency in the Garden of the Musée Rodin, which is a real treat for anyone who loves sculptures (or Champagne!)

So what's the result? Let's break it down starting with principles:

🫂 AI Openness is Community Driven: Governments can't make decisions about Openness alone; it's built by Community Consensus over time. It may still evolve yet.

🌈 AI Openness is a Spectrum. OSI doesn't usually like to talk about a "Spectrum of Openness" (it's more a spectrum of closedness) but we picked our fights here. Openness can be a spectrum as long as Open Source begins at a fixed point on that Spectrum!
⬇️

🧱 AI Openness is determined by multiple elements: it's not just the model itself, it's training code and data too!

🏷️ AI Openness should be described with proper labels: words like "Open Weights AI" and "Open Source AI" mean something and come with requirements!
⬇️

Then we get to the "Common language": the terms that are defined by the text.

OSI provided an initial recommendation on all this common language, which the French Presidency used to build the first draft. We worked with them closely to refine it over the course of the negotiations.

It creates four categories: "Weights Available", "Open Weights", "Open Source" and "Open Source and Open Data".⬇️

We've noted a significant part of Open Washing in the AI space is models with proprietary licences claiming they are Open Weights.

This undermines the term "Open Weights" so our first step was to propose a category below that, and require that Open Weights AI be published under an Open Source Licence.

While Open Weights Models aren't Open Source, we believe clarity on terms is still important, and when models are distributed under Open Source licences, they still give the user more freedom.

⬇️

When it came to Open Source AI, the #OSI's #OSAID was the inspiration for the work. That means releasing the model's weights, deployment code, training code, and at least data information (but preferably training data).

I defended the #OSI view that Data Information is a necessary compromise for cases where data sharing is impossible for technical or legal reasons (copyright, data protection, etc.), to ensure legal certainty.

But I also shared that part of the community has reservations about the approach. ⬇️

A single delegation found the OSAID too weak, while others thought it was too strong, and opposed requiring data information even when Training Data is provided.

We compromised:

Data information was made mandatory only where Training data is not available.

Providing training data was made mandatory except where legally or technically impossible.

An "Open Source and Open Data" label was created for models where all data can be published by the model developer.
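
For anyone who thinks better in code, here is a rough sketch of how the four labels and the data compromise could be read as decision logic. It is only based on the summary in this thread, not on the negotiated text itself. The `Release` fields and the `label` function are hypothetical names made up for illustration, and the sketch assumes (without the thread saying so) that a model missing any required element falls back to the next label down. It also does not model the licences of the code or data.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical description of what a model developer has published."""
    weights_published: bool
    weights_under_open_source_licence: bool
    deployment_code_published: bool
    training_code_published: bool
    all_training_data_published: bool
    # True only when the withheld data cannot be shared for legal or
    # technical reasons (copyright, data protection, etc.).
    withheld_data_is_unshareable: bool
    data_information_published: bool


def label(release: Release) -> str:
    """Return the label a release might carry (illustrative reading only)."""
    if not release.weights_published:
        return "Not covered"
    # Weights under a proprietary licence sit in the category below Open Weights.
    if not release.weights_under_open_source_licence:
        return "Weights Available"
    code_ok = (release.deployment_code_published
               and release.training_code_published)
    if not code_ok:
        return "Open Weights"
    # All training data published: the strongest label.
    if release.all_training_data_published:
        return "Open Source and Open Data"
    # Data may be withheld only where sharing is impossible, and then
    # data information becomes mandatory.
    if (release.withheld_data_is_unshareable
            and release.data_information_published):
        return "Open Source"
    # Assumption: anything else falls back to Open Weights.
    return "Open Weights"
```

For example, a model with Open Source licensed weights, both kinds of code, and data information for data that legally cannot be shared would come out as "Open Source" under this reading.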

⬇️

Some might ask why we are accepting something that deviates from the #OSAID.

We are doing so for the same reason we made the #OSAID, despite knowing the first draft would likely not be perfect: because something is needed now, and this is much better than the status quo (government saying nothing).

Even if this definition (or even the OSAID) isn't perfect, both fundamentally strengthen protection against Open Washing, and enhance the four freedoms. And that is our mission 🫑.

In these negotiations, I think we really managed to punch above our weight and get a result that is great for Open Source Communities.

I'm also massively grateful to the French Presidency of the G7 for the support and backing they provided us.

At a time when a lot of people are understandably cynical and despondent about politics, I think it is important to remember there are some amazing people doing good work right now.

Working alongside them on this was quite the adventure! πŸ’«

Finally, I'm looking forward to getting back to sleeping a reasonable number of hours, and not having intense negotiations in Paris every month.

But the truth is the next thing is likely just around the corner: Open Source is under the geopolitical spotlight right now, and it needs defending more than ever.

If you want to support my work in doing that, consider joining, donating to or sponsoring the OSI so we can keep doing what we do :)

Jordan Out🫑

https://opensource.org/get-involved


@jmaris
Thanks for being there and defending the term "Open Source"!

Though I wonder: was there any talk about AI License Laundering, where a model is trained on FOSS software regardless of license, and then used to produce functionally equivalent software with no license / a different license?

Or more generally about ways for authors, website operators, etc. to opt out from their work being used as AI training data?

@jmaris
Like, it seems to me that as long as AI companies can ignore licenses, ToS, and robots.txt, circumvent user-agent bans and IP address bans, and just scrape all human expression available on the internet until the websites hosting it grind to a halt, and then use that data to train their models to compete with the humans that they got the data from, then the open-source community cannot survive.

@wolf480pl This is a great question, and something I'm pretty worried about. Although in recent weeks, some of my concerns have dissipated. What I have noticed is that licence laundering isn't effective because the benefit of using Open Source in the first place is the community, support and updates.

AI laundered code doesn't get that.

Not to mention, vibe coding doesn't build viable apps.

@jmaris yeah maybe it's not that big of a deal for complex community-maintained software projects.

It does hurt other parts of the open internet, like knowledge-sharing forums (e.g. stackoverflow, some subreddits, Citroen owners' forum, Half-Life 2 modding forum, etc.) where nobody is going to contribute if ChatGPT knows the answer because it scraped it from those forums.

Though I guess that is out of scope for OSI.

@jmaris I'm very guilty of equating EU = nihilistic misanthropic cynicism of all parties but it's great to hear that there is some glimmer of activity of lessening the harms. Hats off for the good work!
@jmaris what about Maple AI?
@F100 I couldn't find much information, but it looks to me like Open Weights AI (don't quote me on that).