⚡️New! Opening up ChatGPT
https://dl.acm.org/doi/10.1145/3571884.3604316 (preprint: https://doi.org/10.48550/arXiv.2307.05532)

TL;DR #chatgpt is unfit for responsible use in science & education. Open instruction-tuned text generators (LLM + RLHF) are on the rise. But how open are they? We track degrees of openness in this #CUI23 paper & live at https://opening-up-chatgpt.github.io

We hope this paper & repository will help more people make mindful choices about tech

work with @andreasliesenfeld and @alianda

Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators | Proceedings of the 5th International Conference on Conversational User Interfaces


This paper got more radical as we wrote (and rewrote) it. It started from a need for a constructive & systematic overview of open LLM+RLHF alternatives. See our repo! https://opening-up-chatgpt.github.io/

Credits to @andreasliesenfeld for coming up with the idea and for observing that the open LLM+RLHF landscape was growing so rapidly that it would be worth a systematic survey.

Looking at all these initiatives made the OpenAI playbook stand out as so cynical & harmful that we had to call it out in print

Opening up ChatGPT: LLM openness leaderboard

We were impressed by the transparency of the BLOOMZ/xmtf models, clearly designed from start to finish with openness and accountability in mind. This is the only player checking virtually *all* the boxes: data collection & curation, training regime, instruction-tuning, codebase, model cards, data sheets: everything meticulously documented and organized. You can look it all up on our site or directly in the repo: https://github.com/opening-up-chatgpt/opening-up-chatgpt.github.io/blob/main/projects/bloomz.yaml
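To make the leaderboard idea concrete, here is a minimal sketch of how per-dimension openness ratings could be tallied into a single score. The dimension names come from the paper's abstract; the three-level rating scheme, the example ratings, and the data layout are illustrative assumptions, not the repo's actual YAML schema or scoring method.

```python
# Illustrative sketch only: tally degrees of openness across the
# dimensions surveyed in the paper. The open/partial/closed scheme
# and the ratings below are made-up assumptions for demonstration.
SCORES = {"open": 1.0, "partial": 0.5, "closed": 0.0}

# Dimension names taken from the paper's abstract; ratings are hypothetical.
example_project = {
    "code": "open",
    "training_data": "partial",
    "model_weights": "open",
    "rlhf_data": "closed",
    "licensing": "open",
    "scientific_documentation": "partial",
    "access_methods": "open",
}

def openness_score(project: dict) -> float:
    """Average the per-dimension ratings into a single 0..1 score."""
    return sum(SCORES[rating] for rating in project.values()) / len(project)

print(round(openness_score(example_project), 2))  # 5.0 / 7 ≈ 0.71
```

A real survey like this one keeps the per-dimension judgments visible rather than collapsing them into one number, precisely because "openness is differentiated"; the averaging here is only to show the bookkeeping.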

Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces.


And right on cue, Meta releases #llama2. They call it "open source" but it is not, in any sense of the term. All they make available is model weights. The source (training data, RLHF data, code) is nowhere to be found, and the whole thing is about as far from true FOSS as you can get. It ends up on the lowest rung in our overview, just one notch above ChatGPT: https://opening-up-chatgpt.github.io

The real question reporters should be asking: What's in it for Meta?


Some people were apparently using OpenAI's advanced API parameters to derive scientific insights about GPT-3.5. This will break in under a week. Thank you OpenAI for making our point so eloquently and efficiently for us

(first image: introduction of our CUI'23 paper https://doi.org/10.1145/3571884.3604316 ; second image: OpenAI announcing it will remove access to prompt/output probabilities, for whatever reason)

#OpeningUpChatGPT #proprietary #reproducibility #openscience

@dingemansemark thanks to RHEL for redefining open source, eh?
@Mark Dingemanse I can see no section in the chart called "respecting existing sources' copyright". That's bad.

@jrp excellent point, but hard to track for our small team. This is why we write in the paper "It should be noted ... claims of openness do not cancel out problems, legal or otherwise" (intro), single out "inheritance of documented data ... of dubious legality" as a problem (results) and why we point out some of the "legal quagmires" related to licensing (discussion) https://doi.org/10.48550/arXiv.2307.05532

The repo & all materials are open so others can build on this & extend it

Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators

Large language models that exhibit instruction-following behaviour represent one of the biggest recent upheavals in conversational interfaces, a trend in large part fuelled by the release of OpenAI's ChatGPT, a proprietary large language model for text generation fine-tuned through reinforcement learning from human feedback (LLM+RLHF). We review the risks of relying on proprietary software and survey the first crop of open-source projects of comparable architecture and functionality. The main contribution of this paper is to show that openness is differentiated, and to offer scientific documentation of degrees of openness in this fast-moving field. We evaluate projects in terms of openness of code, training data, model weights, RLHF data, licensing, scientific documentation, and access methods. We find that while there is a fast-growing list of projects billing themselves as 'open source', many inherit undocumented data of dubious legality, few share the all-important instruction-tuning (a key site where human annotation labour is involved), and careful scientific documentation is exceedingly rare. Degrees of openness are relevant to fairness and accountability at all points, from data collection and curation to model architecture, and from training and fine-tuning to release and deployment.

@dingemansemark @andreasliesenfeld @alianda Have been looking for something like this. Thank you!

@dingemansemark @andreasliesenfeld @alianda @mmin

Great visual on the various models' transparency - or lack thereof…

@dingemansemark @andreasliesenfeld @alianda Great and useful work! The interprovincial digital ethics committee released advice on the use of ChatGPT by provinces https://www.ipo.nl/nieuws/advies-over-het-gebruik-van-chatgpt-in-provincies/ (PDF https://www.ipo.nl/media/5u4gjtr5/ipo-whitepaper-verkenning-chatgpt.pdf ) I think your work is very useful for improving the section on technological alternatives, or the lack thereof.

@ton Interesting! And indeed, our work is highly relevant here. For instance, I would avoid calling Llama2 "open source": virtually none of the source is open (code, training data, fine-tuning). The only thing shared is the so-called model weights, which (as Meta itself admits) were fine-tuned on 1 million Meta-specific instructions, without users getting any insight into what that actually means (bias? marketing? manipulation? anything is possible)
@dingemansemark The intention with that IPO document is to update it whenever relevant. The IPO ethics committee includes, from academia, Jeroen v.d. Hoven (TUD), Esther Keymolen (Tilburg), Roel Dobbe (TUD) and Erna Ruijer (Utrecht), among others. On 21 September we are organising a meeting of PhD students in Utrecht, focused in particular on governance and the assignment of responsibilities in the deployment of AI/algorithms. Perhaps there is a PhD candidate at Radboud who would find that interesting?

@dingemansemark @andreasliesenfeld @alianda @mmin

Did you look at DALL-E 2? PaLM?

@tchambers @andreasliesenfeld @alianda @mmin We cover LLM+RLHF architectures only, and only ones that claim to be open source

DALL-E 2 is images; PaLM is text, but neither open nor RLHF (instruction-tuned)

@dingemansemark @andreasliesenfeld @alianda I appreciate what you've done here, thanks! Systems that are closed off and proprietary are really risky to depend on, but I wonder whether openness may make it easier for a malicious actor to take the "RLHF mask" off?

@dingemansemark @andreasliesenfeld @alianda

I found this work just in time to mention it in a keynote talk at the #Munin2023 conference on scholarly publishing! LOVE IT and thank you so much for doing this really important auditing! 🙏

@KirstieJane @andreasliesenfeld @alianda Thank you for sharing! Very happy the work is proving to be useful