A quick thread on #AIhype and other issues in yesterday's Gemini release:

#1 -- What an utter lack of transparency. Researchers from multiple groups, including @meg and @timnitGebru when they were at Google, have been calling for clear and thorough documentation of training data & trained models since 2017.

In Bender & Friedman 2018, we put it like this:

/1

In the tech report, there is half a page describing (mostly the preprocessing of) the data. What a farce:

https://twitter.com/JesseDodge/status/1732444597593203111?s=20

And Google can't even be bothered to cite Drs. @meg and @timnitGebru's work:

https://twitter.com/_alialkhatib/status/1732425933016179064?t=OdFK1fod5ncLGg5H7W6snQ&s=09

/2

Jesse Dodge (@JesseDodge) on X

Today Google released Gemini with a 60-page report in which they repeatedly say the training data is key ("We find that data quality is critical to a highly-performing model"), while providing almost no information about how it was made, how it was filtered, or its contents.


More lack of transparency: they state "Gemini has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity" --- but provide no link to where anyone can inspect the methodology and results of those evaluations.

/3

Similarly, Drs. @SashaMTL & Emma Strubell and others call for transparency about the environmental impact of training and using large models. The Google press release brags about the system being efficient, but gives no information about the actual energy usage, carbon footprint, or water usage, for either training or use of Gemini.

For more on why transparency re environmental impact is so important, check out their visit to the Mystery AI Hype Theater 3000 pod:

/4

https://www.buzzsprout.com/2126417/13931174-episode-19-the-murky-climate-and-environmental-impact-of-large-language-models-november-6-2023

Episode 19: The Murky Climate and Environmental Impact of Large Language Models, November 6 2023 - Mystery AI Hype Theater 3000

Drs. Emma Strubell and Sasha Luccioni join Emily and Alex for an environment-focused hour of AI hype. How much carbon does a single use of ChatGPT emit? What about the water or energy consumption of manufacturing the graphics processing units that...


And then there's evaluation, or lack thereof:

Google is advertising Gemini as an everything machine --- a general-purpose model that can be used in many different ways. In other words: something that cannot be evaluated, since it doesn't have a specific purpose.

What stands in for evaluation are "benchmarks", but these benchmarks lack construct validity. What are they supposed to be measuring? What shows that they do measure that? How does that relate to the intended use case of the technology?
/5

I can't even with the anthropomorphizing language in the PR: "collaborative tool" as if software can enter into collaboration, "recognize and understand", "sophisticated reasoning capabilities"...

And they even declare their intention to lean into misleading UI, rather than building the kind of transparency that helps people use tools effectively...

... all while admitting that of course this is still just a synthetic text extruding machine, designed to make shit up. /6

A reminder (again) that it is damaging to the information ecosystem to promote the use of this kind of system for information access. Chirag Shah and I lay out the details here:

https://bit.ly/Env_IAS

/7

One final note: The press release brags about "bring[ing] enormous benefits to people and society" and "help deliver new breakthroughs at digital speeds in many fields from science to finance." But why would any for-profit entity that had such technology provide it to everyone for free?

/8

It is always worth keeping an eye on what Google gets out of this, and whether the bargain is really worth it for end users. Something that "create[s] opportunities — from the everyday to the extraordinary — for people everywhere" would not be centralized in a handful of powerful, unaccountable companies.

https://www.technologyreview.com/2023/12/05/1084393/make-no-mistake-ai-is-owned-by-big-tech/amp/

/fin

Make no mistake—AI is owned by Big Tech

If we’re not careful, Microsoft, Amazon, and other large companies will leverage their position to set the policy agenda for AI, as they have in many other sectors.

@emilymbender One thing I know for sure -- when I get one, my AI assistant (that knows me better than my Mom) is going to be an #opensource model with no #GAFAM
@emilymbender Or anyone who isn't them.
@drwho @emilymbender
But the key question is, whether "enormous benefits to people and society" are anything people and society really need or want.
@xl8freelancer @emilymbender It has long been my experience that "enormous benefits to people and society" is bullshit.

@emilymbender Even if it is "free*", Google could kill it seemingly on a whim, joining the pile of discontinued services offered by the company.

* "free" meaning "users are the actual product, as is usual for Google"

@elight @emilymbender ...from experience i can tell you that startups outside Google also suddenly can dissappear, be killed or taken over, if you want continuity and stability you should take that in consideration when choosing products/services. Some Google services you can rely on (Docs, Drive etc.), others are quite brittle (all AI-related services), also in many scenario's i don't mind being the product, i really don't. Where i do, i pay for services (eg Google Workspace)
@ErikJonker @emilymbender I'm not interested in startups. I am interested in Open Source developers having access and the ability to develop LLMs (it's not "general purpose AI").

@elight @emilymbender - and the only thing that will keep them from killing it is if it's profitable.

The most likely path to which is to offer it for "free" until it has pushed everybody doing [thing AI can sort-of replace] for money out of the market, then start charging for that inferior version of those professionals' work until the cost reaches, then surpasses, what we used to pay professionals for a much better product...

@emilymbender - and also, which benefits? Which breakthroughs? Because literally everything I read & hear in this regard smacks to high heaven of assumptions - the thinkism-ish idea that with massive (if completely undefined, opaque, unqualified & uncontrollable) computing power, surely something great must come out of it, right?
@emilymbender I for one don't expect Bard Advanced, the home of Gemini Ultra, to be free when it arrives next year. Except perhaps for some significantly limited tier of usage.
@emilymbender So making stuff up is a difficulty with “factuality”?

@emilymbender
This introduces a new[0] form of p-hacking

you evaluate on enough benchmarks, and you can paint a bullseye on the ones that you did well on, claiming breakthroughs in new application domains

[0] not that new, really
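The multiple-comparisons point above can be sketched numerically. This is a toy simulation of my own (all numbers are illustrative assumptions, not from the thread): even a model with no real edge over a baseline, scored on many noisy benchmarks, will "win" a substantial fraction of them by chance alone --- and those chance wins are the ones a press release can paint a bullseye around.

```python
import random

random.seed(0)

# Illustrative assumptions: 50 benchmarks, and on each one the model
# beats the baseline with probability 0.5 (i.e., no true advantage;
# any "win" is pure evaluation noise).
NUM_BENCHMARKS = 50
TRIALS = 1000

def chance_wins(num_benchmarks):
    """Count benchmarks where a no-better-than-baseline model 'beats'
    the baseline purely through noise (a coin flip per benchmark)."""
    return sum(1 for _ in range(num_benchmarks) if random.random() < 0.5)

# Repeat the experiment many times to see the typical number of
# chance "breakthroughs" per evaluation run.
wins = [chance_wins(NUM_BENCHMARKS) for _ in range(TRIALS)]
avg = sum(wins) / TRIALS
print(f"average chance 'wins' out of {NUM_BENCHMARKS} benchmarks: {avg:.1f}")
```

Under these toy assumptions the model "wins" roughly half the benchmarks on average despite having no real advantage, which is exactly why reporting only the benchmarks a model did well on, without a pre-registered evaluation plan, tells you very little.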

@emilymbender This is one part of what the EU AI Act will address if they do manage to get it through.
@emilymbender Google said it planned to release more details when Gemini Ultra arrives in 2024. (Gemini Pro and Nano have already arrived now.)
@emilymbender Now I remember why I don't like Twitter for technical discussion. Because nothing on Twitter is anything but the most debased of political contention.