How Hyper-Datafication Impacts the Sustainability Costs in Frontier AI
Sophia N. Wilson, Sebastian Mair, Mophat Okinyi, Erik B. Dam, Janin Koch, Raghavendra Selvan

https://arxiv.org/abs/2602.00056

#AI
#thereIsNoAI
#thereIsInParticularNoSustainableAI

Large-scale data has fuelled the success of frontier artificial intelligence (AI) models over the past decade. This expansion has relied on sustained efforts by large technology corporations to aggregate and curate internet-scale datasets. In this work, we examine the environmental, social, and economic costs of large-scale data in AI through a sustainability lens. We argue that the field is shifting from building models from data to actively creating data for building models. We characterise this transition as hyper-datafication, which marks a critical juncture for the future of frontier AI and its societal impacts. To quantify and contextualise data-related costs, we analyse approximately 550,000 datasets from the Hugging Face Hub, focusing on dataset growth, storage-related energy consumption and carbon footprint, and societal representation using language data. We complement this analysis with qualitative responses from data workers in Kenya to examine the labour involved, including direct employment by big tech corporations and exposure to graphic content. We further draw on external data sources to substantiate our findings by illustrating the global disparity in data centre infrastructure. Our analyses reveal that hyper-datafication does not merely increase resource consumption but systematically redistributes environmental burdens, labour risks, and representational harms toward the Global South, precarious data workers, and under-represented cultures. Thus, we propose Data PROOFS recommendations spanning provenance, resource awareness, ownership, openness, frugality, and standards to mitigate these costs. Our work aims to make visible the often-overlooked costs of data that underpin frontier AI and to stimulate broader debate within the research community and beyond.

arXiv.org
"It's going to be a NO THANKS. You can go ahead and write that in capitals"

Datacentres will in future use a large share of Denmark's green electricity. According to new figures, 30 are on the way across the country, but in several places citizens have begun to protest

Information
Copy Fail — 732 Bytes to Root

CVE-2026-31431. 100% Reliable Linux LPE — no race, no per-distro offsets, page-cache write that bypasses on-disk file-integrity tools and crosses containers. Found by Xint Code.

Xint

Clearly, humanity needed "AI"

"'There is compelling and concerning #data that explicit deepfakes have increased on the #internet as much as 550% year on year since 2019,' Julie Inman Grant wrote after advising parliament on the new laws in 2024.

'It's a bit shocking to note that #pornographic videos make up 98% of the #deepfake material currently online and 99% of that imagery is of women and girls.'"

https://www.bbc.com/news/articles/c39333x0xeno

#thereisNoAI

Australian pleads guilty to creating deepfake porn in landmark case

The 19-year-old is the first person to be charged under a new national law.

Always a favorite - FruitML = teaching simple #TinyML with #fruit detection as the task, in our #IoT course

#yolo #mobileNets #InternetOfThings #MachineLearning #EdgeImpulse #TeachableMachines

#thereIsNoAI
#thereIsInParticularNoSustainableAI

but some tiny ML is fun

#whoNeedsDataCenters

"AI" synthesizing conference sessions - that's a first for me, seeing that one offered.
We basically don't need conferences anymore -
we can have "AI"s hallucinate our favorite viewpoints on demand.

"Subscribers to the Premium Edition have access to enhanced features and productivity tools

Access new AI features such as short synopses, article summaries, content recommendations, and podcasts synthesizing conference sessions"

https://dl.acm.org/about/upgrade

@ACM #acm

#thereIsNoAI

“#Meta, #Google and #Microsoft have all baked [generative AI] deep into their systems,” Joshi says. “I see this all as very much part of the tactic of trying to embed these systems into society and instil dependency in a fashion similar to the growth of single-use plastics in the 1970s.”

https://www.theguardian.com/australia-news/2026/mar/13/ai-datacentres-environmental-impacts

#thereIsNoAI #QuitGPT

The environmental cost of datacentres is rising. Is it time to quit AI?

As the QuitGPT movement gains momentum, should people concerned about the environmental impacts of AI consider opting out?

The Guardian

@thomasfuchs
while this is an expected and legitimate rhetorical move, the thing is that we actually know quite a bit about human consciousness - if mostly by exclusion.

We know it does not make the key mistake of digitizing,
we know there's no storage medium,
it does not scrape,
it does not 'improve' by throwing more megawatts at itself, but instead runs just fine on 10-12 W.
So we do know it is fundamentally different.

#thereIsNoAI
#ClaudeIsUnconscious
#capitalismIsStupid

The general failure that is Trump.
The general failure that is #AI.

Brute force AI's surging need for #hardware drives US #tradeDeficit to new heights.

Not the only factor, of course.
#China is also now unrivaled as leader of the #energy #transition.

https://www.bbc.com/news/articles/c4ge4yxwnlno

#thereIsNoAI

US trade deficit hits fresh high despite Trump's tariffs

The US bought more goods than it sold in 2025 as the White House attempts to reverse the flow.