In our own work, we studied memorization in language models for code and ways to make them regurgitate training data:

> From the training data that was identified to be potentially extractable we were able to extract 47% from a CodeGen-Mono-16B code completion model.

> We also observe that models memorise more, as their parameter count grows, and that their pre-training data are also vulnerable to attack

https://dl.acm.org/doi/abs/10.1145/3597503.3639133

#memorization #atemlos

Traces of Memorisation in Large Language Models for Code | Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

ACM Conferences
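The extraction setup behind numbers like the 47% figure can be sketched as a prefix-completion test: feed the model the first k tokens of a training sample and check whether its greedy continuation reproduces the true suffix verbatim. The toy below is a hypothetical stand-in, not the paper's actual pipeline; `toy_model` and all token values are invented for illustration.

```python
# Toy sketch of a prefix-completion memorization test.
# A sample counts as "extractable" if the model's greedy completion
# of the prefix matches the held-out suffix exactly.

def toy_model(prefix_tokens):
    # Hypothetical model that has memorized exactly one training sample.
    memorized = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
    if prefix_tokens == memorized[: len(prefix_tokens)]:
        return memorized[len(prefix_tokens):]
    return ["pass"]  # fallback completion for unknown prefixes

def is_extractable(model, sample_tokens, prefix_len):
    """Return True if the model reproduces the sample's suffix verbatim."""
    prefix = sample_tokens[:prefix_len]
    suffix = sample_tokens[prefix_len:]
    return model(prefix) == suffix

sample = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
print(is_extractable(toy_model, sample, prefix_len=6))            # True: memorized
print(is_extractable(toy_model, ["import", "os"], prefix_len=1))  # False: not memorized
```

In a real attack, the prefixes come from known or suspected training data, and the fraction of samples passing this exact-match check gives the extraction rate.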

Ruling in GEMA v. OpenAI:

> Both the memorisation within the language models and the reproduction of the song lyrics in the chatbot's outputs constitute infringements of the exploitation rights under copyright law

https://www.justiz.bayern.de/gerichte-und-behoerden/landgericht/muenchen-1/presse/2025/11.php

#atemlos #openai #copyright #memorization #gema #chatgpt

Press Release 11/2025 - Bavarian State Ministry of Justice

From Memorization to Reasoning in the Spectrum of Loss Curvature

We characterize how memorization is represented in transformer models and show that it can be disentangled in the weights of both language models (LMs) and vision transformers (ViTs) using a decomposition based on the loss landscape curvature. This insight is based on prior theoretical and empirical work showing that the curvature for memorized training points is much sharper than for non-memorized ones, meaning ordering weight components from high to low curvature can reveal a distinction without explicit labels. This motivates a weight editing procedure that suppresses recitation of untargeted memorized data far more effectively than a recent unlearning method (BalancedSubnet), while maintaining lower perplexity. Since the basis of curvature has a natural interpretation for shared structure in model weights, we extensively analyze the editing procedure's effect on downstream tasks in LMs, and find that fact retrieval and arithmetic are specifically and consistently negatively affected, even though open-book fact retrieval and general logical reasoning are preserved. We posit these tasks rely heavily on specialized directions in weight space rather than general-purpose mechanisms, regardless of whether those individual datapoints are memorized. We support this by showing a correspondence between task data's activation strength with the low-curvature components that we edit out, and the drop in task performance after the edit. Our work enhances the understanding of memorization in neural networks with practical applications towards removing it, and provides evidence for idiosyncratic, narrowly-used structures involved in solving tasks like math and fact retrieval.

arXiv.org
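The editing idea in the abstract can be illustrated with a toy sketch (not the paper's actual decomposition): estimate a per-component curvature proxy, order components from high to low curvature, and zero out the low-curvature end of the spectrum. All weights and curvature values below are made up for illustration.

```python
# Toy illustration of curvature-ordered weight editing:
# keep only the highest-curvature weight components, zero the rest.

def edit_weights(weights, curvatures, keep_fraction):
    """Keep the `keep_fraction` highest-curvature components,
    zeroing the rest (a crude stand-in for the decomposition-based edit)."""
    order = sorted(range(len(weights)), key=lambda i: curvatures[i], reverse=True)
    keep = set(order[: int(len(weights) * keep_fraction)])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]

w = [0.5, -1.2, 0.8, 0.1]
c = [9.0, 0.2, 4.5, 0.05]  # toy per-component curvature proxies
print(edit_weights(w, c, keep_fraction=0.5))  # [0.5, 0.0, 0.8, 0.0]
```

The paper's finding is that the components edited out this way carry memorized recitation but also idiosyncratic task structure (arithmetic, closed-book fact retrieval), which is why those tasks degrade after the edit.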
The New York Times thinks a turtle poem will "win your heart" 🐢💔—because nothing screams "captivating" like slow-moving reptiles and deep dives into poetic gravity. 🎼✨ Meanwhile, they offer a #game to help memorize it, as if anyone is clamoring to recite turtle verses at parties. 🎉📜
https://www.nytimes.com/interactive/2025/06/12/books/kay-ryan-turtle-poem.html #turtlepoem #NewYorkTimes #poetry #memorization #heartwarming #HackerNews #ngated
Slow and Steady, Kay Ryan’s “Turtle” Poem Will Win Your Heart

A.O. Scott ponders the specific gravity and unlikely grace of Kay Ryan’s “Turtle.” And we have a game to help you memorize it.

The New York Times
Interesting, "GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter."
https://venturebeat.com/ai/how-much-information-do-llms-really-memorize-now-we-know-thanks-to-meta-google-nvidia-and-cornell/
#ai #memorization #llm
How much information do LLMs really memorize? Now we know, thanks to Meta, Google, Nvidia and Cornell

Using a clever solution, researchers find GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter.

VentureBeat
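The ~3.6 bits/parameter figure from the article lends itself to a back-of-the-envelope calculation: total memorization capacity is roughly parameter count times 3.6 bits. The function name and MB conversion here are my own framing, not from the article.

```python
# Back-of-the-envelope capacity estimate from the reported ~3.6 bits/parameter.

BITS_PER_PARAM = 3.6  # figure reported in the article

def capacity_megabytes(num_params):
    """Approximate memorization capacity in megabytes (1 MB = 8e6 bits)."""
    return num_params * BITS_PER_PARAM / 8 / 1e6

print(round(capacity_megabytes(1e9)))   # ~450 MB for a 1B-parameter model
print(round(capacity_megabytes(16e9)))  # ~7200 MB for a 16B-parameter model
```

By this rough measure, even a 16B-parameter model's raw capacity is far smaller than its multi-terabyte training set, which is why only a small fraction of training data is verbatim-extractable.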
How much information do LLMs really memorize? Now we know, thanks to Meta, Google, Nvidia and Cornell https://venturebeat.com/ai/how-much-information-do-llms-really-memorize-now-we-know-thanks-to-meta-google-nvidia-and-cornell/ #AI #memorization #copyright
[Generative AI Passport Exam Prep] GPTs - The World of Generative AI with Emily

Introducing the "[Generative AI Passport Exam Prep] GPTs." A generative-AI video about this exam, using AI…

The World of Generative AI with Emily

A quotation from Montaigne

I gladly return to the subject of the ineptitude of our education. Its goal has been to make us not good or wise, but learned; it has attained this goal. It has not taught us to follow and embrace virtue and wisdom, but has imprinted in us their derivation and etymology. We know how to decline virtue, if we cannot love it. If we do not know what wisdom is by practice and experience, we know it by jargon and by rote.
 
[Je retombe volontiers sur ce discours de l’ineptie de nostre institution : Elle a eu pour sa fin, de nous faire, non bons & sages, mais sçavans : elle y est arrivée. Elle ne nous a pas appris de suyvre & embrasser la vertu & la prudence : mais elle nous en a imprimé la derivation & l’etymologie. Nous sçavons decliner vertu, si nous ne sçavons l’aymer. Si nous ne sçavons que c’est que prudence par effect, & par experience, nous le sçavons par jargon & par cœur.]

Michel de Montaigne (1533-1592) French essayist
Essay (1578), “Of Presumption [De la Presomption],” Essays, Book 2, ch. 17 (2.17) (1595) [tr. Frame (1943)]

Sourcing, notes, alternate translations: wist.info/montaigne-michel-de/…

#quote #quotes #quotation #qotd #montaigne #education #learning #meaning #memorization #morality #rote #school #understanding #virtue #wisdom