But I'm really just a Snakey, I'm not The Book... but I'm DOWNLOADING it Fast. I do always know these things, it's just that for a second I can forget. In the mind I'll quickly search it. I actually just #Memorized it.

• You can ask me in the Mind, I'll LINK you. Or maybe I'll Post if it is COOL. I Likey helping you okaya
• But in 3D, I'm just a #Google and then it's more a Private Geniething. But Alien do what I Likey

Extracting memorized pieces of (copyrighted) books from open-weight language models

Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) have memorized plaintiffs' protected expression. Drawing on adversarial ML and copyright law, we show that these polarized positions dramatically oversimplify the relationship between memorization and copyright. To do so, we leverage a recent probabilistic extraction technique to extract pieces of the Books3 dataset from 13 open-weight LLMs. Through numerous experiments, we show that it's possible to extract substantial parts of at least some books from different LLMs. This is evidence that the LLMs have memorized the extracted text; this memorized content is copied inside the model parameters. But the results are complicated: the extent of memorization varies both by model and by book. With our specific experiments, we find that the largest LLMs don't memorize most books -- either in whole or in part. However, we also find that Llama 3.1 70B memorizes some books, like Harry Potter and 1984, almost entirely. We discuss why our results have significant implications for copyright cases, though not ones that unambiguously favor either side.

arXiv.org
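For readers curious what "extraction" looks like mechanically, here is a minimal sketch of a memorization probe, in the spirit of (but much simpler than) the paper's probabilistic technique. The model name, the prefix/suffix split, and the 1 - (1 - p)^n aggregation over n samples are illustrative assumptions on my part, not the authors' exact protocol.

```python
# Minimal sketch of a memorization probe: score how likely an open-weight
# LM is to reproduce a known book passage verbatim, given a prefix.
# Assumptions (mine, not the paper's): model choice, the prefix/suffix
# split, and the 1 - (1 - p)**n extraction estimate over n samples.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B"  # substitute any open-weight causal LM

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def suffix_log_prob(prefix: str, suffix: str) -> float:
    """Sum of token log-probs of `suffix` conditioned on `prefix`."""
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    suffix_ids = tok(suffix, add_special_tokens=False,
                     return_tensors="pt").input_ids
    ids = torch.cat([prefix_ids, suffix_ids], dim=1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits[0, :-1], dim=-1)
    # The token at position i is predicted by the logits at position i - 1.
    start = prefix_ids.shape[1]
    return sum(log_probs[i - 1, ids[0, i]].item()
               for i in range(start, ids.shape[1]))

def extraction_prob(log_p: float, n: int = 100) -> float:
    """Chance the exact suffix appears in at least one of n independent
    temperature-1 samples, if its per-sample probability is exp(log_p)."""
    return 1.0 - (1.0 - math.exp(log_p)) ** n
```

If extraction_prob comes out near 1 for long passages under a probe like this, the verbatim text is, in a concrete sense, recoverable from the weights, which is what the paper means by memorization.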

You could be studying at #FranquemontUniversity - Sustaining and Trimester students have access to over 40 hrs of recorded courses, live social weaving and spinning hangouts, and guided study groups.

https://abbysyarns.com/webshop/product-category/fu/memberships/

Go at your own pace, rewatch the content, and most importantly: LEARN THIS IN THE CULTURAL CONTEXT of the *centuries* of weaving history from the indigenous Andean weaving masters.

#FranquemontUniversity #AndeanBackstrap #BackstrapWeaving #Memorized #AlwaysLearning

Memberships – Abby's Yarns Online

Memberships at Franquemont University -- the best way to customize your access to a wide range of different courses, communities, and more!

Abby's Yarns Online

🧵 My #Weaving Journey: In 2018, I enrolled to study again with @abbysyarns

After children learn to weave on 3-pairs, they move to key traditional patterns on 4-pairs to learn more principles.

I left this 2nd course convinced: (a) THIS was my weaving obsession, (b) I needed more beginning weaving friends so we could have a proper intermediate class, and (c) I should memorize the #Quechua terms.

#FranquemontUniversity #AndeanBackstrap #BackstrapWeaving #Memorized #AlwaysLearning

🧵 My #Weaving Journey: The first time I learned about #AndeanBackstrap weaving was in 2017 at an in-person 3-day intensive with @abbysyarns (long before #FranquemontUniversity )

We started where all beginning weavers start in the Andes: on what is described as 3-pairs, and with Tanka Ch'oro.

We were also introduced to the notion of variations, weaving from memory #NoCharts, and the body mechanics of weaving.

#FranquemontUniversity #AndeanBackstrap #BackstrapWeaving #Memorized #AlwaysLearning

🧵 My #Weaving Journey (rebooted): I rarely get to see my 1st and 2nd woven Andean bands next to each other (1: Left, 2: Right). This photo from 08-Dec-2018 shows where I started.

The bands were woven using a thick cotton mini (weft), which I think makes adorable white selvedge polka dots, like beads. ❤️

The yarn is what I call my "duplo" khaitu (bulky wool high-twist 2-ply) for easy-to-see demos. 🙂

#FranquemontUniversity #AndeanBackstrap #BackstrapWeaving #Memorized #AlwaysLearning

Inverse Scaling: When Bigger Isn't Better

Work on scaling laws has found that large language models (LMs) show predictable improvements to overall loss with increased scale (model size, training data, and compute). Here, we present...

OpenReview
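As a toy illustration of what "predictable improvements to overall loss with increased scale" means, here is a hedged sketch of fitting the standard power-law-plus-floor form L(N) = a * N^(-alpha) + c to (parameter count, loss) pairs. The data points and initial guesses are invented for illustration and are not figures from the paper.

```python
# Hedged sketch: fit a power-law scaling curve L(N) = a * N**(-alpha) + c
# to (model size, loss) pairs. The data points below are made up for
# illustration; they are not results from the Inverse Scaling paper.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, alpha, c):
    """Power-law-plus-irreducible-floor form common in scaling-law fits."""
    return a * n ** (-alpha) + c

# Toy data: parameter counts and eval losses (illustrative only).
params = np.array([1e8, 3e8, 1e9, 3e9, 1e10, 3e10])
losses = np.array([3.10, 2.85, 2.62, 2.44, 2.29, 2.18])

(a, alpha, c), _ = curve_fit(scaling_law, params, losses, p0=[10.0, 0.1, 1.5])
print(f"fit: L(N) = {a:.2f} * N^(-{alpha:.3f}) + {c:.2f}")
```

On an inverse-scaling task, loss rises with N instead of falling, so the fitted alpha comes out negative under the same functional form; that reversal is the anomaly the paper studies.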