Parameter Lab

5 Followers
5 Following
28 Posts
Empowering individuals and organisations to safely use foundational AI models.
Web: https://parameterlab.de
Location: Tübingen, Germany

🔎 Wonder how to prove an LLM was trained on a specific text? The camera-ready of our Findings of #NAACL 2025 paper is available!
📌 TL;DR: long texts are needed to gather enough evidence to determine whether specific data points were included in the training of LLMs: https://arxiv.org/abs/2411.00154

https://sigmoid.social/@parameterlab/113882723611613838

Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Membership inference attacks (MIA) attempt to verify the membership of a given data sample in the training set for a model. MIA has become relevant in recent years, following the rapid development of large language models (LLM). Many are concerned about the usage of copyrighted materials for training them and call for methods for detecting such usage. However, recent research has largely concluded that current MIA methods do not work on LLMs. Even when they seem to work, it is usually because of the ill-designed experimental setup where other shortcut features enable "cheating." In this work, we argue that MIA still works on LLMs, but only when multiple documents are presented for testing. We construct new benchmarks that measure the MIA performances at a continuous scale of data samples, from sentences (n-grams) to a collection of documents (multiple chunks of tokens). To validate the efficacy of current MIA approaches at greater scales, we adapt a recent work on Dataset Inference (DI) for the task of binary membership detection that aggregates paragraph-level MIA features to enable MIA at document and collection of documents level. This baseline achieves the first successful MIA on pre-trained and fine-tuned LLMs.

arXiv.org

🤓 Want More? Check out the MIA-for-LLMs community page on researchtrend.ai: https://researchtrend.ai/communities/MIALM You can see related works, the evolution of the community, and top authors!

🙌 Team Credits: This research was conducted by Haritz Puerto
@mgubri @oodgnas @coallaoh with support from NAVER AI Lab.

🔗 Links: Code and results: https://github.com/parameterlab/mia-scaling Project website: https://haritzpuerto.github.io/scaling-mia/ Paper: https://arxiv.org/pdf/2411.00154

Membership Inference for Language Models (MIALM)

Membership inference for language models investigates the ability to determine if specific data points were part of a model's training set. This research focuses on identifying vulnerabilities in models that allow adversaries to infer training data membership, developing techniques to measure and mitigate these risks.
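The most common membership signal is simply the model's loss on a candidate text: samples seen during training tend to receive higher token probabilities. A minimal sketch of this idea, where the function names, the threshold, and the toy log-probabilities are all hypothetical illustrations, not values from the paper:

```python
def mia_loss_score(token_logprobs):
    """Average negative log-likelihood of a text under the model.
    Lower loss suggests the model may have seen the text in training."""
    return -sum(token_logprobs) / len(token_logprobs)

def is_member(token_logprobs, threshold=2.0):
    """Classify a sample as a training-set member if its loss falls below
    a threshold. The threshold here is hypothetical; in practice it would
    be calibrated on known member/non-member data."""
    return mia_loss_score(token_logprobs) < threshold

# Toy per-token log-probabilities: a memorised sample tends to score high,
# an unseen sample low.
seen = [-0.1, -0.2, -0.05, -0.15]    # hypothetical member sample
unseen = [-2.5, -3.1, -2.8, -2.9]    # hypothetical non-member sample
print(is_member(seen), is_member(unseen))  # → True False
```

As the paper shows, this per-sample signal is too noisy on its own at the sentence level; it only becomes reliable once scores are aggregated over enough tokens.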

๐ŸŒ Implications for Data Privacy: Our findings have real-world relevance for data owners worried about unauthorized use of their content in model training. It can also be used to support accountability of LLM evaluation in end-tasks.

💬 What Do You Think? Could MIA reach a level where data owners use it as legal evidence? How might this affect LLM deployment? Let us know! #AI #LLM #NLProc

🔬 Our Testing Setup: We ran experiments using Pythia models (2.8B and 6.9B parameters) with training samples from The Pile dataset, comparing them to validation and test sets. This setup avoids data leakage to ensure a reliable evaluation of MIA.
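Setups like this are typically scored with AUROC: the probability that a randomly chosen member receives a more member-like score than a randomly chosen non-member. A minimal pairwise implementation on hypothetical toy scores (not the paper's actual numbers):

```python
def auroc(member_scores, nonmember_scores):
    """AUROC via pairwise comparison: fraction of (member, non-member)
    pairs ranked correctly, counting ties as half. Higher scores are
    assumed to be more member-like."""
    pairs = [(m, n) for m in member_scores for n in nonmember_scores]
    wins = sum(1.0 if m > n else 0.5 if m == n else 0.0 for m, n in pairs)
    return wins / len(pairs)

# Hypothetical member-likelihood scores for illustration only.
members = [0.9, 0.8, 0.5]
nonmembers = [0.6, 0.3, 0.2]
print(auroc(members, nonmembers))  # 8 of 9 pairs ranked correctly
```

An AUROC of 0.5 means the attack is no better than chance, which is roughly where sentence-level MIA lands; values approaching 1.0 indicate a reliable attack.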

🔎 Better Results in Fine-Tuning: Fine-tuned models show even stronger MIA results. The table shows performance at the sentence level and for collections of 20 sentences, evaluated on Phi-2 fine-tuned for QA (https://huggingface.co/haritzpuerto/phi-2-dcot).


🚀 The Key? Number of Tokens & Aggregation: MIA's accuracy improves as we aggregate MIA scores across multiple paragraphs. Longer documents or larger document collections significantly boost MIA effectiveness.
🛠️ First Success on Pre-Trained LLMs: By adapting recent work on Dataset Inference, we successfully applied MIA to pre-trained LLMs. Check out our figure below: MIA achieves an AUROC of 0.75 on documents of up to 20k tokens!
🔍 New Benchmark, New Insights: We developed a new benchmark to assess MIA effectiveness across data scales, from single sentences to document collections. This lets us identify precisely when and how MIA succeeds on LLMs.

It is commonly assumed that Membership Inference Attacks (MIA) do not work on LLMs, but our new paper shows they can work at the right scale! MIA is effective when the number of input tokens is large enough, as in long documents and collections of documents.

🚨 What's MIA? It's a method to detect whether a specific data sample was used in model training. We show that MIA works effectively on long documents (~20k tokens) and on collections of documents (>100 docs), just the scale relevant for legal applications! 👮

🚨📄 Exciting new research accepted at #NAACL2025! Discover when and at what scale we can detect if specific data was used in training LLMs, a method known as Membership Inference (MIA)! Our findings open new doors for using MIA as potential legal evidence in AI. 🧵 https://arxiv.org/abs/2411.00154
🎉 We're pleased to share the release of the models from our Apricot 🍑 paper, "Calibrating Large Language Models Using Their Generations Only", accepted at ACL 2024!
At Parameter Lab, we believe openness and reproducibility are essential for advancing science, and we've put in our best effort to ensure it.
🤗 https://huggingface.co/collections/parameterlab/apricot-models-673d2cae40b6ff437a86f0bf
📄 https://aclanthology.org/2024.acl-long.824/
🍑 Apricot Models - a parameterlab Collection

Fine-tuned models for black-box LLM calibration, trained for "Apricot: Calibrating Large Language Models Using Their Generations Only" (ACL 2024)