I came across an #ICML2024 paper showing that debate helps with scalable oversight. In essence, a debate 'protocol' between untrustworthy experts (#llm) and a weak judge (LLMs or humans) is better at arriving at the correct answer when ground truth is missing. This is interesting, since I was trying out something similar a few months back.

Here is the pre-print: https://arxiv.org/pdf/2402.06782

One of the insights disproves what I was assuming earlier:

"Insight 7: More non-expert interaction does not improve accuracy. We find identical judge accuracy between static and interactive debate. This suggests that adding non-expert interactions does not help in information-asymmetric debates. This is surprising, as interaction allows judges to direct debates towards their key uncertainties."

But there are more insights (and gaps) that could help improve what I was doing in my post on interventional debates here: https://mathstodon.xyz/@lepisma/113332484552749036

To address this need, CounterfactualExplanations.jl now has support for Trees for Counterfactual Rule Explanations (T-CREx), the most novel and performant approach of its kind, proposed by Tom Bewley and colleagues in their recent #ICML2024 paper: https://proceedings.mlr.press/v235/bewley24a.html

Check out our latest blog post to find out how you can use T-CREx to explain opaque machine learning models in #Julia: https://www.taija.org/blog/posts/counterfactual-rule-explanations/

Counterfactual Metarules for Local and Global Recourse


ICYMI our latest #newsletter: register for AI Day, PhD positions opening, recaps of the webinar on AI and mental health and the ELISE wrap-up conference / @ELLISforEurope community event, FCAI’s research at #icml2024, fall seminars, and more!

Read it all here: https://mailchi.mp/d4d933861a7b/fcai-newsletter-17385670


Text diffusion can finally generate good text!📃

We've combed through the dense math of the “Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution” paper to bring you the key insights and takeaways.👇
📺 https://youtu.be/K_9wQ6LZNpI

The paper won the #ICML2024 best paper award. Congrats to the authors! 👏


Excited to have exhibited at #ICML2024! It was also fantastic to see our #AI4Health researchers Dimitar Georgiev @dimgeorgievv and Sam Channon-Wells @sam_channon present their ground-breaking research. Proud of their contributions to advancing AI for healthcare! #AI4Health #UKRI
If you are curious how different #uncertainty estimation methods perform when they are tested on NN models trained on large scale industrial #pharma data, visit our poster in room A8 (AI4Science) from 16:15 #ICML2024 @RosaFriesi @adamgld Emma Svensson @AiddOne

Large SMILES-based Transformer Encoder-Decoder released by @IBM at #icml2024. Trained on 91 million curated SMILES from #pubchem

--> https://github.com/IBM/materials/tree/main/smi-ted

#cheminformatics

materials/smi-ted at main · IBM/materials

Foundation Model for materials.

#icml2024 paper:
How are LLMs used in peer reviews?
~10% of ICLR review sentences are likely auto-generated.
LLM usage is higher in reviews submitted closer to the deadline,
and lower in reviews that cite at least one other paper.
https://arxiv.org/abs/2403.07183

#ML #machinelearning #NLP
#NLProc #LLM #LLMs #data #DataScience

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023 and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews which report lower confidence, were submitted close to the deadline, and from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text which may be too subtle to detect at the individual level, and discuss the implications of such trends on peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
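The maximum-likelihood idea in the abstract can be sketched in a few lines. This is a hypothetical toy illustration, not the authors' code: it assumes we already have, for each document, a likelihood under a human-written reference model and under an AI-generated reference model, and it fits only the mixture weight alpha (the estimated LLM-modified fraction) by maximum likelihood.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_llm_fraction(p_human, p_ai):
    """Fit the corpus-level mixture (1 - alpha) * P_human + alpha * P_ai.

    p_human, p_ai: per-document likelihoods under the two reference
    models (hypothetical inputs; the paper estimates these from
    expert-written and AI-generated reference texts).
    Returns the maximum-likelihood estimate of alpha in [0, 1].
    """
    p_human = np.asarray(p_human, dtype=float)
    p_ai = np.asarray(p_ai, dtype=float)

    def neg_log_likelihood(alpha):
        mix = (1.0 - alpha) * p_human + alpha * p_ai
        return -np.sum(np.log(mix + 1e-300))  # guard against log(0)

    res = minimize_scalar(neg_log_likelihood, bounds=(0.0, 1.0),
                          method="bounded")
    return res.x

# Toy corpus: 7 documents look human-written, 3 look AI-generated.
p_h = np.array([0.9, 0.8, 0.85, 0.9, 0.75, 0.1, 0.2, 0.15, 0.9, 0.8])
p_a = np.array([0.1, 0.2, 0.15, 0.1, 0.25, 0.9, 0.8, 0.85, 0.1, 0.2])
alpha_hat = estimate_llm_fraction(p_h, p_a)
```

The estimate operates at the corpus level, which is the point made in the abstract: alpha can be pinned down reliably even when no single document can be confidently classified.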

Fantastic talk by @vukosi at #icml2024 on AI in Africa and on African languages, but more generally on AI outside the Western-world bubble. ICML is the perfect spot to highlight this incredible discrepancy between current work on absurdly huge models, with nearly obscene hardware requirements, trained on essentially the entire (English!) internet, vs. languages for which we have to work with extremely tiny corpora (and in communities with minimal access to compute). #NLP #llm
Pruning gets worse with overparametrization.
Testing their combinatorial method, Zhang & Papayan (@stats285) find that when you add (unneeded) parameters, you end up with a larger (absolute) number of parameters for the same performance.
#ICML2024