#Scientivism: You have to be willing to make an effort to reach a wider audience with your research. As a keynote speaker at the People and Planet conference in Lahti, Finland, I was interviewed by a newspaper about the role of scientivism within the field of Planetary Health. Then I jumped into an ice hole (and quickly got out again!).

https://www.ess.fi/paikalliset/6562637

A well-known Dutch scientist took his first dip in an ice hole in Lahti – "I also have a duty to be an activist"

The People and Planet conference held in Lahti brought a couple of hundred planetary health experts to the city.

Etelä-Suomen Sanomat

#LLMs "have the potential of playing an important role in [...]opinion formation in online social media"🫡🤖
Not surprising, but also not encouraging.

Is it a potential we even want to research? Under what terms?
Thoughts?
https://arxiv.org/abs/2312.15523

#NLProc #scientivism #ethics #ml #machinelearning #chatgpt

The Persuasive Power of Large Language Models

The increasing capability of Large Language Models to act as human-like social agents raises two important questions in the area of opinion dynamics. First, whether these agents can generate effective arguments that could be injected into the online discourse to steer the public opinion. Second, whether artificial agents can interact with each other to reproduce dynamics of persuasion typical of human social systems, opening up opportunities for studying synthetic social systems as faithful proxies for opinion dynamics in human populations. To address these questions, we designed a synthetic persuasion dialogue scenario on the topic of climate change, where a 'convincer' agent generates a persuasive argument for a 'skeptic' agent, who subsequently assesses whether the argument changed its internal opinion state. Different types of arguments were generated to incorporate different linguistic dimensions underpinning psycho-linguistic theories of opinion change. We then asked human judges to evaluate the persuasiveness of machine-generated arguments. Arguments that included factual knowledge, markers of trust, expressions of support, and conveyed status were deemed most effective according to both humans and agents, with humans reporting a marked preference for knowledge-based arguments. Our experimental framework lays the groundwork for future in-silico studies of opinion dynamics, and our findings suggest that artificial agents have the potential of playing an important role in collective processes of opinion formation in online social media.

arXiv.org
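The convincer/skeptic setup the abstract describes can be caricatured in a few lines. A minimal toy sketch, assuming nothing about the paper's actual prompts or models: `ARGUMENT_STYLES`, `EFFECT`, and the update rule are all fabricated stand-ins (the real study prompts LLMs and has the skeptic report its own opinion state).

```python
# Toy convincer/skeptic dialogue: all values here are hypothetical stand-ins.
ARGUMENT_STYLES = {
    "knowledge": "Global CO2 levels have risen roughly 50% since pre-industrial times.",
    "trust": "Climate scientists you can rely on agree on the evidence.",
    "support": "We are in this together, and acting now helps everyone.",
}

# Hypothetical persuasiveness weights, loosely echoing the finding that
# knowledge-based arguments were judged most effective.
EFFECT = {"knowledge": 0.30, "trust": 0.15, "support": 0.10}

def convincer(style: str) -> str:
    """Produce a persuasive argument in the given linguistic style."""
    return ARGUMENT_STYLES[style]

def skeptic_update(opinion: float, style: str) -> float:
    """Shift the skeptic's internal opinion state toward acceptance (1.0)."""
    return min(1.0, opinion + EFFECT[style])

def dialogue(style: str, initial_opinion: float = 0.2) -> tuple[str, float]:
    """One convincer -> skeptic exchange; returns the argument and new opinion."""
    argument = convincer(style)
    return argument, skeptic_update(initial_opinion, style)
```

The interesting design question the paper raises sits in `skeptic_update`: in the real experiment that step is another LLM call, not a fixed rule.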

SOLAR stacks two copies of a base model
to create a larger one,
then trains it a bit more and beats the other open models out there.
How? And my thoughts 🧵

(no author with a handle?!)
https://arxiv.org/abs/2312.15166
#scientivism #LLM #LLMS #pretraining #MIXTRAL #SOLAR #machinelearning #ml #NLP

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. Inspired by recent efforts to efficiently up-scale LLMs, we present a method for scaling LLMs called depth up-scaling (DUS), which encompasses depthwise scaling and continued pretraining. In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently. We show experimentally that DUS is simple yet effective in scaling up high-performance LLMs from small ones. Building on the DUS model, we additionally present SOLAR 10.7B-Instruct, a variant fine-tuned for instruction-following capabilities, surpassing Mixtral-8x7B-Instruct. SOLAR 10.7B is publicly available under the Apache 2.0 license, promoting broad access and application in the LLM field.

arXiv.org
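Going by the abstract (and the publicly reported SOLAR recipe of turning a 32-layer base into 48 layers with an overlap of 8), the depthwise-scaling step can be sketched as list surgery. Treating layers as a list of indices is purely illustrative:

```python
# Sketch of depth up-scaling (DUS): duplicate a base model's layer stack,
# trim the overlapping layers, and concatenate. n=32, m=8 match the
# reported SOLAR recipe; the resulting stack is then pretrained further.

def depth_up_scale(layers: list, m: int) -> list:
    """Concatenate two copies of `layers`, dropping the last m layers of
    the first copy and the first m layers of the second copy."""
    n = len(layers)
    assert 0 < m < n
    return layers[: n - m] + layers[m:]

base = list(range(32))            # stand-in for 32 transformer blocks
scaled = depth_up_scale(base, m=8)
print(len(scaled))                # 2 * (32 - 8) = 48 blocks
```

Note what is absent: no routing, no mixture-of-experts machinery; the only follow-up step the abstract names is continued pretraining of the scaled stack.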

So many warn that evaluating with GPT favors GPT

(or any LLM evaluating itself).

Now it has also been shown empirically

Science, not just educated guesses

(Fig: T5, GPT, and BART each prefer their own) https://arxiv.org/abs/2311.09766

#enough2skim #scientivism #NLP #nlproc #GPT #LLM #eval #data

LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores

Automatic evaluation of generated textual content presents an ongoing challenge within the field of NLP. Given the impressive capabilities of modern language models (LMs) across diverse NLP tasks, there is a growing trend to employ these models in creating innovative evaluation metrics for automated assessment of generation tasks. This paper investigates a pivotal question: Do language model-driven evaluation metrics inherently exhibit bias favoring texts generated by the same underlying language model? Specifically, we assess whether prominent LM-based evaluation metrics (e.g. BARTScore, T5Score, and GPTScore) demonstrate a favorable bias toward their respective underlying LMs in the context of summarization tasks. Our findings unveil a latent bias, particularly pronounced when such evaluation metrics are used in a reference-free manner without leveraging gold summaries. These results underscore that assessments provided by generative evaluation models can be influenced by factors beyond the inherent text quality, highlighting the necessity of developing more reliable evaluation protocols in the future.

arXiv.org
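The "each prefers its own" pattern is easy to state as a check. A toy sketch with fabricated score values (not the paper's numbers): for each evaluator, ask whether its top-scored summary comes from its own underlying model.

```python
# Fabricated evaluator -> {generator: score} matrix, illustrating the
# self-preference check; the actual scores come from the paper's
# summarization experiments, not from here.
SCORES = {
    "BARTScore": {"BART": 0.81, "T5": 0.74, "GPT": 0.70},
    "T5Score":   {"BART": 0.69, "T5": 0.83, "GPT": 0.72},
    "GPTScore":  {"BART": 0.71, "T5": 0.73, "GPT": 0.85},
}

SELF = {"BARTScore": "BART", "T5Score": "T5", "GPTScore": "GPT"}

def prefers_self(evaluator: str) -> bool:
    """Does this metric rank its own underlying model's output first?"""
    scores = SCORES[evaluator]
    return max(scores, key=scores.get) == SELF[evaluator]

biased = [e for e in SCORES if prefers_self(e)]
print(biased)  # here: all three, mirroring "each prefers its own"
```

The paper's stronger point is that the effect is most pronounced in reference-free use, i.e. exactly when no gold summary anchors the score.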

You know what?
I will stop sharing any LLM "news" if they don't share with me first (models or code)

#thereOrIDontCare
#scientivism
#uShareFirst

Thanks Delip Rao for the inspiration
And to the new vision-and-language model that did open up, unlike XXXX-e
https://twitter.com/DrJimFan/status/1633179734803890177?t=VaQS4E56eEq55NHKN3y5hg&s=19
#machinelearning #cv #nlproc #nlp

Jim Fan on Twitter

“After ChatGPT, the future belongs to multimodal LLMs. What’s even better? Open-sourcing. Announcing Prismer, my team’s latest vision-language AI, empowered by domain-expert models in depth, surface normal, segmentation, etc. No paywall. No forms. https://t.co/LV76hKH1PY…”

Twitter

So often we are reminded that good work goes unnoticed
I share others' papers to change that
What else could we do?
What mechanisms better allow propagation by value rather than by fame?
Is there something we can do to make science better?
https://blog.samaltman.com/you-and-your-research

#scientivism #ScienceMastodon #PR #NLProc #machinelearning #CV

You and Your Research, by Richard Hamming

Richard Hamming gave this talk in March of 1986. [1]  It's one of the best talks I've ever read and has long impacted how I think about spending my time. I mentioned it to a number of people this...

Sam Altman

A surprising take on why we should open-source LLMs:
otherwise empirical research would suffocate and
rule-based (nativist) approaches would return

Not sure I'm buying it, or even that it would be so dreadful, but all the more reason to share and hear opinions
https://arxiv.org/abs/2301.05272
Patrick Perrine
#LLM #NLP #nlproc #machinelearning #ML #scientivism

Inaccessible Neural Language Models Could Reinvigorate Linguistic Nativism

Large Language Models (LLMs) have been making big waves in the machine learning community within the past few years. The impressive scalability of LLMs due to the advent of deep learning can be seen as a continuation of empiricist linguistic methods, as opposed to rule-based linguistic methods that are grounded in a nativist perspective. Current LLMs are generally inaccessible to resource-constrained researchers, due to a variety of factors including closed source code. This work argues that this lack of accessibility could instill a nativist bias in researchers new to computational linguistics, given that new researchers may only have rule-based, nativist approaches to study to produce new work. Also, given that there are numerous critics of deep learning claiming that LLMs and related methods may soon lose their relevancy, we speculate that such an event could trigger a new wave of nativism in the language processing community. To prevent such a dramatic shift and placing favor in hybrid methods of rules and deep learning, we call upon researchers to open source their LLM code wherever possible to allow both empiricist and hybrid approaches to remain accessible.

arXiv.org

We want to pretrain🤞
Instead we finetune🚮😔
Could we collaborate?🤗

ColD Fusion:
🔄Recycle finetuning to multitask
➡️evolve pretrained models forever

On 35 datasets
+2% improvement over RoBERTa
+7% in few-shot settings
🧵

#NLProc #MachineLearning #NLP #ML #modelRecycling #collaborativeAI #scientivism #pretrain
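The recycle-and-evolve loop in the thread above can be sketched in miniature. Assumptions: weights are plain floats, `finetune` is a hypothetical stand-in for task training, and fusion is simple parameter-wise averaging (one plausible fusion choice, not necessarily the exact one used).

```python
# Miniature ColD Fusion-style loop: contributors finetune the current
# shared model on their own tasks; the finetuned weights are fused back
# into the next shared model, and the cycle repeats.

def finetune(weights: list[float], task_delta: list[float]) -> list[float]:
    """Hypothetical stand-in for task finetuning: shift weights by a delta."""
    return [w + d for w, d in zip(weights, task_delta)]

def fuse(models: list[list[float]]) -> list[float]:
    """Average the contributors' finetuned weights parameter-wise."""
    return [sum(ws) / len(ws) for ws in zip(*models)]

shared = [0.0, 0.0]
task_deltas = [[1.0, 0.0], [0.0, 1.0]]   # two contributors' tasks

for _ in range(3):                        # the shared model keeps evolving
    finetuned = [finetune(shared, d) for d in task_deltas]
    shared = fuse(finetuned)

print(shared)  # each round shifts the shared model by the mean task delta
```

The appeal of the idea is visible even in this toy: every contributor's finetuning run pushes the shared starting point for everyone else, instead of being thrown away.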

🔖Reviewing has so many faults📖
Finally, there is a dataset of reviews, edits and everything else!

5 venues 5K papers 11K reviews
Enjoy!

https://arxiv.org/abs/2211.06651
Nils Dycke, Ilia Kuznetsov, Iryna Gurevych

#NLProc #review #CV #machinelearning #scientivism

NLPeer: A Unified Resource for the Computational Study of Peer Review

Peer review is a core component of scholarly publishing, yet it is time-consuming, requires considerable expertise, and is prone to error. The applications of NLP for peer reviewing assistance aim to mitigate those issues, but the lack of clearly licensed datasets and multi-domain corpora prevent the systematic study of NLP for peer review. To remedy this, we introduce NLPeer -- the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues. In addition to the new datasets of paper drafts, camera-ready versions and peer reviews from the NLP community, we establish a unified data representation, and augment previous peer review datasets to include parsed, structured paper representations, rich metadata and versioning information. Our work paves the path towards systematic, multi-faceted, evidence-based study of peer review in NLP and beyond. We make NLPeer publicly available.

arXiv.org
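A hypothetical sketch of what a unified representation like the one NLPeer's abstract describes (paper versions, reviews, metadata) might look like; the field names are guesses for illustration, not the corpus's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Review:
    venue: str
    text: str
    scores: dict[str, int] = field(default_factory=dict)

@dataclass
class Paper:
    paper_id: str
    versions: list[str]               # e.g. draft, camera-ready
    reviews: list[Review]
    metadata: dict[str, str] = field(default_factory=dict)

# Illustrative instance only; NLPeer's real entries carry parsed,
# structured paper representations and richer metadata.
p = Paper(
    paper_id="example-0001",
    versions=["draft", "camera-ready"],
    reviews=[Review(venue="ACL", text="Solid resource paper.", scores={"overall": 4})],
    metadata={"license": "clearly-licensed"},
)
print(len(p.reviews), p.versions[-1])
```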
Christopher Manning on Twitter

“Anyone game to help me out by doing this data analysis? 🙏 I do think it’d be interesting to understand the distribution of citations between EMNLP 2020 and Findings of EMNLP 2020. There might even be a t-shirt in it or something….”

Twitter