In the interview, she talks about her path into research and about how diverse perspectives enrich science and artificial intelligence (#KünstlicheIntelligenz): https://blog.tib.eu/2026/06/25/frauen-in-der-wissenschaft-dr-jennifer-dsouza/
Top LLM questions: text generation strategies and LLM evaluation metrics
NLP/LLM interviews often test not whether you know the terms top-k, top-p, and BLEU, but whether you understand what happens to the probability distribution, why greedy decoding gets stuck in loops, what temperature is for, and why BLEU is a poor measure of modern LLM answers. This article is a checklist covering language modeling, generation strategies, and quality metrics. It is not a full lecture from scratch but a drill worth running through before a technical NLP interview, to close gaps and refresh the necessary fundamentals.
https://habr.com/ru/articles/1044418/
#машинное_обучение #искусственный_интеллект #naturallanguageprocessing #deeplearning #large_language_model #data_science
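The decoding strategies named in the post above (greedy, temperature, top-k, top-p) can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the linked article: the function names, the toy logits, and the three-token vocabulary are all assumptions made for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(probs):
    """Greedy decoding: always pick the single most probable token index."""
    return max(range(len(probs)), key=probs.__getitem__)


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Nucleus sampling: keep the smallest prefix of tokens (by descending
    probability) whose cumulative mass reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def sample(dist, rng=random):
    """Draw one token index from a {index: probability} mapping."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
toy_logits = [2.0, 1.0, 0.1]
toy_probs = softmax(toy_logits)
next_token = sample(top_p_filter(toy_probs, 0.8))
```

Greedy decoding is deterministic, which is one reason it can fall into repeated loops; the sampling variants trade that determinism for diversity by drawing from a truncated, renormalized distribution.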
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
https://arxiv.org/abs/2309.12288
#HackerNews #ReversalCurse #LLMs #MachineLearning #AIResearch #NaturalLanguageProcessing

We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A". This is the Reversal Curse. For instance, if a model is trained on "Valentina Tereshkova was the first woman to travel to space", it will not automatically be able to answer the question, "Who was the first woman to travel to space?". Moreover, the likelihood of the correct answer ("Valentina Tereshkova") will not be higher than for a random name. Thus, models do not generalize a prevalent pattern in their training set: if "A is B" occurs, "B is A" is more likely to occur. It is worth noting, however, that if "A is B" appears in-context, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as "Uriah Hawthorne is the composer of Abyssal Melodies" and showing that they fail to correctly answer "Who composed Abyssal Melodies?". The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as "Who is Tom Cruise's mother? [A: Mary Lee Pfeiffer]" and the reverse "Who is Mary Lee Pfeiffer's son?". GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. Code available at: https://github.com/lukasberglund/reversal_curse.
RE: https://mathstodon.xyz/@xameer/116744860546236676
A statistical #naturallanguageprocessing method has been applied to automatically predict the outcome of cases tried by the European Court of Human Rights (violation or no violation of a specific article) based on their textual contents, reaching a prediction accuracy of 79%.[24] A subsequent qualitative analysis of these results provided some support for the theory of legal realism. The authors write: "In general, and notwithstanding the simplified snapshot of a very complex debate that we just presented, our results could be understood as lending some support to the basic legal realist intuition according to which judges are primarily responsive to non-legal, rather than to legal, reasons when they decide hard cases."
#humanrights
OpenAI Upgrades GPT-5.5 Model with Improved Accuracy and Conversational Style
OpenAI has released a major update to its GPT-5.5 model, improving accuracy and conversational style so that interactions feel more human and natural. The new version promises more readable and engaging responses, with a focus on practical help tasks.
#Gpt55 #ArtificialIntelligence #ConversationalAi #EmergingTechnologies #NaturalLanguageProcessing
CS336: Language Modeling from Scratch
#HackerNews #CS336 #LanguageModeling #StanfordAI #MachineLearning #NaturalLanguageProcessing #TechEducation
Prompt Politeness Affects LLM Accuracy
https://arxiv.org/abs/2510.04950
#HackerNews #PromptPoliteness #LLMAccuracy #AIResearch #NaturalLanguageProcessing #MachineLearning

The wording of natural language prompts has been shown to influence the performance of large language models (LLMs), yet the role of politeness and tone remains underexplored. In this study, we investigate how varying levels of prompt politeness affect model accuracy on multiple-choice questions. We created a dataset of 50 base questions spanning mathematics, science, and history, each rewritten into five tone variants: Very Polite, Polite, Neutral, Rude, and Very Rude, yielding 250 unique prompts. Using ChatGPT 4o, we evaluated responses across these conditions and applied paired sample t-tests to assess statistical significance. Contrary to expectations, impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts. These findings differ from earlier studies that associated rudeness with poorer outcomes, suggesting that newer LLMs may respond differently to tonal variation. Our results highlight the importance of studying pragmatic aspects of prompting and raise broader questions about the social dimensions of human-AI interaction.
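The paired-sample t-test mentioned in the abstract above can be sketched with the standard library alone. This is only an illustration of the statistic, not the authors' analysis code: the function name and the per-run accuracy values are hypothetical and are not taken from the paper.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the pairwise differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    spread = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return statistics.mean(diffs) / (spread / math.sqrt(n)), n - 1


# Hypothetical accuracies (%) from repeated runs of two tone variants.
rude_runs = [84.0, 86.0, 85.0, 83.0, 86.0]
polite_runs = [81.0, 80.0, 82.0, 80.0, 81.0]
t_value, dof = paired_t_statistic(rude_runs, polite_runs)
```

The resulting t value would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value; the pairing matters because each run is evaluated on the same base questions under both tones.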
How Can We Prevent AI Models From Cannibalizing Themselves When Human-Generated Data Runs Out?
While the evolution of artificial intelligence (AI) systems has shown no sign of slowing, there's a growing concern that large language models (LLMs) will soon run out of human-made data to ingest and learn from. Once this happens, scientists say, AI models will increasingly rely on synthetic AI-made information, which will lead to an effect called "model collapse." By: Roland Moore-Colyer. Source: Live Science.
I'd like to introduce #Emily, an #OpenSource #InformationRetrieval system I've been working on in my spare time.