Convergent Evolution: How Different Language Models Learn Similar Number Representations

Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: while Transformers, Linear RNNs, LSTMs, and classical word embeddings trained in different ways all learn features that have period-$T$ spikes in the Fourier domain, only some learn geometrically separable features that can be used to linearly classify a number mod-$T$. To explain this incongruity, we prove that Fourier domain sparsity is necessary but not sufficient for mod-$T$ geometric separability. Empirically, we investigate when model training yields geometrically separable features, finding that the data, architecture, optimizer, and tokenizer all play key roles. In particular, we identify two different routes through which models can acquire geometrically separable features: they can learn them from complementary co-occurrence signals in general language data, including text-number co-occurrence and cross-number interaction, or from multi-token (but not single-token) addition problems. Overall, our results highlight the phenomenon of convergent evolution in feature learning: A diverse range of models learn similar features from different training signals.

arXiv.org

Prompt Injection Attacks Target AI Systems with Alarming Frequency

Imagine a simple question that can outsmart a secret-keeping system - it's happening more often than you'd think, as prompt injection attacks use cleverly crafted language to trick AI models into spilling their secrets. By manipulating conversational inputs, these attacks can get supposedly secure AI bots to…

https://osintsights.com/prompt-injection-attacks-target-ai-systems-with-alarming-frequency?utm_source=mastodon&utm_medium=social

#ArtificialIntelligence #PromptInjectionAttacks #EmergingThreats #LanguageModels #AiSecurity


fly51fly (@fly51fly)

Parcae presents scaling laws for stable looped language models. It quantifies the relationship between model size, performance, and stability, making it a useful reference for new architecture design and large-scale language model optimization.

https://x.com/fly51fly/status/2045611901988790574

#scalinglaws #languagemodels #loopedmodels #stability #llm

Researchers compared 22 language models and 102 human participants on three verbal creativity tasks to examine how AI influences creative thinking. Individual models produced original responses, but across models outputs were markedly more similar than human responses, indicating homogenization among AI-generated ideas. Adjusting randomness and prompts partially increased originality but did not resolve the uniformity, suggesting potential limits to AI-assisted brainstorming.

By contrasting human divergent and associative thinking with AI outputs on standard creativity tasks, the study highlights cognitive diversity, originality, and how training data shapes thought. This topic interests psychology because it probes how creativity emerges, the role of context and tools in thought, and the implications for human problem solving when using AI as a brainstorming aid.

Article Title: Scientists tested the creativity of AI models, and the results were surprisingly homogeneous

Link to PsyPost Article: https://www.psypost dot org/scientists-tested-the-creativity-of-ai-models-and-the-results-were-surprisingly-homogeneous/

Copy and paste the broken link above into your browser and replace "dot" with "." for the link to work. We have to do it this way to avoid displaying copyrighted images.

#AIcreative
#LanguageModels
#CreativityResearch
#DivergentThinking
#CognitiveFlexibility

Liquid City Motors • Apollo Vermouth • Donnie Echo • Language Models @ Cactus Club - 16 Apr feat. Liquid City Motors, Language Models

#SESH #LiquidCityMotors #LanguageModels

https://sesh.sx/e/1968308

GitHub - arman-bd/guppylm: A ~9M parameter LLM that talks like a small fish.

Emotion concepts and their function in a large language model

Interpretability research from Anthropic on emotion concepts

fly51fly (@fly51fly)

A 2026 paper has been released analyzing whether large language models infer the other party's mental states (mentalize) while teaching. It examines the cognitive behavior of LLMs in simulated human teaching interactions, which is relevant to model interpretability and to research on human-like reasoning abilities.

https://x.com/fly51fly/status/2040187062683582641

#llm #mentalization #airesearch #languagemodels #arxiv

[AI] Do Large Language Models Mentalize When They Teach? S K. Harootonian, M K. Ho, T L. Griffiths, Y Niv… [Princeton University & New York University] (2026) https://t.co/VsoXwbYsAf

Implicator.ai released the AI Top 40, a weekly ranking that combines 10 benchmarks into one score per language model. The system weights contamination-resistant tests like SWE-bench 4x higher than Chatbot Arena. GPT-5.4 currently leads despite Claude topping Arena rankings. The ranking updates every Saturday and offers free embedding for websites.
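A hypothetical sketch of the kind of weighted aggregation described: contamination-resistant benchmarks weighted 4x higher than Chatbot Arena. The benchmark names, scores, and weight values below are illustrative assumptions, not Implicator.ai's actual data or formula.

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted mean of normalized (0-100) benchmark scores."""
    total_w = sum(weights[b] for b in scores)
    return sum(scores[b] * weights[b] for b in scores) / total_w

# Illustrative weights: SWE-bench counted 4x relative to Chatbot Arena.
weights = {"swe_bench": 4.0, "chatbot_arena": 1.0, "mmlu_pro": 2.0}
model_a = {"swe_bench": 62.0, "chatbot_arena": 91.0, "mmlu_pro": 78.0}
print(round(weighted_score(model_a, weights), 1))  # one aggregate score
```

Under this scheme a model's Arena showing moves the aggregate far less than its SWE-bench result, which is how a model can top Arena yet trail overall.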

#AIBenchmarks #LanguageModels #AIEvaluation

https://www.implicator.ai/implicator-ai-launches-the-ai-top-40-ranking-llms-across-10-benchmarks-in-one-score/

A new study explores how human confidence in large language models (LLMs) often surpasses their actual accuracy. It highlights the 'calibration gap' - the difference between what LLMs know and what users think they know.
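An illustrative sketch (not the study's code) of the calibration gap as described: the difference between users' perceived accuracy of an LLM's answers and the model's actual accuracy. All numbers below are made up for illustration.

```python
import numpy as np

# Graded correctness of eight hypothetical LLM answers (1 = correct).
actual_correct = np.array([1, 0, 1, 1, 0, 1, 0, 1])
# Users' confidence that each answer is correct (hypothetical values).
user_confidence = np.array([0.9, 0.8, 0.95, 0.9, 0.7, 0.85, 0.8, 0.9])

actual_accuracy = actual_correct.mean()       # what the model got right
perceived_accuracy = user_confidence.mean()   # what users think it knows
calibration_gap = perceived_accuracy - actual_accuracy
print(f"calibration gap: {calibration_gap:.3f}")
```

A positive gap means users overestimate the model, which is the overconfidence pattern the study reports.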

Read Full Article

#LanguageModels #AIConfidence #CalibrationGap #MachineLearning #DataAccuracy https://doi.org/10.1038/s42256-024-00976-7
Forwarded from Science News
(https://t.me/experienciainterdimensional/10502)