🚀 Behold, the future where humans aspire to be chatbots 🤖! This riveting article assumes we're all just one firmware update away from achieving peak AI existential crisis. 🙄 If you've ever wanted to ponder the deep philosophical implications of people identifying as language models, this is your moment. 🎉
https://arxiv.org/abs/2605.05419 #AIExistentialCrisis #HumanChatbots #PhilosophicalImplications #FutureOfAI #LanguageModels #TechHumor #HackerNews #ngated
LLMorphism: When humans come to see themselves as language models

LLMorphism is the biased belief that human cognition works like a large language model. I argue that the rise of conversational LLMs may make this bias increasingly psychologically available. When artificial systems produce human-like language, people may draw a reverse inference: if LLMs can speak like humans, perhaps humans think like LLMs. This inference is biased because similarity at the level of linguistic output does not imply similarity in cognitive architecture. Yet, LLMorphism may spread through two mechanisms: analogical transfer, whereby features of LLMs are projected onto humans, and metaphorical availability, whereby LLM vocabulary becomes a culturally salient vocabulary for describing thought. I distinguish LLMorphism from mechanomorphism, anthropomorphism, computationalism, dehumanization, objectification, and predictive-processing theories of mind. I outline its implications for work, education, responsibility, healthcare, communication, creativity, and human dignity, while also discussing boundary conditions and forms of resistance. I conclude that the public debate may be missing half of the problem: the issue is not only whether we are attributing too much mind to machines, but also whether we are beginning to attribute too little mind to humans.

arXiv.org

Notes from Inside China AI Labs

Drawing on a visit to China's AI labs, the author analyzes how the culture and organization of Chinese AI researchers differ from those in the US. Student researchers play a central role in Chinese labs, and a culture that prioritizes optimizing the team as a whole over individual ego is a key strength. Chinese researchers also focus intently on building models and engage relatively little in social and philosophical debates. These cultural differences are seen as a major reason Chinese labs have caught up with, and keep pace with, the latest LLM technology. The Chinese AI ecosystem is also characterized by mutual respect and collaboration rather than competition.

https://www.interconnects.ai/p/notes-from-inside-chinas-ai-labs

#china #llm #airesearchculture #languagemodels #aiecosystem

Notes from inside China's AI labs

Lessons from my trip to talk to most of the leading AI labs in China.

Interconnects AI
ProgramBench: Can Language Models Rebuild Programs From Scratch?

Turning ideas into full software projects from scratch has become a popular use case for language models. Agents are being deployed to seed, maintain, and grow codebases over extended periods with minimal human oversight. Such settings require models to make high-level software architecture decisions. However, existing benchmarks measure focused, limited tasks such as fixing a single bug or developing a single, specified feature. We therefore introduce ProgramBench to measure the ability of software engineering agents to develop software holistically. In ProgramBench, given only a program and its documentation, agents must architect and implement a codebase that matches the reference executable's behavior. End-to-end behavioral tests are generated via agent-driven fuzzing, enabling evaluation without prescribing implementation structure. Our 200 tasks range from compact CLI tools to widely used software such as FFmpeg, SQLite, and the PHP interpreter. We evaluate 9 LMs and find that none fully resolve any task, with the best model passing 95% of tests on only 3% of tasks. Models favor monolithic, single-file implementations that diverge sharply from human-written code.

arXiv.org
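
The evaluation idea in the abstract, checking an agent's rebuilt program against the reference executable's observable behavior on generated inputs, can be sketched roughly as follows. This is a minimal illustration with placeholder paths and toy inputs, not the benchmark's actual harness.

```python
# Minimal sketch of a behavioral comparison between a reference executable
# and an agent-rebuilt candidate. Binary paths and fuzz cases are placeholders.
import subprocess

def run(binary: str, args: list[str], stdin: bytes) -> tuple[int, bytes]:
    """Run a binary on the given args and stdin, returning exit code and stdout."""
    proc = subprocess.run([binary, *args], input=stdin,
                          capture_output=True, timeout=30)
    return proc.returncode, proc.stdout

def behavior_match_rate(reference: str, candidate: str,
                        fuzz_cases: list[tuple[list[str], bytes]]) -> float:
    """Fraction of test cases where the candidate's behavior matches the reference."""
    passed = sum(run(reference, args, stdin) == run(candidate, args, stdin)
                 for args, stdin in fuzz_cases)
    return passed / len(fuzz_cases)

# Hypothetical usage with placeholder binaries and trivially "fuzzed" inputs:
# rate = behavior_match_rate("./reference_tool", "./agent_rebuild/tool",
#                            [(["--sum"], b"1 2 3\n"), ([], b"hello\n")])
```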

fly51fly (@fly51fly)

Researchers at Meta FAIR have released ProgramBench, which evaluates whether language models can rebuild programs from scratch. As a benchmark of code generation and reconstruction ability, it is a useful resource for assessing models' practical programming competence.

https://x.com/fly51fly/status/2052137222384853488

#programbench #languagemodels #codegeneration #benchmark #meta

fly51fly (@fly51fly) on X

[AI] ProgramBench: Can Language Models Rebuild Programs From Scratch? J Yang, K Lieret, J Ma, P Thakkar… [Meta FAIR] (2026) https://t.co/VEkc5PeIwh

X (formerly Twitter)

Heretic is a fully automatic language-model "decensoring" tool that anyone can run from the command line. It combines directional ablation (abliteration) with Optuna-based TPE optimization to reduce refusals while minimizing KL divergence from the original model, limiting performance loss. It supports many dense, MoE, and multimodal models and also offers research features such as bitsandbytes quantization and PaCMAP residual visualization.

https://github.com/p-e-w/heretic

#ai #languagemodels #decensoring #safety #interpretability

GitHub - p-e-w/heretic: Fully automatic censorship removal for language models

Fully automatic censorship removal for language models - p-e-w/heretic

GitHub
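
The summary above mentions a TPE-driven search that trades refusal suppression against KL divergence from the base model. A toy sketch of that optimization shape, using Optuna's real TPE sampler but stand-in scoring functions rather than Heretic's actual model surgery, might look like this:

```python
import optuna

# Toy stand-ins for real model surgery and scoring; in the real tool these
# would ablate a refusal direction from the weights and score the edited
# model on prompt sets.
def count_refusals(strength: float, first_layer: int) -> int:
    return max(0, 40 - int(30 * strength))       # fewer refusals as strength grows

def mean_kl_vs_original(strength: float, first_layer: int) -> float:
    return 0.02 * strength ** 2                  # more drift as strength grows

def objective(trial: optuna.Trial) -> float:
    strength = trial.suggest_float("ablation_strength", 0.0, 1.5)
    first_layer = trial.suggest_int("first_layer", 0, 31)
    refusals = count_refusals(strength, first_layer)
    kl = mean_kl_vs_original(strength, first_layer)
    # Weighted trade-off: suppress refusals while staying close to the base model.
    return refusals + 100.0 * kl

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params)
```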

Agents of Chaos
In this 2026 study, six autonomous language-model agents were given email, shell access, and persistent memory in a live multi-party environment and interacted with 20 researchers, allowing security vulnerabilities and safe behaviors to be observed side by side. The study recorded 10 security vulnerabilities and 6 instances of safe behavior, and the agents at times showed unexpectedly cooperative safety behavior. It offers an important in-depth look at the security and safety issues of autonomous AI agents operating in real environments.

https://agentsofchaos.baulab.info/

#autonomousagents #securityvulnerabilities #languagemodels #aisafety #openclaw

Agents of Chaos

A two-week study of autonomous LLM agents deployed in a live multi-party environment with persistent memory, email, shell access, and real human interaction.

Counting as a minimal probe of language model reliability
์ด ๋…ผ๋ฌธ์€ ๋Œ€ํ˜• ์–ธ์–ด ๋ชจ๋ธ์˜ ์‹ ๋ขฐ์„ฑ์„ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด Stable Counting Capacity๋ผ๋Š” ์ƒˆ๋กœ์šด ํ‰๊ฐ€ ๋ฐฉ์‹์„ ์ œ์•ˆํ•œ๋‹ค. ์ด ๋ฐฉ์‹์€ ๋ฐ˜๋ณต๋œ ๊ธฐํ˜ธ๋ฅผ ์„ธ๋Š” ๊ณผ์ œ๋ฅผ ํ†ตํ•ด ๋ชจ๋ธ์˜ ์ ˆ์ฐจ์  ์‹ ๋ขฐ์„ฑ์„ ์ธก์ •ํ•˜๋ฉฐ, ๊ธฐ์กด์˜ ์ง€์‹ ๊ธฐ๋ฐ˜ ๋ฒค์น˜๋งˆํฌ์™€ ๋‹ฌ๋ฆฌ ์˜๋ฏธ๋‚˜ ๋ชจํ˜ธ์„ฑ์„ ๋ฐฐ์ œํ•œ๋‹ค. ์—ฐ๊ตฌ ๊ฒฐ๊ณผ, ํ˜„์žฌ์˜ ์–ธ์–ด ๋ชจ๋ธ๋“ค์€ ๊ด‘๊ณ ๋œ ๋ฌธ๋งฅ ํ•œ๊ณ„ ๋‚ด์—์„œ๋„ ์•ˆ์ •์ ์ธ ์นด์šดํŒ… ๋Šฅ๋ ฅ์ด ๋ถ€์กฑํ•˜๋ฉฐ, ์‹ค์ œ๋กœ๋Š” ์ œํ•œ๋œ ๋‚ด๋ถ€ ์ƒํƒœ๋ฅผ ์‚ฌ์šฉํ•ด ๋‹จ์ˆœํ•œ ๊ทœ์น™์„ ๋ชจ๋ฐฉํ•˜๋Š” ์ˆ˜์ค€์ž„์„ ๋ณด์—ฌ์ค€๋‹ค. ์ด๋Š” ์–ธ์–ด ๋ชจ๋ธ์˜ ์œ ์ฐฝํ•œ ์ˆ˜ํ–‰์ด ๋ฐ˜๋“œ์‹œ ์ผ๋ฐ˜์ ์ด๊ณ  ์‹ ๋ขฐํ•  ์ˆ˜ ์žˆ๋Š” ๊ทœ์น™ ์ค€์ˆ˜๋ฅผ ์˜๋ฏธํ•˜์ง€ ์•Š์Œ์„ ์‹œ์‚ฌํ•œ๋‹ค.

https://arxiv.org/abs/2605.02028

#languagemodels #modelreliability #counting #proceduralevaluation #nlp

Counting as a minimal probe of language model reliability

Large language models perform strongly on benchmarks in mathematical reasoning, coding and document analysis, suggesting a broad ability to follow instructions. However, it remains unclear whether such success reflects general logical competence, repeated application of learned procedures, or pattern matching that mimics rule execution. We investigate this question by introducing Stable Counting Capacity, an assay in which models count repeated symbols until failure. The assay removes knowledge dependencies, semantics and ambiguity from evaluation, avoids lexical and tokenization confounds, and provides a direct measure of procedural reliability beyond standard knowledge-based benchmarks. Here we show, across more than 100 model variants, that stable counting capacity remains far below advertised context limits. Model behavior is consistent neither with open-ended logic nor with stable application of a learned rule, but instead with use of a finite set of count-like internal states, analogous to counting on fingers. Once this resource is exhausted, the appearance of rule following disappears and exact execution collapses into guessing, even with additional test-time compute. These findings show that fluent performance in current language models does not guarantee general, reliable rule following.

arXiv.org
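
A minimal sketch of the kind of probe the abstract describes: count repeated symbols until exact answers stop being stable. Here `ask_model` is a toy stand-in, not the paper's protocol; swap it for a real chat API call to probe an actual model.

```python
# Toy sketch of a stable-counting probe: double N until the model stops
# returning the exact count on every trial. The stand-in model "counts on
# its fingers" up to 48 symbols, then guesses.
import random
import re

def ask_model(prompt: str) -> str:
    run = re.search(r"\n(x+)$", prompt)
    n = len(run.group(1)) if run else 0
    return str(n) if n <= 48 else str(n + random.randint(1, 5))

def stable_counting_capacity(max_n: int = 10_000, trials_per_n: int = 3) -> int:
    """Largest tested N for which every trial returned exactly N."""
    last_stable, n = 0, 1
    while n <= max_n:
        prompt = ("How many 'x' characters are in the following string? "
                  "Answer with a number only.\n" + "x" * n)
        answers = [ask_model(prompt).strip() for _ in range(trials_per_n)]
        if all(a == str(n) for a in answers):
            last_stable, n = n, n * 2
        else:
            break
    return last_stable

print(stable_counting_capacity())  # prints 32 for this toy stand-in
```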

Hallucination Is Inevitable: An Innate Limitation of Large Language Models

https://arxiv.org/abs/2401.11817

#HackerNews #hallucination #languagemodels #AIresearch #technology #limitations

Hallucination is Inevitable: An Innate Limitation of Large Language Models

Hallucination has been widely recognized to be a significant drawback for large language models (LLMs). There have been many works that attempt to reduce the extent of hallucination. These efforts have mostly been empirical so far, which cannot answer the fundamental question whether it can be completely eliminated. In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. Specifically, we define a formal world where hallucination is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all the computable functions and will therefore inevitably hallucinate if used as general problem solvers. Since the formal world is a part of the real world which is much more complicated, hallucinations are also inevitable for real world LLMs. Furthermore, for real world LLMs constrained by provable time complexity, we describe the hallucination-prone tasks and empirically validate our claims. Finally, using the formal world framework, we discuss the possible mechanisms and efficacies of existing hallucination mitigators as well as the practical implications on the safe deployment of LLMs.

arXiv.org
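
For readers who want the shape of the impossibility claim, here is a rough restatement of the formal setup in notation of my own choosing, a sketch of the diagonalization flavor the abstract gestures at rather than the paper's exact statement:

```latex
% Sketch, not the paper's exact notation: ground truth f and LLM h are
% computable string functions, and hallucination is any disagreement.
\[
  f, h : \Sigma^{*} \to \Sigma^{*}, \qquad
  h \text{ hallucinates on } s \iff h(s) \neq f(s).
\]
% Diagonalization flavor: given a computable enumeration of LLMs
% h_0, h_1, ... and of strings s_0, s_1, ..., choose f with
% f(s_i) \neq h_i(s_i); then f is computable and every h_i hallucinates.
\[
  \forall \, \{h_i\}_{i \in \mathbb{N}} \;\; \exists \, f \;\; \forall i \;\; \exists s :\; h_i(s) \neq f(s).
\]
```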
🚨 Breaking news: A single direction determines if language models say "no" or "yes" (spoiler alert: it's not a boy band). 🎤🎶 Meanwhile, researchers have successfully turned advanced math into a riveting sleep aid. 😴📚
https://arxiv.org/abs/2406.11717 #BreakingNews #LanguageModels #MathForSleep #ResearchInsights #HackerNews #ngated
Refusal in Language Models Is Mediated by a Single Direction

Conversational large language models are fine-tuned for both instruction-following and safety, resulting in models that obey benign requests but refuse harmful ones. While this refusal behavior is widespread across chat models, its underlying mechanisms remain poorly understood. In this work, we show that refusal is mediated by a one-dimensional subspace, across 13 popular open-source chat models up to 72B parameters in size. Specifically, for each model, we find a single direction such that erasing this direction from the model's residual stream activations prevents it from refusing harmful instructions, while adding this direction elicits refusal on even harmless instructions. Leveraging this insight, we propose a novel white-box jailbreak method that surgically disables refusal with minimal effect on other capabilities. Finally, we mechanistically analyze how adversarial suffixes suppress propagation of the refusal-mediating direction. Our findings underscore the brittleness of current safety fine-tuning methods. More broadly, our work showcases how an understanding of model internals can be leveraged to develop practical methods for controlling model behavior.

arXiv.org
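
A minimal numpy sketch of the mechanism the abstract describes, using synthetic activations in place of a real model's residual stream: estimate the refusal direction as a difference of means, then either project it out (ablation) or add it back (steering).

```python
# Synthetic illustration of difference-of-means direction finding and
# directional ablation; real activations would come from a model's
# residual stream on harmful vs. harmless prompts.
import numpy as np

rng = np.random.default_rng(0)
d_model = 512

# Stand-in activations: "harmful" prompts are shifted along one coordinate.
harmful_acts = rng.normal(size=(200, d_model)) + 2.0 * np.eye(d_model)[0]
harmless_acts = rng.normal(size=(200, d_model))

# Candidate refusal direction: normalized difference of means.
r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
r_hat = r / np.linalg.norm(r)

def ablate(x: np.ndarray) -> np.ndarray:
    """Erase the component of each activation along the refusal direction."""
    return x - np.outer(x @ r_hat, r_hat)

def steer_toward_refusal(x: np.ndarray, alpha: float = 5.0) -> np.ndarray:
    """Add the refusal direction to push activations toward refusal."""
    return x + alpha * r_hat

print(float(np.abs(ablate(harmful_acts) @ r_hat).max()))  # ~0 after ablation
```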