fly51fly (@fly51fly)
Paper announcement: 'Consistency of Large Reasoning Models Under Multi-Turn Attacks' (Y Li, R Krishnan, R Padman, CMU, 2026). The paper analyzes and reports on the consistency of large reasoning models under multi-turn attacks, offering insights into model robustness against attacks and overall stability (original link included).
xAI’s co‑founder exits keep coming, while Lambda outlines a 2025 shift toward bigger context windows, multimodal reasoning models and open‑source inference for AI production. What could this mean for the future of machine learning? Read on for the full story. #AIProduction #ReasoningModels #MultimodalAI #OpenSourceInference
🔗 https://aidailypost.com/news/xai-co-founder-departures-persist-lambda-outlines-2025-ai-production
AI that thinks instead of guessing?
Reasoning models use techniques like chain of thought and tree of thought to decompose problems, explore alternatives, and choose better answers, often at the cost of more compute and latency (a minimal prompt sketch follows this post).
A practical explainer:
🔗 https://techglimmer.io/what-is-ai-thinking-reasoning-models/
#AI #ReasoningModels #ChainOfThought #TreeOfThought #GenAI #FediTech #MachineLearning
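A minimal sketch of the two prompting styles named above. `call_model` is a hypothetical placeholder for any LLM completion API, and the "tree of thought" here is simplified to a sample-and-select pass rather than a full tree search:

```python
# Minimal sketch of chain-of-thought vs. tree-of-thought prompting.
# `call_model` is a hypothetical stand-in for any LLM completion API.

def call_model(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text output."""
    raise NotImplementedError("wire this to your LLM client of choice")

def chain_of_thought(question: str) -> str:
    # Ask the model to decompose the problem into explicit steps
    # before committing to a final answer.
    prompt = (
        f"{question}\n\n"
        "Think through this step by step, then state the final answer "
        "on a line starting with 'Answer:'."
    )
    return call_model(prompt)

def tree_of_thought(question: str, n_candidates: int = 3) -> str:
    # Explore several independent reasoning paths, then ask the model
    # to judge which candidate is best (more calls = more compute/latency).
    candidates = [chain_of_thought(question) for _ in range(n_candidates)]
    review = "\n\n".join(f"Candidate {i + 1}:\n{c}" for i, c in enumerate(candidates))
    prompt = (
        f"Question: {question}\n\n{review}\n\n"
        "Compare the candidates and return the most reliable final answer."
    )
    return call_model(prompt)
```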
Manning Publications (@ManningBooks)
The post argues that the rise of reasoning models is a shift that matters long term. Companies such as Meta are pushing them, VentureBeat highlighted MobileLLM-R1, and @rasbt's Build shows how reasoning models are actually built and evaluated.

AI moves fast, but some shifts matter long after the headlines pass. Reasoning models are one of 'em. As the field grows, even companies like @Meta are pushing them, as @VentureBeat highlights with MobileLLM-R1. Want to learn how they're actually built & evaluated? @rasbt's Build
OpenAI: GPT-5 Thinking Models Are The Most "Monitorable" Models To Date
#AI #OpenAI #AISafety #LLM #MachineLearning #GPT5 #DeepMind #AIResearch #ChainOfThought #Monitorability #AIAlignment #ReasoningModels
FINE-TUNING Qwen3 WITH "THINKING MODE": STUCK ON REASONING. The documentation on building a "thinking" (explanation) dataset is unclear, so training keeps running into trouble. If anyone has experience or references on this, please share. #AI #MachineLearning #Reasoning #Qwen #ReasoningModels #KnowledgeInjection
*(Summary: the user is struggling to fine-tune Qwen3 to inject physics knowledge via "thinking mode". Generating explanation data with Qwen3 itself led to degraded performance. Looking for shared
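For reference, a minimal sketch of what a thinking-style SFT record might look like, assuming the common pattern of wrapping the reasoning trace in <think> tags inside the assistant turn. Field names follow a generic chat-SFT layout, not any official Qwen3 schema:

```python
import json

# Illustrative thinking-mode fine-tuning record: the assistant message
# carries an explicit reasoning trace inside <think>...</think> before
# the final answer. This is a generic chat-SFT layout, not an official
# Qwen3 data schema.
record = {
    "messages": [
        {
            "role": "user",
            "content": "Why does a satellite in low Earth orbit not fall straight down?",
        },
        {
            "role": "assistant",
            "content": (
                "<think>Gravity pulls the satellite toward Earth, but its tangential "
                "velocity is high enough that it keeps missing the surface; the net "
                "motion is a closed orbit.</think>\n"
                "It is continuously falling toward Earth, but its sideways speed "
                "makes it fall around the planet instead of into it."
            ),
        },
    ]
}

# One JSONL line per example is a common format for SFT pipelines.
print(json.dumps(record, ensure_ascii=False))
```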
New AI reasoning models built as neural networks are showing striking convergence across diverse training sets. Researchers say this hints at emergent structure in how machines learn to reason, opening fresh avenues for open‑source computational tools. Dive into the findings and see why this could reshape our approach to artificial intelligence. #AI #NeuralNetworks #ReasoningModels #Convergence
🔗 https://aidailypost.com/news/new-ai-reasoning-models-built-neural-networks-show-striking
Reasoning Models Reason Well, Until They Don't
https://arxiv.org/abs/2510.22371
#HackerNews #ReasoningModels #ReasonWell #AIResearch #MachineLearning
Large language models (LLMs) have shown significant progress in reasoning tasks. However, recent studies show that transformers and LLMs fail catastrophically once reasoning problems exceed modest complexity. We revisit these findings through the lens of large reasoning models (LRMs) -- LLMs fine-tuned with incentives for step-by-step argumentation and self-verification. LRM performance on graph and reasoning benchmarks such as NLGraph seems extraordinary, with some even claiming they are capable of generalized reasoning and innovation in reasoning-intensive fields such as mathematics, physics, medicine, and law. However, by more carefully scaling the complexity of reasoning problems, we show that existing benchmarks actually have limited complexity. We develop a new dataset, the Deep Reasoning Dataset (DeepRD), along with a generative process for producing unlimited examples of scalable complexity. We use this dataset to evaluate model performance on graph connectivity and natural language proof planning. We find that the performance of LRMs drops abruptly at sufficient complexity and does not generalize. We also relate our LRM results to the distributions of the complexities of large, real-world knowledge graphs, interaction graphs, and proof datasets. We find the majority of real-world examples fall inside the LRMs' success regime, yet the long tails expose substantial failure potential. Our analysis highlights the near-term utility of LRMs while underscoring the need for new methods that generalize beyond the complexity of examples in the training distribution.
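A minimal sketch of what a generative process for graph-connectivity problems of scalable complexity might look like. This is a toy illustration, not the authors' DeepRD pipeline; the complexity knob here is simply the length of the path that must be traced:

```python
import random

# Toy generator for graph-connectivity questions whose difficulty scales
# with the length of the guaranteed path. Illustrative stand-in for the
# "unlimited examples of scalable complexity" the abstract describes.
def make_connectivity_problem(path_length: int, n_distractors: int = 10, seed: int = 0):
    rng = random.Random(seed)
    nodes = [f"n{i}" for i in range(path_length + 1 + n_distractors)]
    rng.shuffle(nodes)

    # Build one guaranteed path of the requested length...
    path = nodes[: path_length + 1]
    edges = [(path[i], path[i + 1]) for i in range(path_length)]

    # ...plus distractor edges among the remaining nodes only, so the
    # answer stays "yes" via exactly that path.
    others = nodes[path_length + 1:]
    for _ in range(n_distractors):
        if len(others) >= 2:
            edges.append(tuple(rng.sample(others, 2)))

    rng.shuffle(edges)
    edge_text = "; ".join(f"{a} is connected to {b}" for a, b in edges)
    question = (
        f"{edge_text}. Is there a path from {path[0]} to {path[-1]}? "
        "Answer yes or no."
    )
    return question, "yes", path_length  # prompt, label, complexity

# Example: increase path_length to probe where accuracy collapses.
q, label, k = make_connectivity_problem(path_length=8)
```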
AI Models Table
This webpage collects reference information on a wide range of AI models.