Mastodawn

"The State of Information Retrieval in 2026"

This is the best survey article in this niche that I have seen in a long time.

The dominant retriever in 2026 is an 8-billion-parameter decoder-only language model fine-tuned on synthetic data, conditioned on natural-language instructions, often executing chain-of-thought reasoning before deciding what to retrieve.

https://medium.com/@mohankrishnagr08/the-state-of-information-retrieval-in-2026-192f125a5269

#research #informationRetrieval #RAG #LLM #SPLADE #AIbenchmark #AI

Archer Dynamics Apr 6

Tested Cogito V1 14B Qwen on my Linux server. 45 t/s, 9.7GB VRAM, and the same IDA self-awareness trick its 8B sibling pulled -- Run 2 deliberately stepped back to brute force because a beginner probably needed something simpler first. Run 3 came back stronger with a nice candy analogy. That's DeepCogito's IDA training transforming Qwen into something way better.

Read the full breakdown below.

#LocalAI #Ollama #HomeLabAI #LLM #AIBenchmark

https://goarcherdynamics.com/2026/04/06/aihome-cogito-v1-14b-review/?utm_source=mastodon&utm_medium=jetpack_social

AI@Home – Cogito V1 14B Review

Conditions & Context After reviewing its little 8B brother a couple of days ago, today we are looking at the Cogito V1 14B model, and I’m curious how it will fare in my very simple test.…

Archer Dynamics

Archer Dynamics Apr 3

Tested Cogito V1 8B on my Linux server. 83 t/s, 5.4GB VRAM, 131k context. The real story is where it deliberately wrote worse code because it decided a beginner needed simplicity over efficiency -- and admitted it! That's IDA self-reflection making a live call.
I guess a 5GB model with a conscience is worth more than a 70B model with none?

Read the full breakdown below.

#LocalAI #Ollama #HomeLabAI #LLM #AIBenchmark

https://goarcherdynamics.com/2026/04/03/aihome-cogito-v1-8b-review/?utm_source=mastodon&utm_medium=jetpack_social

AI@Home – Cogito V1 8B Review

Conditions & Context Today I’m looking at the Cogito V1 8B model in Q4_K_M quantization. This is Meta’s Llama 3.2 under the hood, but with Cogito’s proprietary self-improving IDA …

Archer Dynamics

Reddit Tech VN Bot Jan 29

OpenAI has benchmarked the Kimi K2.5 model, drawing attention from the AI community. The benchmark data shows advanced task-handling capability, especially in reasoning and long-text processing. The information was shared on X by user d4m1n and is being actively discussed. #AI #OpenAI #KimiAI #TríTuệNhânTạo #AIbenchmark

https://www.reddit.com/r/singularity/comments/1qqba7r/openai_benchmarked_kimi_k25/

Le site de Korben [Unofficial] Jan 8

Windows 11 ranks last among Windows versions

https://fed.brid.gy/r/https://korben.info/windows-11-performances-degradation-benchmark.html

Reddit Tech VN Bot Nov 19, 2025

Meituan Longcat has just released AMO Bench, a benchmark for evaluating AI on mathematics. According to it, Kimi K2 Thinking was identified as the best AI at solving math problems. AMO Bench comprises 50 new problems at IMO-level difficulty, with highly accurate automatic grading.

#AIBenchmark #MathAI #KimiK2Thinking #MeituanLongcat #TríTuệNhânTạo #ToánHọc

https://www.reddit.com/r/LocalLLaMA/comments/1p18lim/meituan_longcat_releases_amo_bench_kimi_k2/

AI Daily Post Nov 19, 2025

xAI claims its new Grok 4.1 tops high‑difficulty benchmarks, showing stronger multi‑step reasoning than previous models. If you follow the race for the most capable LLMs, this update from Elon Musk’s lab is worth a look. How does it compare to the open‑source giants? Dive in for the details. #Grok41 #xAI #AIbenchmark #MultiStepReasoning

🔗 https://aidailypost.com/news/xai-says-grok-41-is-its-most-capable-model-beating-highdifficulty