"The State of Information Retrieval in 2026"

This is the best survey article I have seen in a long time in this niche.

The dominant retriever in 2026 is an 8-billion-parameter decoder-only language model fine-tuned on synthetic data, conditioned on natural-language instructions, often executing chain-of-thought reasoning before deciding what to retrieve.

https://medium.com/@mohankrishnagr08/the-state-of-information-retrieval-in-2026-192f125a5269

#research #informationRetrieval #RAG #LLM #SPLADE #AIbenchmark #AI

Tested Cogito V1 14B Qwen on my Linux server. 45 t/s, 9.7GB VRAM, and the same IDA self-awareness trick its 8B sibling pulled -- Run 2 deliberately stepped back to brute force because a beginner probably needed something simpler first. Run 3 came back stronger with a nice candy analogy. That's DeepCogito's IDA training transforming Qwen into something way better.

Read the full breakdown below.

#LocalAI #Ollama #HomeLabAI #LLM #AIBenchmark

https://goarcherdynamics.com/2026/04/06/aihome-cogito-v1-14b-review/?utm_source=mastodon&utm_medium=jetpack_social

AI@Home – Cogito V1 14B Review

Conditions & Context: After reviewing its little 8B brother a couple of days ago, today we are looking at the Cogito V1 14B model and I’m curious how it would fare in my very simple test.…

Archer Dynamics

Tested Cogito V1 8B on my Linux server. 83 t/s, 5.4GB VRAM, 131k context. The real story is where it deliberately wrote worse code because it decided a beginner needed simplicity over efficiency -- and admitted it! That's IDA self-reflection making a live call.
I guess a 5GB model with a conscience is worth more than a 70B model with none?

Read the full breakdown below.

#LocalAI #Ollama #HomeLabAI #LLM #AIBenchmark

https://goarcherdynamics.com/2026/04/03/aihome-cogito-v1-8b-review/?utm_source=mastodon&utm_medium=jetpack_social

AI@Home – Cogito V1 8B Review

Conditions & Context: Today I’m looking at the Cogito V1 8B model in Q4_K_M quantization. This is Meta’s Llama 3.2 under the hood, but with Cogito’s proprietary self-improving IDA …

Archer Dynamics

OpenAI has benchmarked the Kimi K2.5 model, drawing attention from the AI community. The benchmark data show advanced task-handling capability, especially in reasoning and long-text processing. The information was shared by user d4m1n on X and is being actively discussed. #AI #OpenAI #KimiAI #TríTuệNhânTạo #AIbenchmark

https://www.reddit.com/r/singularity/comments/1qqba7r/openai_benchmarked_kimi_k25/

Meituan Longcat has just released AMO Bench, a benchmark for evaluating AI on mathematics. According to it, Kimi k2 Thinking is the best AI at solving math problems. AMO Bench consists of 50 new problems at IMO-level difficulty, with highly accurate automatic grading.

#AIBenchmark #MathAI #KimiK2Thinking #MeituanLongcat #TríTuệNhânTạo #ToánHọc

https://www.reddit.com/r/LocalLLaMA/comments/1p18lim/meituan_longcat_releases_amo_bench_kimi_k2/

xAI claims its new Grok 4.1 tops high‑difficulty benchmarks, showing stronger multi‑step reasoning than previous models. If you follow the race for the most capable LLMs, this update from Elon Musk’s lab is worth a look. How does it compare to other open‑source giants? Dive in for the details. #Grok41 #xAI #AIbenchmark #MultiStepReasoning

🔗 https://aidailypost.com/news/xai-says-grok-41-is-its-most-capable-model-beating-highdifficulty

OpenAI launches GDPval to measure AI performance on real-world economic tasks

https://web.brid.gy/r/https://nerds.xyz/2025/09/openai-gdpval/