Github Awesome (@GithubAwesome)

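Introduces Semble, a code search tool for AI agents. It indexes an entire repo in 263 ms and answers queries in 1.5 ms, running on CPU alone with no GPU or API keys. It claims retrieval quality at 99% of a 137M-parameter transformer model, which looks highly practical for agents' codebase exploration and RAG workflows.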

https://x.com/GithubAwesome/status/2056194964464304591

#codesearch #agents #retrieval #cpu #rag

Github Awesome (@GithubAwesome) on X

Semble is code search built for AI agents, and the speed numbers are hard to ignore. It indexes a full repo in 263 milliseconds and answers queries in 1.5 milliseconds, all on CPU, no GPU, no API keys. Retrieval quality sits at 99% of a 137M-parameter transformer model. Run it as

X (formerly Twitter)

A weighty thematic issue of Information Research has just dropped: 'Artificial Intelligence (AI) in Information Science'. The issue includes 44 papers exploring information seeking in the age of #AI, #information evaluation and use, information #retrieval, trust and security, future research needs, and a lot more. It'll take a while to read them all, but read I must!

https://publicera.kb.se/ir/issue/view/5559 #InformationResearch #InformationScience #InformationRetrieval #LLMs #ArtificialIntelligence

Vol. 31 No. 2 (2026): Information Research: Artificial Intelligence (AI) in Information Science | Information Research an international electronic journal

🙈Ah yes, another riveting tale of #academic buzzword-bingo, where "rethinking retrieval" is code for "we need something to publish" and "direct corpus interaction" sounds like a euphemism for an awkward office party. 🤖💡 But don't worry, because soon we'll be 'agentic searching' for the meaning of life, if only we could comprehend what any of this actually means. 🙃
https://arxiv.org/abs/2605.05242 #jargon #buzzword #bingo #retrieval #humor #tech #satire #HackerNews #ngated
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This abstraction is efficient, but for agentic search, it becomes a bottleneck: exact lexical constraints, sparse clue conjunctions, local context checks, and multi-step hypothesis refinement are difficult to implement by calling a conventional off-the-shelf retriever, and evidence filtered out early cannot be recovered by stronger downstream reasoning. Agentic tasks further exacerbate this limitation because they require agents to orchestrate multiple steps, including discovering intermediate entities, combining weak clues, and revising the plan after observing partial evidence. To tackle the limitation, we study direct corpus interaction (DCI), where an agent searches the raw corpus directly with general-purpose terminal tools (e.g., grep, file reads, shell commands, lightweight scripts), without any embedding model, vector index, or retrieval API. This approach requires no offline indexing and adapts naturally to evolving local corpora. Across IR benchmarks and end-to-end agentic search tasks, this simple setup substantially outperforms strong sparse, dense, and reranking baselines on several BRIGHT and BEIR datasets, and attains strong accuracy on BrowseComp-Plus and multi-hop QA without relying on any conventional semantic retriever. Our results indicate that as language agents become stronger, retrieval quality depends not only on reasoning ability but also on the resolution of the interface through which the model interacts with the corpus, with which DCI opens a broader interface-design space for agentic search.
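
As a rough illustration of the kind of exact lexical constraint the abstract describes, here is a minimal Python sketch of a grep-like conjunction search over raw text files, with no embedding model or index. It is not the paper's implementation: the function name `grep_all`, the `.txt` file layout, and the two sample documents are all assumptions made for this example.

```python
import tempfile
from pathlib import Path


def grep_all(root, terms, context=1):
    """Yield (file name, line number, snippet) for lines containing every term.

    Matching is case-insensitive substring matching, similar in spirit to
    chaining several grep calls. `context` is the number of neighboring
    lines returned on each side of the match.
    """
    needles = [t.lower() for t in terms]
    for path in sorted(Path(root).rglob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for i, line in enumerate(lines):
            lowered = line.lower()
            if all(n in lowered for n in needles):
                lo, hi = max(0, i - context), min(len(lines), i + context + 1)
                yield path.name, i + 1, lines[lo:hi]


# Hypothetical two-file corpus, created in a temporary directory.
with tempfile.TemporaryDirectory() as corpus:
    Path(corpus, "a.txt").write_text(
        "Marie Curie was born in Warsaw.\nShe won two Nobel Prizes.\n",
        encoding="utf-8",
    )
    Path(corpus, "b.txt").write_text(
        "Warsaw is the capital of Poland.\n", encoding="utf-8"
    )
    hits = list(grep_all(corpus, ["curie", "warsaw"]))

for name, line_no, snippet in hits:
    print(name, line_no, snippet)
```

Only the line that satisfies both clues is returned, together with its local context, which is the sort of sparse clue conjunction that a single top-k similarity call does not express directly.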

arXiv.org

S Banerjee (@SB434223)

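Stresses that embedding quality alone is not enough for RAG: as the data grows, the search space gets denser, "almost relevant" documents multiply, and recall drops. The technical takeaway is that post-processing such as reranking, along with retrieval design, matters for large-scale RAG.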

https://x.com/SB434223/status/2052648564321595428

#rag #embedding #reranking #retrieval #llm

S Banerjee (@SB434223) on X

@akshay_pachaar this is such an important point people miss with RAG: embedding quality alone isn’t enough. Retrieval becomes a density problem at scale. As collections grow, semantic neighborhoods become crowded with “almost relevant” docs, and recall collapses, which is why: - reranking
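
The reranking idea mentioned in the post can be sketched as a two-stage pipeline: a cheap first stage builds a shortlist, then a costlier scorer reorders it. This is only a toy under stated assumptions: the token-overlap first stage and the bigram-based `rerank_score` are stand-ins (a real system would more likely use an embedding retriever and a cross-encoder), and the documents are invented for the example.

```python
from collections import Counter


def tokens(text):
    return text.lower().split()


def first_stage_score(query, doc):
    """Cheap bag-of-words overlap: counts shared tokens and ignores word order."""
    q, d = Counter(tokens(query)), Counter(tokens(doc))
    return sum(min(q[t], d[t]) for t in q)


def rerank_score(query, doc):
    """Stand-in for a costlier reranker: counts shared adjacent word pairs."""
    q, d = tokens(query), tokens(doc)
    return len(set(zip(q, q[1:])) & set(zip(d, d[1:])))


def retrieve_then_rerank(query, docs, candidates=3, final=1):
    # Stage 1: shortlist by the cheap score (ties keep their original order).
    shortlist = sorted(
        docs, key=lambda d: first_stage_score(query, d), reverse=True
    )[:candidates]
    # Stage 2: reorder only the shortlist with the costlier score.
    return sorted(
        shortlist, key=lambda d: rerank_score(query, d), reverse=True
    )[:final]


docs = [
    "reset the router before changing the wifi password",  # almost relevant
    "password reset instructions for locked accounts",     # relevant
    "office holiday schedule",
]
query = "password reset"

first_stage_top = max(docs, key=lambda d: first_stage_score(query, d))
reranked_top = retrieve_then_rerank(query, docs)[0]
print("first stage:", first_stage_top)
print("after rerank:", reranked_top)
```

Both of the first two documents share the same two query words, so the first stage cannot separate them; the second stage looks at word order and promotes the document that is actually about password resets.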

Akshay (@akshay_pachaar)

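Presents a case where a RAG system's retrieval accuracy was 90% on 5,000 documents but plunged to 50% after scaling to 500,000 documents, noting that growth in corpus size can degrade performance even with the same embedding model and retriever. It is an LLM interview question that asks about a core problem in large-scale RAG design.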

https://x.com/akshay_pachaar/status/2052371239520629243

#rag #llm #embeddings #retrieval #nlp

Akshay 🚀 (@akshay_pachaar) on X

A tricky LLM interview question: Your RAG system scores 90% retrieval accuracy on 5k company docs. But scaling to 500k docs drops the accuracy to just 50%, with the same embedding model and retriever. Why did this happen? The simplest answer is that more documents mean more
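
One way to build intuition for the question is a toy simulation in which the scorer stays fixed and only the number of distractor documents grows. The model here is an assumption, not something from the post: each query has one relevant document whose score is drawn from a Gaussian centered at `signal`, distractor scores are drawn from a standard Gaussian, and corpus sizes are scaled down (200 vs 20,000, the same 100x ratio) so the script runs quickly.

```python
import random


def recall_by_corpus_size(num_queries, corpus_sizes, k=5, signal=2.5, seed=0):
    """Estimate recall@k at several corpus sizes under a toy score model.

    The smaller corpus is a prefix of the larger one, so the same scorer
    sees the same documents plus extra distractors.
    """
    rng = random.Random(seed)
    checkpoints = set(corpus_sizes)
    largest = max(corpus_sizes)
    hits = {n: 0 for n in corpus_sizes}
    for _ in range(num_queries):
        relevant = rng.gauss(signal, 1.0)
        better = 0  # distractors seen so far that outscore the relevant doc
        for i in range(1, largest + 1):
            if rng.gauss(0.0, 1.0) > relevant:
                better += 1
            # The relevant doc is in the top k if fewer than k docs beat it.
            if i in checkpoints and better < k:
                hits[i] += 1
    return {n: hits[n] / num_queries for n in corpus_sizes}


recall = recall_by_corpus_size(num_queries=50, corpus_sizes=(200, 20_000))
for size, value in sorted(recall.items()):
    print(f"{size:>6} docs: recall@5 = {value:.2f}")
```

Because the count of higher-scoring distractors can only grow as documents are added, recall at the larger size can never exceed recall at the smaller size in this model, even though nothing about the scorer changed.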

Saeed Anwar (@saen_dev)

Points to quadratic attention cost as the reason most production systems still chunk and retrieve rather than putting a long context in all at once. Ring attention and sparse variants help, but below 100k tokens the added engineering complexity is rarely worth it.

https://x.com/saen_dev/status/2050120696395678001

#attention #llm #retrieval #sparse #engineering

Saeed Anwar (@saen_dev) on X

@_avichawla The quadratic attention cost is why most production systems still chunk and retrieve instead of stuffing everything into context. Ring attention and sparse variants help but the engineering complexity they add is rarely worth it below 100k tokens.
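
The quadratic scaling in the post can be made concrete with a little arithmetic. The sketch below counts entries in the full attention score matrix per head and per layer; the retrieval budget of 8 chunks of 512 tokens is an assumed figure chosen for illustration, not one taken from the post.

```python
def attention_score_entries(num_tokens: int) -> int:
    """Entries in the full num_tokens x num_tokens attention score matrix
    (per head, per layer), which is where the quadratic cost comes from."""
    return num_tokens * num_tokens


# Stuffing a 100k-token context into the model.
full_context = attention_score_entries(100_000)

# Assumed alternative: retrieve 8 chunks of 512 tokens each (4,096 tokens).
retrieved_context = attention_score_entries(8 * 512)

ratio = full_context / retrieved_context
print(f"full context:      {full_context:,} entries")
print(f"retrieved context: {retrieved_context:,} entries")
print(f"ratio:             about {ratio:.0f}x")
```

Under these assumptions the full context needs roughly 600 times as many score entries as the retrieved one, which is the gap that chunk-and-retrieve pipelines avoid paying.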

Akshay (@akshay_pachaar)

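Argues that vector DBs suit single-query similarity search over individual chunks but fall short on questions that require combining information from multiple chunks. Citing FalkorDB's GraphRAG-Bench results, it says the GraphRAG approach exposes the gap on these multi-hop reasoning problems.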

https://x.com/akshay_pachaar/status/2049445928788963433

#vectordb #graphrag #graphragbench #retrieval #llm

Akshay 🚀 (@akshay_pachaar) on X

Vector DBs can't reason. Top-k similarity ranks chunks one at a time against a query. That's fine for single-hop fact lookups, and it breaks the moment a question needs information stitched across multiple chunks. That's what the FalkorDB GraphRAG-Bench results expose. The gap
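
To illustrate what "stitching across multiple chunks" means, here is a minimal sketch of a two-hop lookup over a tiny in-memory graph. It does not use FalkorDB or any GraphRAG library: the triples, the relation names, and the helper functions `build_graph` and `follow` are hypothetical and exist only for this example.

```python
# Each triple could come from a different chunk of the source documents.
triples = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "headquartered_in", "Berlin"),
    ("Bob", "works_at", "Globex"),
]


def build_graph(triples):
    """Index triples as (subject, relation) -> list of objects."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault((subj, rel), []).append(obj)
    return graph


def follow(graph, start, relations):
    """Walk a chain of relations from `start`, one hop per relation."""
    frontier = [start]
    for rel in relations:
        frontier = [
            obj for node in frontier for obj in graph.get((node, rel), [])
        ]
    return frontier


graph = build_graph(triples)

# "Where is Alice's employer headquartered?" needs two facts chained together.
answer = follow(graph, "Alice", ["works_at", "headquartered_in"])
print(answer)
```

The fact that answers the question ("Acme is headquartered in Berlin") never mentions Alice, so ranking chunks one at a time against the query has little reason to surface it, while the explicit hop through "Acme" reaches it directly.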

Rohan Paul (@rohanpaul_ai)

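Mentions a useful paper on improving AI memory, explaining that modern AI needs three memory systems: weight memory for long-lasting knowledge, retrieval memory for up-to-date information, and agent memory for goals, preferences, and experience.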

https://x.com/rohanpaul_ai/status/2049099963012194477

#aimemory #research #llm #retrieval #agents

Rohan Paul (@rohanpaul_ai) on X

Great survey paper on better AI memory. Modern AI needs three different memory systems: weights for slow, durable knowledge, retrieval for fresh and specific facts, and agent memory for ongoing goals, preferences, and experience. A model with only parametric memory is
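
The three-way split in the post can be pictured as a small data-structure sketch. This is not taken from the survey: the class `AgentMemory`, the keyword-based `retrieve` function, and the sample facts are assumptions for illustration. Parametric (weight) memory is the model itself, so it has no explicit object here.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Ongoing goals, preferences, and experience kept across sessions."""
    goals: list[str] = field(default_factory=list)
    preferences: dict[str, str] = field(default_factory=dict)
    experience: list[str] = field(default_factory=list)


def retrieve(store: dict[str, str], question: str) -> list[str]:
    """Toy retrieval memory: return facts whose key appears in the question."""
    return [fact for key, fact in store.items() if key in question.lower()]


def build_context(question: str, store: dict[str, str], memory: AgentMemory) -> str:
    """Combine retrieval memory and agent memory into prompt context.

    The model's weights supply the slow, durable knowledge, so only the
    two non-parametric memories are assembled here.
    """
    lines = [f"Fact: {fact}" for fact in retrieve(store, question)]
    lines += [f"Goal: {goal}" for goal in memory.goals]
    lines += [f"Preference: {k}={v}" for k, v in memory.preferences.items()]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


store = {"release": "The release notes were updated this week."}
memory = AgentMemory(goals=["ship the migration"], preferences={"tone": "brief"})
context = build_context("What changed in the latest release?", store, memory)
print(context)
```

Keeping the stores separate makes their different lifetimes explicit: retrieved facts can be refreshed at any time, while goals and preferences persist with the agent.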

This is a handy list for comparing the features of vector databases (holy mole there are a lot of them), including year of launch, opensource-ness, licences, and implementation language: https://superlinked.com/vector-db-comparison

#vectors #embeddings #search #retrieval #rag #genai #agents

Vector Database Comparison | Superlinked

Compare 47+ vector databases across features, performance, and adoption. Filter by license, languages, index types. Data sourced from VectorHub.

fly51fly (@fly51fly)

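A study proposing a unified model and document representation for on-device retrieval-augmented generation (RAG). It aims to implement RAG by retrieving and using documents more efficiently on the device, making it an important technology for edge AI and privacy.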

https://x.com/fly51fly/status/2045256160404738149

#rag #ondevice #retrieval #edgeai #llm

fly51fly (@fly51fly) on X

[IR] A Unified Model and Document Representation for On-Device Retrieval-Augmented Generation J Killingback, O Meshi, H Li, H Zamani… [University of Massachusetts Amherst & Google] (2026) https://t.co/WVmtLvwU5u
