S Banerjee (@SB434223)

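Argues that embedding quality alone is not enough for RAG: as the collection grows, the search space gets denser, "almost relevant" documents multiply, and recall drops. The technical takeaway is that large-scale RAG depends on post-processing such as reranking and on careful retrieval design.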
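
To make the point concrete, here is a minimal retrieve-then-rerank sketch in plain Python. It is not from the original post: the toy documents, the 2-d vectors, and the helper names (`retrieve`, `overlap_score`, `retrieve_and_rerank`) are all hypothetical, and the token-overlap scorer is only a stand-in for a real reranker such as a cross-encoder. The three "password" documents sit close together in vector space, which is the crowded-neighborhood situation described above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, docs, k):
    """Stage 1: cheap vector search, returns the k nearest docs by cosine similarity."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

def overlap_score(query, text):
    """Assumed stand-in for a cross-encoder: fraction of query tokens found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def retrieve_and_rerank(query, query_vec, docs, k_candidates, k_final, scorer=overlap_score):
    """Stage 1 over-fetches candidates; stage 2 re-scores them against the query text."""
    candidates = retrieve(query_vec, docs, k_candidates)
    reranked = sorted(candidates, key=lambda d: scorer(query, d["text"]), reverse=True)
    return reranked[:k_final]

# Hypothetical toy collection: three near-duplicate neighbors and one unrelated doc.
DOCS = [
    {"id": "a", "text": "reset your password from the login page", "vec": [0.90, 0.10]},
    {"id": "b", "text": "password policy for new employees", "vec": [0.95, 0.05]},
    {"id": "c", "text": "password manager product comparison", "vec": [0.93, 0.08]},
    {"id": "d", "text": "office parking rules", "vec": [0.10, 0.90]},
]
QUERY = "how to reset password"
QUERY_VEC = [1.0, 0.0]

vector_only_top = retrieve(QUERY_VEC, DOCS, 1)[0]["id"]
reranked_top = retrieve_and_rerank(QUERY, QUERY_VEC, DOCS, 3, 1)[0]["id"]
print(vector_only_top, reranked_top)
```

In this toy setup the vector-only top hit is an "almost relevant" neighbor, while over-fetching three candidates and reranking surfaces the document that actually answers the query. If `k_candidates` is set to 1, the reranker never sees that document, so the first stage still has to keep recall high enough for the second stage to help.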

https://x.com/SB434223/status/2052648564321595428

#rag #embedding #reranking #retrieval #llm

S Banerjee (@SB434223) on X

@akshay_pachaar this is such an important point people miss with RAG: embedding quality alone isn't enough. Retrieval becomes a density problem at scale. As collections grow, semantic neighborhoods become crowded with "almost relevant" docs, and recall collapses, which is why: - reranking
