📢 Apache Lucene 10.4.0 is out!
Many Lucene queries should see a performance improvement of 10-15%, and some might even see a 35% improvement!
Additionally, there is a new scalar quantized format for dense vectors and kNN search.
#Lucene sits at the heart of so many #search platforms, including the @OpenSearchProject, that stand to benefit from this release.
Congrats to all the contributors to the release 👏
Check out the release blog by the Lucene PMC: https://lucene.apache.org/core/corenews.html#apache-lucenetm-1040-available
Deep Research without Deep Pockets.
I pulled apart the premium “Deep Research” tools to see what they actually do, then built the same workflow locally without needing a huge model or huge spend.
The trick: make the pipeline do the hard work (search + reduce + evidence), so the LLM mostly just writes.
Part 5 of the DocSummarizer series: https://www.mostlylucid.net/blog/doomsummarizer-deep-research
What’s your best technique for reducing “model made it up” without just throwing a bigger model at it?
#rag #llm #deepresearch #ai #lucene
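The "search + reduce + evidence" idea in the post above can be sketched roughly as follows. This is only an illustration under loud assumptions, not the pipeline from the linked article: the word-overlap scoring, the function names (`search`, `reduce_hits`, `build_prompt`), and the sample passages are all hypothetical.

```python
# Hypothetical sketch: the pipeline picks and labels the evidence,
# so the model is only asked to write from what it is given.

def score(query, passage):
    """Count the words shared by the query and the passage (toy relevance)."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def search(query, passages):
    """Return (score, id, text) for every passage sharing a word, best first."""
    hits = [(score(query, text), pid, text) for pid, text in passages.items()]
    return sorted((h for h in hits if h[0] > 0), reverse=True)

def reduce_hits(hits, k=2):
    """Keep only the top-k hits so the prompt stays small."""
    return hits[:k]

def build_prompt(query, evidence):
    """Number each piece of evidence so the answer can cite it by id."""
    lines = [f"[{pid}] {text}" for _, pid, text in evidence]
    return (
        "Answer using only the numbered evidence and cite ids.\n"
        + "\n".join(lines)
        + f"\nQuestion: {query}"
    )

# Made-up sample passages, for illustration only.
passages = {
    "a": "Lucene is a search library",
    "b": "Prague is a city",
    "c": "Lucene stores an inverted index",
}
query = "what is lucene"
prompt = build_prompt(query, reduce_hits(search(query, passages)))
print(prompt)
```

The point of the design is that the model never sees passages the pipeline did not select, and every claim can be traced back to a cited id.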
The upcoming #OpenSearchCon Europe will have a designated "Search & Apache Lucene" track.
If you're involved in the Apache #Lucene project, or the broader search and relevancy ecosystem, I encourage you to consider submitting a talk proposal to share your experience.
The conference will take place 16-17 April in Prague.
The CFP is open until 18th January: https://events.linuxfoundation.org/opensearchcon-europe/program/cfp/
That is essentially #elasticsearch, and that in turn is a wrapper, aka a UI, around #lucene
@parttimenerd That's an interesting approach, thanks a lot for sharing!
I also toyed with a similar idea a while back: https://binjr.eu/blog/2023/08/new-data-adapter-jdk-flight-recorder/
With that said, there are some differences between the approach I took and the one you discussed in your post.
For one, I opted to use an inverted index (#Lucene) instead of a relational DB as my backend, which comes with its own trade-offs, like offering a query language that is somewhat easier to use, but not nearly as powerful.
The other main difference is that the route I used to get there is kind of the opposite of the one you took: while you went from the backend working your way up to the UI, I very much started there (as I already had it) and worked my way down.
Doing things this way around meant that I could benefit immediately from the UI features that were there already (which was the whole point, of course), but it makes integrating new ones that don't fit so naturally with the rest of the tool much more time consuming...
At any rate, I would love to hear your thoughts if you find the time to give it a try!
(you can get it here: https://github.com/binjr/binjr/releases)
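For readers unfamiliar with the inverted-index trade-off mentioned in the post above, here is a toy sketch of the idea. It is not how Lucene or binjr is implemented; the whitespace tokenizer, the function names, and the sample event strings are assumptions made up for illustration.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (real analyzers do far more)."""
    return text.lower().split()

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every term in the query (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Made-up records loosely shaped like flight recorder events.
docs = {
    1: "jdk.GCPhasePause duration 12ms",
    2: "jdk.ThreadSleep duration 5ms",
    3: "jdk.GCPhasePause name young",
}
index = build_index(docs)
print(search(index, "jdk.GCPhasePause duration"))
```

Term lookups like this are simple to express and fast, but there is no equivalent of joins or aggregations, which is the "easier to use, but not nearly as powerful" trade-off described in the post.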
🚀 Introducing Lucene-on-Faiss
⚡ 2x boost in search throughput
💡 Eases memory limitations
#OpenSearch #VectorSearch #Lucene #Faiss #AI #GenerativeAI #ANN #SearchTech
SQL vs NoSQL: Choosing the right database for your project
One of the most fundamental and critical decisions when building a modern application is the choice of data storage technology.
#DST #DSTGlobal #ДСТ #ДСТГлобал #DSTplatform #ДСТПлатформ #базаданных #SQL #NoSQL #РСУБД #СУБД #PostgreSQL #Redis #MongoDB #JSON #BJSON #WordPress #Drupal #DLE #BigData #Oracle #Database #Microsoft #SQLServer #ACID #Cassandra #Elasticsearch #Apache #Lucene
Source: https://dstglobal.ru/club/1101-sql-vs-nosql-vybor-podhodjaschei-bazy-dannyh-dlja-vashego-proekta