Mastodawn

Release of Datafari 7.0, an "intelligent" open source enterprise search engine https://linuxfr.org/news/sortie-de-la-version-7-0-de-datafari-moteur-de-recherche-open-source-intelligent-pour-entreprise #intelligence_artificielle #moteur_de_recherche #connaissance #datafari #apache2 #Java #java #solr

Release of Datafari 7.0, an "intelligent" open source enterprise search engine - LinuxFr.org

News about free software and related topics (DIY, Open Hardware, Open Data, the Commons, etc.), on a collaborative French-language site run by a volunteer team, by and for enthusiastic free-software advocates


gary May 24

@drmorrisj the monetization aspect is based on their licensing, which is excellent. Plus, the value prop for concrete search engine results which are internal and custom is great IP that can add to business workflows and continuity. That is my concrete, non-abstract take on yacy and why people may want to use it. It also makes for a good rag pipeline and structured data, json, and they have a nice api #open source value leader #solr dump #nutch

TYPO3 Videos Mar 11

Building Software That Welcomes Everyone - Marc Haunschild
Pimp My Solr Search - Chatting With the New Solr Vector Database - Olivier Dobberkau
How to expose a...
#TYPO3 #TYPO3campVenlo #WebCampVenlo #wcv #softwaredevelopment #cms #php #opensource #solr #vectordatabase #developer #figma
Friday - Building Software - Solr Vector Db - developer experience - Figma - Web Camp Venlo 2026


YouTube

Dotan Horovits #CNCFAmbassador Mar 5

Apache Solr's next major release, 10.0.0, is out.
Congrats to the maintainers and all involved 👏
https://solr.apache.org/news.html#apache-solrtm-1000-available

#opensource #search #solr @TheASF

Solr News

You may also read these news as an ATOM feed.


klokanek Feb 23

@Earl @gary_alderson

That's the incompatibility of the internal #solr.
Usually, Solr is upgraded when the major version number increases. And Solr can only upgrade one version up, not two. So you can upgrade #YaCy, for example, from 1.93 to 1.94, but not to 1.96.
Exporting and reimporting the whole index is a way around this limitation, but it can take time and disk space, depending on the size of the index.
See: https://eldar.cz/yacydoc/dev/solr.html#upgrading
and
https://eldar.cz/yacydoc/operation/index-export-import.html

Solr and YaCy Integration - YaCy Docs

Drupal Odyssey Feb 11

📚 New Blog Post: Building an 'X-Ray Machine' for Drupal 11
I just published Part 3 of my "Automated Librarian" series. Today is all about moving from "thin metadata" to full deep-text indexing using Solr and Tika.
If you've ever struggled with making thousands of PDFs truly searchable in #Drupal, this one is for you.
Read the full breakdown: https://drupalodyssey.com/go/rOxKwcy
#Drupal11 #OpenSource #Solr #SearchAPI #WebDevelopment #PHP

The Automated Librarian: Part 3 - Indexing PDF Content with Solr & Tika in Drupal 11

Stop searching for filenames and start searching inside your data. Learn how to use Apache Solr and Tika to index PDF content in Drupal 11, configure weighted search boosts, and unlock the "Black Box" of your Media Library.

Drupal Odyssey

Drupal Odyssey Jan 28

I've been a digital hoarder of eBooks for two decades. It was time to stop collecting and start discovering.
Part 1 of my new series on building an intelligent library with #Drupal and #Solr is live.

📖 Read the breakdown: https://drupalodyssey.com/go/WRJQM

#OpenSource #Drupal11 #DevLog #PHP #HomeLab

The Automated Librarian: Part 1 - Architecting an Intelligent eBook Library in Drupal 11

Tired of manual data entry? See how I built an "Automated Librarian" in Drupal 11. This series explores using Migrations, Open Library, and Ollama to turn raw files into an AI-summarized, full-text searchable discovery engine.

Drupal Odyssey


klokanek Jan 26

@benjamin_e @orbiterlab

I'm trying to build a news search engine as well, using #YaCy and the https://eldar.cz/news/ aggregator. Search relevancy is not great. The pseudo-PageRank ("citation rank") doesn't work that well and is so computationally heavy that I switched it off:
https://community.searchlab.eu/t/how-to-activate-and-rank-by-cr-citation-rank/1733/5

Vector search would certainly be a big help. #solr already has that, but it is not implemented in YaCy so far.

For distinguishing news sites, I just use the "collections" feature. See https://community.searchlab.eu/t/what-became-of-yacys-gsa-interface-collection-feature/621/7

well... news @ news overview - fresh news every 15 minutes

An overview of the latest news from the more trustworthy Czech, Slovak, and world media on a single page. Updated every 15 minutes. RSS aggregator.

gary Jan 12

there is a ton of info here, but I spider and it indexes so you can find what you are looking for quicker - this is about 7 GB #lib archive #index #nutch #solr #common crawl


gary Dec 18

@kajer you need a pipeline but also real time sentiment ranking, checksums for all files seen, cve, ioc - all of a sudden you went from stodgy siem to real world noc

i like the topic even though i do look at it sarcastically

i think you want combinatorials of top 10 db and real time #mentions. I think you are going to get people in federated enclaves to join together and work on problems but also make a typical image - everybody can optimize and get better by sharing. there will be a representative manifest of sw that will vary by sector #thomas register ocr scan and convert into semantic/vectordb/graphs #rss #solr #keywords #page rank... it reminds me of the site that correlates ip to domains and more - great osint info if found to be true - how do you advertise and make the site have rev streams - you can advertise to people trolling the sector - you need a template bots and spiders, get all the info you can and cache sites and then get it into a db - the real time part is basically a tagcloud #i ching #book of changes #backlinks
