FastRAG has added a "Chat with Website" feature (V1.3)! You can now chat with content from web pages (such as Wikipedia or documentation) in addition to PDF files. The app uses Cheerio to scrape pages faster and splits the text into chunks to preserve context better.

#FastRAG #RAGapp #WebScraping #AI #TechUpdate #Cheerio #LangChain
#ChatWithWebsite #AIApp #NewTechnology #AIDevelopment

https://www.reddit.com/r/SaaS/comments/1pbdh09/chat_with_website_functionality_to_my_rag_app/
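
For readers wondering what "splitting text into chunks to preserve context" can look like, here is a minimal sketch in plain JavaScript. It is not FastRAG's code: the function name `chunkText`, the character-based sizes, and the default values are all assumptions made for illustration. The idea is that consecutive chunks share a few characters of overlap, so a sentence cut at a chunk boundary still appears whole in one of its neighbours.

```javascript
// Hypothetical helper (not from FastRAG): split text into fixed-size chunks
// where each chunk repeats the last `overlap` characters of the previous one.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in the range [0, chunkSize)");
  }

  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text,
    // so we do not emit a trailing chunk that is pure overlap.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Small demonstration with tiny sizes so the overlap is easy to see.
const sample = "abcdefghijklmnopqrstuvwxyz";
console.log(chunkText(sample, 10, 3));
// → [ 'abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz' ]
```

A real pipeline would more likely split on sentence or paragraph boundaries rather than raw character counts; the fixed window here just keeps the sketch short.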

I tend to both Like and Rewoot things. It makes me feel old for some reason. "Ah yes, marvelous post. Here is your heart emoji tip and I will be sure to write a glowing review in the paper."


#tip-tip #cheerio #I'm-not-even-british
Happy Sunday from a Cheerio 
#corgisofmastodon #corgi #cheerio
#corgisofmastodon #corgi #cheerio
Good morning from a cheerio 

While working on #pdf4anki, I integrated a feature that automatically adds opinionated #CSS to the initial #HTML file with #cheerio. This had the unexpected side effect of destroying all the self-closing HTML tags, thus breaking the pattern matching.

The fun of taking apart the code to #debug this issue.
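
One way to guard against this kind of breakage is to make the pattern accept both spellings of a void element, so it keeps matching whether the file contains `<br/>`, `<br />`, or `<br>`. The sketch below is only an illustration: the post does not show the actual patterns used in pdf4anki, so the tag list, the regular expression, and the function name `findVoidTags` are assumptions.

```javascript
// Hypothetical pattern: matches a handful of void elements whether or not
// they carry a trailing slash. Group 1 is the tag name, group 2 the raw attributes.
const VOID_TAG = /<(br|hr|img|input|meta|link)\b([^>]*?)\s*\/?>/gi;

// Return every void tag found in the markup as { tag, attrs } objects.
function findVoidTags(html) {
  return Array.from(html.matchAll(VOID_TAG), (match) => ({
    tag: match[1].toLowerCase(),
    attrs: match[2].trim(),
  }));
}

// The same tags are found before and after a serializer drops the slashes.
const before = '<p>a<br/>b<img src="x.png" /></p>';
const after = '<p>a<br>b<img src="x.png"></p>';

console.log(findVoidTags(before));
console.log(findVoidTags(after));
// Both calls log: [ { tag: 'br', attrs: '' }, { tag: 'img', attrs: 'src="x.png"' } ]
```

Regular expressions remain a fragile way to read HTML in general (an attribute value containing `>` would defeat this one), so this only helps when the input format is known and fairly uniform.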

As neat as #jquery or #cheerio is, I miss the abilities of #VanillaJavaScript in the browser.

I don't remember how many times I tried to grab certain properties, which would have been available in the browser, but don't exist in cheerio.

And it is a bit annoying to constantly wrap HTML elements in the cheerio wrapper class just to get access to the functionality it offers. So instead I grabbed the minimum viable data and worked with plain arrays from there.

#javascript

After struggling to get #python #PyMuPDF to work and being close to the deadline, I shifted to using a combination of other commands.

First using the #linux #pdftohtml command, which is so much faster than PyMuPDF and packages the result similar to saving a website.

Next, with #NeoVim and #RegEx, I format the #HTML file so it can be quickly processed with #NodeJs #cheerio and eventually saved, by way of #json, in #sqlite.

Is it elegant and automatic? No, though it works!

#JavaScript

Thanks to everyone who has contributed to the #Cheerio HTML/XML manipulation library. I've been using it for over a decade but only just noticed that it has reached the 1.0 milestone (after seven years of release candidates).
https://cheerio.js.org/blog/cheerio-1.0
Cheerio 1.0 Released, Batteries Included 🔋 | cheerio

Cheerio 1.0 is out! After 12 release candidates and just a short seven years