- https://github.com/afrancois-dev/llm-zoomcamp-2026
#llmzoomcamp
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
Peer Review 3:
Just completed the peer project review of a project that combines semantic search, keyword fallback, and LLM reasoning to produce grounded, cited answers with a friendly Streamlit UI and a built-in monitoring dashboard as mentioned in the README!
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
Peer Review 2:
Reviewed a comprehensively submitted project and got to learn the practical implementation of OpenTelemetry instrumentation with Arize Phoenix for real-time observability, trace tracking, and user feedback collection.
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
Peer Review 1:
Reviewed an amazing project which incorporates:
1. Design Pattern structured modularity; very-well architectured project
2. Use of Object Relational Mapping decoupling choice of varied DB implementation.
3. The crafty decouplement of request/serving with from the ingestion pipeline with task queue enabling asynchronous processing and better scalability.
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
Completed of my RAG pipeline using:
1. BAAI/bge-base-en-v1.5 for embedding with FastEmbed
2. Cloud Qdrant as the vector database
3. OpenAI gpt-4o-mini for LLM (augment)
Now would use FastAPI for serving the service from a Streamlit app hosted at Streamlit Cloud.
Also Monitoring with Graphana is on the way!!
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
So got to know the Temperature param is really important to be set different for different functions in a RAG system.
For the main prompt to the llm with context(s) from Qdrant along with user query, it should not be more than 0.2. Well, that is what I got the best results with.
But for prompts meant for query-rewriting, best deterministic results seem to come when it set to zero.
Update on my capstone project in the #llmzoomcamp @DataTalksClub:
For any RAG system worth its salt, a follow-up question should not return a 'NOT IN CONTEXT' to the user.
For this I plan a mechanism to determine whether a question is a follow-up question. If it is, we should execute Query Re-writing with a sperate call to the LLM with
a. the user's last question
b. LLM's last response
c. user's current follow-up query
The re-rewritten query is now used to fetch chunks from Qdrant database.
Update on my capstone project in the #llmzoomcamp @DataTalksClub
Single YouTube Chapter transcript chunking ground logic-->>
1. If chapter less than 1000 words do nothing
2. if the chapter is more than 1000 words (1001-1500) , it would be broken into 2 parts with 200 words overlap between chunks.
3. if the chapter is more than 1500 words (1501-2000) , it would be broken into 3 parts with 200 words overlap between chunks.
SO ON & SO Forth with this logic. And the results work like a charm
Update on my capstone project in the #llmzoomcamp @DataTalksClub.
Chunking YouTube transcripts by Chapters for 'upsertion' to vector database is a good strategy. But what if there is only one chapter? What worked for me is chunking that single chapter with words overlap and naming each created chapter with a numbered part, e.g
{
.......,
"chapter_title" = "How to survive a Tsunami - Part 1",
...........
},
{
.......,
"chapter_title" = "How to survive a Tsunami - Part 2",
........
},