Hey, I've just completed the first module of LLM zoomcamp !
- https://github.com/afrancois-dev/llm-zoomcamp-2026
#llmzoomcamp
GitHub - afrancois-dev/llm-zoomcamp-2026: LLM Zoomcamp 2026

LLM Zoomcamp 2026. Contribute to afrancois-dev/llm-zoomcamp-2026 development by creating an account on GitHub.

GitHub
Learned #RAG and #Agents for using the #LLM Api with free #LLMZoomCamp course of #DataTalksClub .

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 3:

Just completed the peer project review of a project that combines semantic search, keyword fallback, and LLM reasoning to produce grounded, cited answers with a friendly Streamlit UI and a built-in monitoring dashboard as mentioned in the README!

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 2:

Reviewed a comprehensively submitted project and got to learn the practical implementation of OpenTelemetry instrumentation with Arize Phoenix for real-time observability, trace tracking, and user feedback collection.

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 1:

Reviewed an amazing project which incorporates:
1. Design Pattern structured modularity; very-well architectured project
2. Use of Object Relational Mapping decoupling choice of varied DB implementation.
3. The crafty decouplement of request/serving with from the ingestion pipeline with task queue enabling asynchronous processing and better scalability.

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Completed of my RAG pipeline using:

1. BAAI/bge-base-en-v1.5 for embedding with FastEmbed
2. Cloud Qdrant as the vector database
3. OpenAI gpt-4o-mini for LLM (augment)

Now would use FastAPI for serving the service from a Streamlit app hosted at Streamlit Cloud.

Also Monitoring with Graphana is on the way!!

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

So got to know the Temperature param is really important to be set different for different functions in a RAG system.

For the main prompt to the llm with context(s) from Qdrant along with user query, it should not be more than 0.2. Well, that is what I got the best results with.

But for prompts meant for query-rewriting, best deterministic results seem to come when it set to zero.

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

For any RAG system worth its salt, a follow-up question should not return a 'NOT IN CONTEXT' to the user.

For this I plan a mechanism to determine whether a question is a follow-up question. If it is, we should execute Query Re-writing with a sperate call to the LLM with
a. the user's last question
b. LLM's last response
c. user's current follow-up query

The re-rewritten query is now used to fetch chunks from Qdrant database.

Update on my capstone project in the #llmzoomcamp @DataTalksClub

Single YouTube Chapter transcript chunking ground logic-->>

1. If chapter less than 1000 words do nothing

2. if the chapter is more than 1000 words (1001-1500) , it would be broken into 2 parts with 200 words overlap between chunks.

3. if the chapter is more than 1500 words (1501-2000) , it would be broken into 3 parts with 200 words overlap between chunks.

SO ON & SO Forth with this logic. And the results work like a charm

Update on my capstone project in the #llmzoomcamp @DataTalksClub.

Chunking YouTube transcripts by Chapters for 'upsertion' to vector database is a good strategy. But what if there is only one chapter? What worked for me is chunking that single chapter with words overlap and naming each created chapter with a numbered part, e.g
{
.......,
"chapter_title" = "How to survive a Tsunami - Part 1",
...........
},

{
.......,
"chapter_title" = "How to survive a Tsunami - Part 2",
........
},