Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 3:

Just completed a peer review of a project that combines semantic search, keyword fallback, and LLM reasoning to produce grounded, cited answers, with a friendly Streamlit UI and a built-in monitoring dashboard, as described in the README!

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 2:

Reviewed a comprehensive submission and got to learn the practical implementation of OpenTelemetry instrumentation with Arize Phoenix for real-time observability, trace tracking, and user-feedback collection.

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Peer Review 1:

Reviewed an amazing project which incorporates:
1. Design-pattern-driven modularity; a very well-architected project.
2. Object-Relational Mapping (ORM) to decouple the choice of database implementation.
3. A crafty decoupling of request serving from the ingestion pipeline via a task queue, enabling asynchronous processing and better scalability.

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Completed my RAG pipeline using:

1. BAAI/bge-base-en-v1.5 with FastEmbed for embeddings
2. Qdrant Cloud as the vector database
3. OpenAI gpt-4o-mini as the LLM for answer generation

Next, I'll serve the pipeline with FastAPI, fronted by a Streamlit app hosted on Streamlit Cloud.

Also, monitoring with Grafana is on the way!
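The "augment" step of the pipeline above can be sketched as a small prompt builder that stitches the Qdrant search hits and the user query into one grounded prompt for gpt-4o-mini. This is my illustrative sketch, not the project's actual code; `build_augmented_prompt` is a name I'm assuming:

```python
def build_augmented_prompt(contexts: list[str], question: str) -> str:
    """Join retrieved chunks into a numbered CONTEXT block, then append the question."""
    context_block = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(contexts, start=1)
    )
    return (
        "Answer the QUESTION using only the CONTEXT below.\n"
        "If the answer is not in the context, say so.\n\n"
        f"CONTEXT:\n{context_block}\n\n"
        f"QUESTION: {question}"
    )
```

The numbered `[i]` markers make it easy for the LLM to cite which chunk an answer came from.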

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

Learned that the temperature parameter really needs to be set differently for different functions in a RAG system.

For the main prompt to the LLM, which combines the context(s) retrieved from Qdrant with the user query, it should not be more than 0.2. Well, that is what I got the best results with.

But for prompts meant for query rewriting, the best deterministic results seem to come when it's set to zero.
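One way to keep these per-function temperatures in one place is a small lookup that builds the keyword arguments for an OpenAI `chat.completions.create` call. A minimal sketch, assuming the two call types above; `chat_request` is a helper name I'm making up:

```python
# Temperatures tuned per call type, per the observations above:
# grounded answering tolerates a little variation, query rewriting should not.
TEMPERATURES = {"answer": 0.2, "rewrite": 0.0}

def chat_request(call_type: str, messages: list[dict], model: str = "gpt-4o-mini") -> dict:
    """Build the kwargs dict for an OpenAI chat-completions call of the given type."""
    return {
        "model": model,
        "temperature": TEMPERATURES[call_type],
        "messages": messages,
    }
```

Usage would be `client.chat.completions.create(**chat_request("rewrite", messages))`, so the temperature can never silently drift between the two code paths.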

Update on my capstone project in the #llmzoomcamp @DataTalksClub:

For any RAG system worth its salt, a follow-up question should not return a 'NOT IN CONTEXT' answer to the user.

For this, I plan a mechanism to determine whether a question is a follow-up. If it is, we execute query rewriting with a separate call to the LLM, passing:
a. the user's last question
b. the LLM's last response
c. the user's current follow-up query

The rewritten query is then used to fetch chunks from the Qdrant database.
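The rewriting call described above can be sketched as a prompt template over those three inputs. This is my illustrative sketch of the idea, not the project's actual prompt; `build_rewrite_prompt` is an assumed name:

```python
REWRITE_TEMPLATE = """You rewrite follow-up questions into standalone search queries.

Previous question: {last_question}
Previous answer: {last_answer}
Follow-up: {follow_up}

Rewrite the follow-up as a single self-contained question."""

def build_rewrite_prompt(last_question: str, last_answer: str, follow_up: str) -> str:
    """Fill the template with the last exchange plus the new follow-up query."""
    return REWRITE_TEMPLATE.format(
        last_question=last_question,
        last_answer=last_answer,
        follow_up=follow_up,
    )
```

The LLM's output (a standalone question) is what gets embedded and sent to Qdrant, so pronouns like "it" or "that" in the follow-up no longer break retrieval.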

Update on my capstone project in the #llmzoomcamp @DataTalksClub

Ground rules for chunking a single YouTube chapter transcript:

1. If the chapter is under 1000 words, do nothing.

2. If the chapter is over 1000 words (1001-1500), it is broken into 2 parts with a 200-word overlap between chunks.

3. If the chapter is over 1500 words (1501-2000), it is broken into 3 parts with a 200-word overlap between chunks.

And so on with this logic. The results work like a charm!
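The rules above can be sketched as a small word-count-driven splitter. This is my own sketch under a few assumptions: a chapter of exactly 1000 words stays whole, chunks are lists of words, and `num_parts` / `split_chapter` are names I'm inventing:

```python
import math

OVERLAP = 200  # words shared between consecutive chunks

def num_parts(word_count: int) -> int:
    """Up to 1000 words: keep the chapter whole; each extra 500 words adds one part."""
    if word_count <= 1000:
        return 1
    return 1 + math.ceil((word_count - 1000) / 500)

def split_chapter(words: list[str], overlap: int = OVERLAP) -> list[list[str]]:
    """Split a chapter's words into overlapping chunks per the rules above."""
    parts = num_parts(len(words))
    if parts == 1:
        return [words]
    # Pick a chunk size so consecutive chunks share `overlap` words
    # while the chunks together still cover every word.
    size = math.ceil((len(words) + overlap * (parts - 1)) / parts)
    step = size - overlap
    return [words[i * step : i * step + size] for i in range(parts)]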

Update on my capstone project in the #llmzoomcamp @DataTalksClub.

Chunking YouTube transcripts by chapter for 'upsertion' to the vector database is a good strategy. But what if there is only one chapter? What worked for me is chunking that single chapter with word overlap and naming each created chunk as a numbered part, e.g.
{
  ...,
  "chapter_title": "How to survive a Tsunami - Part 1",
  ...
},

{
  ...,
  "chapter_title": "How to survive a Tsunami - Part 2",
  ...
},
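The naming step can be sketched in a few lines, assuming the single chapter has already been split into chunks; `name_parts` is a helper name I'm making up for illustration:

```python
def name_parts(chapter_title: str, chunks: list[str]) -> list[dict]:
    """Tag each chunk of a single-chapter video as '<title> - Part N'."""
    return [
        {"chapter_title": f"{chapter_title} - Part {i}", "text": chunk}
        for i, chunk in enumerate(chunks, start=1)
    ]
```

Each dict would then be enriched with the rest of the metadata (video ID, timestamps, etc.) before upserting.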

Started working on the RAG pipeline for my capstone project in the #llmzoomcamp @DataTalksClub.

In this phase, I’ll be building the knowledge base for my RAG system using YouTube transcript data collected through the ingestion pipeline.

Key steps:
1️⃣ Semantic chunking of transcript chapters
2️⃣ Generating embeddings from the chunks
3️⃣ Upserting embeddings into Qdrant vector DB

Completed the ingestion pipeline for my capstone project for the #llmzoomcamp @DataTalksClub. The pipeline builds a content catalog from the YouTube playlists listed in the properties file. For each video in the catalog, the manual transcript is extracted; where no manual transcript is available, the audio track is downloaded and a transcript is produced from it using Automatic Speech Recognition (ASR). The pipeline orchestrator is Prefect.
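The manual-transcript-or-ASR decision at the heart of this pipeline can be sketched as a simple fallback, with the actual Prefect tasks passed in as callables. The helper names `fetch_manual` and `asr_from_audio` are stand-ins I'm assuming, not real APIs:

```python
from typing import Callable, Optional

def get_transcript(
    video_id: str,
    fetch_manual: Callable[[str], Optional[str]],
    asr_from_audio: Callable[[str], str],
) -> str:
    """Prefer the manually created transcript; fall back to ASR on the audio track."""
    manual = fetch_manual(video_id)
    if manual is not None:
        return manual
    # No manual transcript: download audio and transcribe it with ASR.
    return asr_from_audio(video_id)
```

Keeping the two fetchers injectable like this makes the fallback trivial to unit-test without hitting YouTube or running an ASR model.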