Build an End-to-End RAG System

Completed

February 2026

Build an AI knowledge worker that uses RAG to become an expert on all company-related matters. The knowledge base is a set of Markdown documents that are chunked and indexed for retrieval as part of the RAG pipeline.

Key Concepts:

LangChain Text Splitters to create document chunks, experimenting with different chunking strategies: recursive text, Markdown, semantic, and hierarchical chunking
HuggingFaceEmbeddings and OpenAIEmbeddings as embedding models
Chroma as the vector store for fast retrieval; an open-source option was preferred over paid offerings such as Pinecone or Weaviate
A RAG pipeline built with LangChain, using its Retriever abstraction over the Chroma vector store
Evaluation of the RAG system components, with a UI for visualizing performance
- Creation of ground-truth question/answer pairs in JSONL format
- Measuring retrieval (MRR, nDCG, keyword match, Recall@K, Precision@K)
- Measuring generation with LLM-as-a-judge
Query rewriting and expansion before retrieval
Re-ranking, using an LLM to sub-select from the retrieved results
Graph RAG to retrieve content closely related to similar documents
Agentic RAG, using agents for retrieval combined with memory and tools such as SQL
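
The retrieval metrics listed above (MRR, nDCG, Recall@K, Precision@K) can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's evaluation code: the function names, the use of chunk IDs as strings, and the binary relevance assumption (a chunk is either relevant or not) are all assumptions made for the example.

```python
import math


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per question."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in relevant) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def ndcg_at_k(retrieved, relevant, k):
    """nDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, d in enumerate(retrieved[:k], start=1)
        if d in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

In use, each ground-truth question would supply the set of chunk IDs considered relevant, and the retriever would supply the ranked list; the per-question scores are then averaged across the test set.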

Pipeline Components:

Create chunks
Create embeddings
Add query rewriting
Add query expansion
Retrieve context
Re-rank retrieved chunks
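
The six steps above can be sketched end to end with the standard library only. This is a toy illustration under stated assumptions, not the project's implementation: bag-of-words counts stand in for a real embedding model, an in-memory list stands in for Chroma, a regex and a synonym table stand in for LLM-based query rewriting and expansion, and term overlap stands in for an LLM re-ranker. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into word windows of max_words that share `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words, chunks, start = text.split(), [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += max_words - overlap
    return chunks


def embed(text):
    """Toy embedding (assumption): lowercase token counts, not a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    """In-memory stand-in for a vector store: a list of (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk_text(doc)]


def rewrite_query(query):
    """Stand-in for LLM rewriting: strip a leading filler phrase."""
    return re.sub(r"^(please|hey|can you tell me)[,\s]+", "", query.strip(), flags=re.I)


def expand_query(query, synonyms):
    """Stand-in for LLM expansion: add one variant per known synonym."""
    lowered, variants = query.lower(), [query]
    for term, alternatives in synonyms.items():
        pattern = rf"\b{re.escape(term)}\b"
        if re.search(pattern, lowered):
            variants += [re.sub(pattern, alt, lowered) for alt in alternatives]
    return variants


def retrieve(queries, index, k=3):
    """Score every chunk against every query variant and keep the best k."""
    best = {}
    for q in queries:
        q_vec = embed(q)
        for chunk, vec in index:
            best[chunk] = max(best.get(chunk, 0.0), cosine(q_vec, vec))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [chunk for chunk, score in ranked if score > 0][:k]


def rerank(query, chunks, top_n=2):
    """Stand-in for an LLM re-ranker: order by number of shared terms."""
    q_terms = set(embed(query))
    return sorted(chunks, key=lambda c: len(q_terms & set(embed(c))), reverse=True)[:top_n]


def run_pipeline(query, index, synonyms, k=3, top_n=2):
    rewritten = rewrite_query(query)
    candidates = retrieve(expand_query(rewritten, synonyms), index, k=k)
    return rerank(rewritten, candidates, top_n=top_n)
```

Calling `run_pipeline(question, build_index(docs), synonyms)` returns the re-ranked chunks that would be passed to the LLM as context. Expansion runs before retrieval so that a query using "holiday" can still match a chunk that says "vacation"; re-ranking runs after retrieval so that the more expensive scoring step only sees a handful of candidates.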

RAG, GraphRAG, AgenticRAG, LangChain, Evals, LLM-as-a-Judge, MRR, NDCG, Vector Store, Embeddings, Python

Repository →