This isn’t a fake “upload a PDF” button. It’s a real document-chat pipeline. Users upload documents, the app indexes them, and they get answers with citations. Here’s how it all works.
What Is RAG?
RAG stands for Retrieval-Augmented Generation. In plain English:
- Your app stores source material (like PDFs)
- When a user asks a question, the app finds the most relevant pieces
- Those pieces get passed into the model as context
- The model answers using that context
The Pipeline, Step by Step
Text extraction
The repo extracts readable text from the PDF so the AI system has something to work with.
Chunking
The text is split into smaller sections. Models and vector search work much better on focused chunks than one giant blob.
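A minimal chunker can be sketched as a fixed-size character window with some overlap so that a sentence cut at a boundary still appears whole in the next chunk. The repo's actual chunking strategy (token-based, sentence-aware, different sizes) may differ; the numbers here are illustrative:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size chunks (illustrative defaults)."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window reached the end of the text
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

Overlap trades a little storage for better retrieval: a fact that straddles a chunk boundary is still fully contained in at least one chunk.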
Embedding
Each chunk is turned into a vector embedding using OpenAI’s embedding model. Think of it as converting meaning into numbers.
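In code, this step pairs each chunk with its vector. A sketch is below; the embedder is injected as a plain function so the flow can be shown offline, and the OpenAI call in the comment (including the model name) is an assumption, not necessarily what the repo uses:

```python
from typing import Callable

# In the real app the embedder would call OpenAI, roughly like this
# (model name is an assumption; the client reads OPENAI_API_KEY from the env):
#
#   from openai import OpenAI
#   client = OpenAI()
#   def openai_embed(texts: list[str]) -> list[list[float]]:
#       resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
#       return [d.embedding for d in resp.data]

def embed_chunks(
    chunks: list[str],
    embed: Callable[[list[str]], list[list[float]]],
) -> list[tuple[str, list[float]]]:
    """Return (chunk, embedding) pairs, ready to insert into the vector store."""
    vectors = embed(chunks)  # one vector per chunk, in the same order
    return list(zip(chunks, vectors))
```

Batching all chunks into one embed call, as above, is also how you keep API usage (and cost) down compared with one request per chunk.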
Storage
Those embeddings are stored in PostgreSQL in the embeddings table, alongside document metadata.
Retrieval
When the user asks a question, the question is embedded too. The app runs similarity search to find the closest matching chunks.
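The ranking behind that search is just cosine similarity between the question vector and each stored chunk vector. In production pgvector does this in SQL (its <=> operator is cosine distance), but a pure-Python sketch shows the idea:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """1.0 means identical direction (same meaning), 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(question_vec: list[float],
          stored: list[tuple[str, list[float]]],
          k: int = 3) -> list[str]:
    """Return the k chunk texts whose vectors are closest to the question."""
    ranked = sorted(stored,
                    key=lambda cv: cosine_similarity(question_vec, cv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The database version scales this with an index instead of a full sort, but the result is the same: the chunks most semantically similar to the question.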
Prompt assembly
The best chunks are assembled into prompt-ready context and passed into the chat generation flow.
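Assembly itself is string work. A sketch is below; the template and separator are illustrative, not the app's exact prompt:

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Join retrieved chunks into a grounded prompt (illustrative template)."""
    context = "\n\n---\n\n".join(chunks)  # separator keeps chunk boundaries visible
    return (
        "Answer the question using only the context below. "
        "Cite the context you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The explicit "use only the context" instruction is what keeps answers grounded in the documents rather than the model's general knowledge.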
The Vector Database
The vector database is what makes semantic search possible. Instead of just storing raw text, the app stores:
- The original chunk of text
- Metadata (document ID, page number, etc.)
- A vector embedding for that chunk
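Laid out in Postgres, those three pieces could look roughly like this. Only the embeddings table name comes from this guide; the column names and the 1536-dimension vector (which matches OpenAI's text-embedding-3-small) are assumptions for illustration:

```sql
-- Illustrative schema; the repo's actual columns may differ.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE embeddings (
  id          bigserial PRIMARY KEY,
  document_id bigint NOT NULL,       -- which uploaded document this chunk came from
  page_number int,                   -- metadata used for citations
  content     text NOT NULL,         -- the original chunk of text
  embedding   vector(1536) NOT NULL  -- dimension is an assumption (text-embedding-3-small)
);

-- Retrieval then becomes one query; <=> is pgvector's cosine-distance operator
-- and $1 is the embedded question:
--   SELECT content, document_id, page_number
--   FROM embeddings
--   ORDER BY embedding <=> $1
--   LIMIT 5;
```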
pgvector is a Postgres extension enabled at the database level. Your host must support CREATE EXTENSION vector for this to work.
What You Can Build With This
This setup is perfect for:
- PDF question answering
- Internal knowledge assistants
- Customer support bots grounded in your docs
- Contract or policy lookup
- Document-aware research workflows
Common Mistakes
Uploading PDFs but never indexing them
Upload and indexing are separate steps. Make sure the indexing pipeline completes before expecting RAG answers.
Expecting RAG to work without storage
You need object storage configured so the app can store the uploaded files. See the Storage guide.
Forgetting that embeddings need OpenAI
The embedding pipeline uses OpenAI regardless of which chat model the user picks. Set your OPENAI_API_KEY.
Assuming any Postgres host supports pgvector
Not every provider has pgvector enabled. Supabase supports it out of the box. Check your host’s docs if you’re using something else.
Expecting perfect answers every time
RAG improves grounded answers, but it doesn’t replace good prompts, reasonable chunking, or careful product design. It gives the model better context — it doesn’t magically make every answer perfect.

