This repo includes a real document-chat pipeline, not just a fake “upload a PDF” button. Users can attach documents, index them, ask questions, and get answers with citations.

First, What Is RAG?

RAG stands for retrieval-augmented generation. In simple terms, it means:
  1. your app stores useful source material, like PDFs
  2. when the user asks a question, the app finds the most relevant pieces
  3. those pieces are passed into the model as context
  4. the model answers using that context
This is how the chat app can answer questions about your documents, instead of relying only on what the model learned during training.

What The Vector Database Does

The vector database is what makes “semantic search” possible. Instead of storing only raw text, the app stores:
  • the original chunk of text
  • metadata such as document and page information
  • a vector embedding for that chunk
That embedding is a numeric representation of meaning. It lets the app find chunks that are conceptually similar to a question, even when the wording is different. In AnotherWrapper, this is implemented with:
  • Supabase PostgreSQL
  • pgvector
  • OpenAI embeddings
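"Conceptually similar" boils down to comparing vectors. pgvector does this comparison inside Postgres, but the underlying math is cosine similarity. A minimal sketch, purely to illustrate the idea (the function name is not from the repo):

```typescript
// Cosine similarity: 1 means the vectors point the same way (very similar
// meaning), 0 means unrelated, -1 means opposite. pgvector computes this
// in SQL; this TypeScript version just shows the math.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two chunks about refunds will have embeddings with high cosine similarity even if one says "money back" and the other says "reimbursement", which is exactly what keyword search misses.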

What The User Experience Looks Like

From a user’s point of view, the flow is simple:
  1. upload a PDF in chat
  2. let the app index it
  3. ask a question in normal language
  4. get an answer with source citations
Under the hood, several things happen to make that feel easy.

How The Pipeline Works

1. Upload

The PDF file is uploaded and tracked in the database.

2. Text extraction

The repo extracts readable text from the PDF so the AI system has something it can work with.

3. Chunking

The extracted text is split into smaller sections. This matters because models and vector search work much better on smaller chunks than one giant document.
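A minimal sketch of size-based chunking with overlap (the repo's actual logic lives in lib/rag/chunking.ts; the sizes and function name here are illustrative assumptions):

```typescript
// Split text into fixed-size chunks with a small overlap, so a sentence
// that straddles a chunk boundary still appears intact in one chunk.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```

The overlap is a deliberate trade-off: it duplicates a little storage in exchange for not losing meaning at chunk boundaries.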

4. Embedding

Each chunk is turned into an embedding using OpenAI’s embedding model.

5. Storage

Those embeddings are stored in Supabase in the embeddings table, alongside document metadata.
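Steps 4 and 5 can be sketched together. This is a hypothetical version using plain fetch against the OpenAI embeddings endpoint; the model name, column names, and helper names are assumptions, not values taken from the repo (the real code is in lib/rag/embedding-ingest.ts):

```typescript
// Embed a batch of chunks via the OpenAI REST API. One request can embed
// all chunks at once, which is cheaper than one call per chunk.
async function embedChunks(chunks: string[], apiKey: string): Promise<number[][]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: chunks }),
  });
  const json = await res.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}

// Pure helper: pair each chunk with its embedding as a row for the
// embeddings table (column names are illustrative). The rows are then
// inserted into Supabase, where the embedding column is a pgvector.
function buildRows(documentId: string, chunks: string[], embeddings: number[][]) {
  return embeddings.map((embedding, i) => ({
    document_id: documentId,
    chunk_index: i,
    content: chunks[i],
    embedding,
  }));
}
```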

6. Retrieval

When the user asks a question, the question is embedded too. The app then runs similarity search to find the closest matching chunks.
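After the similarity search returns scored rows, the app still has to decide which ones are good enough to use. A sketch of that selection step, with an assumed row shape and illustrative defaults (the real query and thresholds live in lib/rag/retrieve.ts):

```typescript
// Hypothetical shape of one retrieved row from the similarity search.
interface ScoredChunk {
  content: string;
  page: number;
  similarity: number; // 1 = near-identical meaning, 0 = unrelated
}

// Keep only chunks that are actually relevant, best-first.
// The threshold and limit are illustrative defaults, not repo values.
function selectContext(rows: ScoredChunk[], minSimilarity = 0.75, limit = 5): ScoredChunk[] {
  return rows
    .filter((r) => r.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}
```

The threshold matters: without it, a question the documents cannot answer still pulls in the "least bad" chunks, and the model may hallucinate around them.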

7. Prompt injection

The best chunks are assembled into prompt-ready context and passed into the chat generation flow.
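A minimal sketch of what "prompt-ready context" can look like. The exact template is the repo's own (see chat-generation.ts); this wording is illustrative:

```typescript
// Assemble retrieved chunks into a context block for the system prompt.
// Numbering each chunk lets the model refer back to specific sources.
function buildContextPrompt(chunks: { content: string; page: number }[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (page ${c.page}) ${c.content}`)
    .join("\n\n");
  return `Answer using only the context below. Cite sources as [n].\n\n${context}`;
}
```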

8. Citations

The answer includes source metadata so the UI can show citations back to the user.
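A minimal sketch of turning chunk metadata into a citation string for the UI. Field and function names are illustrative assumptions; the repo's actual formatting lives in lib/rag/citations.ts:

```typescript
// Metadata carried alongside each retrieved chunk.
interface SourceMeta {
  documentName: string;
  page: number;
}

// Render a human-readable citation the UI can show under an answer.
function formatCitation(meta: SourceMeta): string {
  return `${meta.documentName}, p. ${meta.page}`;
}
```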

Main Tables And Data

The exact schema is created by the Supabase migrations, but the important data concepts are:
  • pdf_documents for uploaded documents and indexing status
  • embeddings for chunk vectors and metadata
  • chat_document_links for linking documents to chat sessions
This is important because it means documents are not floating around as random files. They are part of a real, queryable data model.
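The three concepts above can be sketched as TypeScript shapes. These are illustrative: the real schema comes from the Supabase migrations, so actual column names and statuses may differ:

```typescript
// One uploaded document and where it is in the indexing process.
interface PdfDocument {
  id: string;
  file_name: string;
  indexing_status: "pending" | "indexed" | "failed"; // illustrative statuses
}

// One chunk of a document plus its vector.
interface EmbeddingRow {
  id: string;
  document_id: string; // references pdf_documents
  content: string;
  embedding: number[]; // stored as a pgvector column in Postgres
}

// Which documents are attached to which chat session.
interface ChatDocumentLink {
  chat_id: string;
  document_id: string;
}
```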

Where The Logic Lives

If you want to customize or debug the retrieval system, these files matter most:
  • lib/rag/pdf-extract.ts for PDF text extraction
  • lib/rag/chunking.ts for chunk creation
  • lib/rag/embedding-ingest.ts for embedding + persistence
  • lib/rag/retrieve.ts for semantic retrieval
  • lib/rag/citations.ts for citation formatting
  • app/(apps)/chat/api/chat/chat-generation.ts for injecting RAG into chat

Why OpenAI Is Still Important Here

Even if your users mostly chat with Claude, Gemini, Grok, or DeepSeek, the current document pipeline still uses OpenAI embeddings for indexing and retrieval. So if you want PDF RAG to work, you should treat OpenAI as required for this feature.

What You Need To Set It Up

At minimum, this feature needs:
  • an OpenAI API key, since the embedding calls go through OpenAI
  • a Supabase project with the pgvector extension enabled
  • the Supabase migrations applied, so the tables above actually exist

What Makes This Useful

This setup is useful when you want:
  • PDF question answering
  • internal knowledge assistants
  • customer support bots grounded in docs
  • contract or policy lookup
  • document-aware research workflows
It turns the chat app from “just another assistant” into something that can work with user-provided knowledge.

Common Mistakes

  • uploading PDFs but never indexing them
  • expecting RAG to work without storage
  • forgetting that embeddings currently depend on OpenAI
  • assuming the model will cite sources if retrieval did not find anything useful
  • treating RAG like a guarantee of truth instead of a retrieval aid
RAG makes answers better grounded, but it does not replace good prompts, reasonable chunking, or careful product design. It gives the model better context; it does not magically make every answer correct.