This isn’t a fake “upload a PDF” button. It’s a real document-chat pipeline. Users upload documents, the app indexes them, and they get answers with citations. Here’s how it all works.

What Is RAG?

RAG stands for Retrieval-Augmented Generation. In plain English:
  1. Your app stores source material (like PDFs)
  2. When a user asks a question, the app finds the most relevant pieces
  3. Those pieces get passed into the model as context
  4. The model answers using that context
This is how the chat app answers questions about your documents — not just what the model already knew.
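
The four steps above can be sketched as a single function. This is purely illustrative: `retrieve` and `generate` are hypothetical stand-ins for the real vector search and model call, not names from the repo.

```typescript
// Minimal retrieve-then-generate loop. `retrieve` and `generate` are
// hypothetical stand-ins for the real vector search and model call.
type Chunk = { text: string; source: string };

async function answerWithRag(
  question: string,
  retrieve: (q: string) => Promise<Chunk[]>,     // step 2: find relevant pieces
  generate: (prompt: string) => Promise<string>  // step 4: model answers
): Promise<string> {
  const chunks = await retrieve(question);
  // Step 3: pass the retrieved pieces into the model as context.
  const context = chunks.map((c) => `[${c.source}] ${c.text}`).join("\n");
  return generate(`Context:\n${context}\n\nQuestion: ${question}`);
}
```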

The Pipeline, Step by Step

1. Upload
The user uploads a PDF in chat. The file is stored and tracked in the database.

2. Text extraction
The app extracts readable text from the PDF so the model has something to work with.

3. Chunking
The text is split into smaller sections. Models and vector search work much better on focused chunks than on one giant blob.
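
A simple fixed-size chunker with overlap illustrates the idea. The sizes here are illustrative, not values from the repo; real pipelines often split on sentence or paragraph boundaries instead.

```typescript
// Split text into overlapping, fixed-size chunks. Sizes are illustrative;
// the overlap keeps a bit of context across chunk boundaries.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
    start += chunkSize - overlap;
  }
  return chunks;
}
```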

4. Embedding
Each chunk is turned into a vector embedding using OpenAI’s embedding model. Think of it as converting meaning into numbers.
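
Embedding is a single request to OpenAI’s embeddings endpoint, and chunks are typically sent in batches. The batch size and model name below are assumptions for illustration, not values read from the repo.

```typescript
// Pure helper: split chunks into batches before embedding them.
// The batch size is an assumption, not a value from the repo.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Each batch would then go to the embeddings endpoint, roughly:
//
//   const res = await client.embeddings.create({
//     model: "text-embedding-3-small", // assumed model
//     input: batch,
//   });
//   // res.data[i].embedding is a number[] for batch[i]
```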

5. Storage
Those embeddings are stored in PostgreSQL in the embeddings table, alongside document metadata.

6. Retrieval
When the user asks a question, the question is embedded too. The app runs a similarity search to find the closest matching chunks.
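
“Closest” here means smallest vector distance. With pgvector the database does this work (for example, `ORDER BY embedding <=> $1 LIMIT 5` uses cosine distance), but the underlying math is just cosine similarity:

```typescript
// Cosine similarity: 1.0 means same direction (same meaning), values near 0
// mean unrelated. pgvector's <=> operator returns the cosine *distance*,
// which is 1 minus this value.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```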

7. Prompt assembly
The best chunks are assembled into prompt-ready context and passed into the chat generation flow.
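
A sketch of that assembly step: number each chunk and keep its source metadata alongside, so the model can cite [1], [2], … and the UI can resolve those markers later. The field names are hypothetical, not taken from the repo.

```typescript
// Assemble retrieved chunks into prompt-ready context. Numbered markers let
// the model cite sources the UI can resolve. Field names are hypothetical.
type RetrievedChunk = { text: string; documentId: string; page: number };

function buildContext(chunks: RetrievedChunk[]): string {
  return chunks
    .map((c, i) => `[${i + 1}] (doc ${c.documentId}, p. ${c.page}) ${c.text}`)
    .join("\n\n");
}
```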

8. Citations
The answer includes source metadata so the UI can show citations back to the user. No more “trust me, bro.”

The Vector Database

The vector database is what makes semantic search possible. Instead of just storing raw text, the app stores:
  • The original chunk of text
  • Metadata (document ID, page number, etc.)
  • A vector embedding for that chunk
That embedding is a numeric representation of meaning. It lets the app find chunks that are conceptually similar to a question, even when the wording is completely different.
This is all implemented with PostgreSQL + pgvector + OpenAI embeddings.
pgvector is a Postgres extension enabled at the database level. Your host must support CREATE EXTENSION vector for this to work.
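
As a sketch of what storage looks like: pgvector accepts vectors as bracketed literals like `'[0.1,0.2,...]'`. The DDL in the comment below uses hypothetical column names (1536 dimensions matches OpenAI’s text-embedding-3-small), not the repo’s actual schema.

```typescript
// pgvector accepts a vector as a bracketed literal string. This helper
// serializes a number[] for an INSERT. The DDL below is a sketch with
// hypothetical column names (1536 dims matches text-embedding-3-small):
//
//   CREATE EXTENSION IF NOT EXISTS vector;
//   CREATE TABLE embeddings (
//     id          bigserial PRIMARY KEY,
//     document_id text,
//     page        int,
//     content     text,
//     embedding   vector(1536)
//   );
function toPgVector(v: number[]): string {
  return `[${v.join(",")}]`;
}
```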

What You Can Build With This

This setup is perfect for:
  • PDF question answering
  • Internal knowledge assistants
  • Customer support bots grounded in your docs
  • Contract or policy lookup
  • Document-aware research workflows
It turns the chat app from “just another assistant” into something that works with user-provided knowledge.

Common Mistakes

  • Upload and indexing are separate steps. Make sure the indexing pipeline completes before expecting RAG answers.
  • You need object storage configured so the app can store the uploaded files. See the Storage guide.
  • The embedding pipeline uses OpenAI regardless of which chat model the user picks. Set your OPENAI_API_KEY.
  • Not every provider has pgvector enabled. Supabase supports it out of the box. Check your host’s docs if you’re using something else.
  • RAG improves grounded answers, but it doesn’t replace good prompts, reasonable chunking, or careful product design. It gives the model better context; it doesn’t magically make every answer perfect.