We’ll use LangChain, OpenAI, Cloudflare R2 storage & Supabase vector storage to create our own Chat with PDF app. You’ll have it running in 5 minutes!

Chat with your PDF and ask it questions

Pre-requisites

To build your own Chat with PDF app, you’ll need to have Supabase, OpenAI and storage (Cloudflare R2) set up. If you haven’t, please start by doing that.

OpenAI

Set up OpenAI & use its various models throughout your app

Once that is ready, you can go ahead and follow the steps below.

Database Setup

The Chat with PDF feature requires three tables that should already exist if you followed the Quick Setup guide and ran the Supabase migrations:

npx supabase db push

These tables are:

  1. conversations: Stores chat history and metadata for each PDF conversation
  2. documents: Manages PDF file information and links to conversations
  3. embeddings: Stores vector embeddings of PDF chunks for semantic search
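
To illustrate what "semantic search" over the embeddings table means, here is a minimal sketch of ranking stored chunks by cosine similarity to a query vector. The EmbeddingRow shape and the function names are assumptions made for illustration, not the actual schema or code from the migration file. In the app this ranking would most likely run inside the database rather than in application code.

```typescript
// Hypothetical shape of a row in the embeddings table.
// The real column names in the migration may differ.
interface EmbeddingRow {
  content: string; // text of one PDF chunk
  embedding: number[]; // vector representation of that chunk
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are closest to the query embedding.
function topMatches(
  query: number[],
  rows: EmbeddingRow[],
  k: number
): EmbeddingRow[] {
  return rows
    .map((row) => ({ row, score: cosineSimilarity(query, row.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((entry) => entry.row);
}
```

The chunks returned by a search like this are what would be passed to the model as context for answering a question about the PDF.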

If you haven’t run the migrations yet, please go back to the Quick Setup guide first. Alternatively, you can manually create the tables by running the SQL from this file in your Supabase SQL editor:

  • supabase/migrations/20240000000005_pdf.sql

The tables include Row Level Security (RLS) policies to ensure users can only access their own data.

App Structure

  1. app/api/(apps)/pdf/* Key API routes:

    • app/api/(apps)/pdf/chat/route.ts: Interacts with OpenAI for chatting. You can customize the model and prompt. It streams responses to the user and updates the conversations table in Supabase.
    • app/api/(apps)/pdf/upload/route.ts: Uploads PDFs to Cloudflare R2, returns a public URL, and updates the documents table. It also enforces the maximum number of files per user and deducts user credits.
    • app/api/(apps)/pdf/vectorize/route.ts: Splits the uploaded PDF into chunks, vectorizes it, and stores the data in Supabase with metadata.
    • app/api/(apps)/pdf/delete/route.ts: Deletes data from both Supabase and Cloudflare R2.
    • app/api/(apps)/pdf/externaldoc/route.ts: Handles online PDFs that are added without a file upload.
  2. app/(apps)/pdf/* Contains all front-end logic, including paywall checks and pages for chatting with different PDFs based on document ID.

  3. components/pdf/* Contains all front-end components for the PDF app, including the chat interface components. The chat components have been unified into this folder for better organization and reusability.
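
The chunking step that the vectorize route performs can be sketched as follows. This is a simplified, hypothetical illustration, not the route's actual code: the real implementation likely relies on a LangChain text splitter, and the function names, the ChunkRow shape, and the chunk size and overlap values here are all assumptions. It shows the general idea of cutting extracted PDF text into overlapping pieces and attaching metadata to each piece before it is embedded and stored.

```typescript
// Hypothetical shape of one chunk before it is embedded and stored.
interface ChunkRow {
  documentId: string; // links the chunk back to its PDF
  chunkIndex: number; // position of the chunk within the document
  content: string; // the chunk text itself
}

// Split text into fixed-size chunks that overlap by `overlap` characters,
// so a sentence cut at a boundary still appears whole in a neighbor chunk.
function splitIntoChunks(
  text: string,
  chunkSize: number,
  overlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be between 0 and chunkSize - 1");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Attach metadata to each chunk so search results can be traced to a PDF.
function toChunkRows(
  documentId: string,
  text: string,
  chunkSize = 1000,
  overlap = 200
): ChunkRow[] {
  return splitIntoChunks(text, chunkSize, overlap).map((content, index) => ({
    documentId,
    chunkIndex: index,
    content,
  }));
}
```

For example, splitIntoChunks("abcdefghij", 4, 1) yields the chunks "abcd", "defg" and "ghij". Larger overlaps keep more context per chunk at the cost of storing more embeddings.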

If you have any questions, feel free to reach out to me on Twitter or Discord!