GPT-4o & vision
Build variations of the Image Description Generator in minutes
The Image Description Generator uses GPT-4o to return structured JSON output. Input details are defined in the toolConfig.ts file, and the output can be automatically displayed by the components/output/OutputLayout.tsx file, regardless of the JSON structure.
Image Description Generator using GPT-4o
Pre-requisites
To run the app, you must have Supabase and OpenAI set up. If you haven’t done this yet, please start with the setup guides listed here.
Supabase
Set up user authentication & PostgreSQL database using Supabase
OpenAI
Set up OpenAI & use its various models throughout your app
Storage
Set up audio, PDF and image storage using Cloudflare R2
That’s all you need to get the app running.
Review lib/types/toolconfig.ts to understand the various configuration fields in the demo app.
Building variations
The app is designed for easy customization and variation generation because of:
- Automatic input capturing
- Automatic output rendering, regardless of JSON structure
Prompt and JSON schema
Start by creating a new prompt and JSON schema for your app.
Use the existing prompts and JSON schemas for guidance. Follow the same principles and structure as in the demo prompts.
Your prompt.ts file should manage user input like this:
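As a minimal sketch only: the function name generatePrompt, the PromptInput type, the fallback value, and the prompt wording below are assumptions for illustration, not the demo app's actual code. The only name taken from this guide is the descriptionType input variable.

```typescript
// Hypothetical shape of the submitted form values.
// Only descriptionType is named in this guide.
interface PromptInput {
  descriptionType: string;
}

// Builds the instruction text sent to the model alongside the uploaded image.
function generatePrompt(input: PromptInput): string {
  // Fall back to a generic description when the field is left blank (assumed behavior).
  const descriptionType = input.descriptionType.trim() || "general";

  return [
    "Describe the attached image.",
    `Description type: ${descriptionType}.`,
    "Respond only with JSON that matches the provided schema.",
  ].join("\n");
}
```

Every variable interpolated into the prompt this way needs a matching form field, which is why the variable names matter in the next step.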
Take note of the input variables (descriptionType). We’ll use these later.
Automatic input capturing
Update toolConfig.ts to include:
- The input variables defined in the prompt
  - Input variables in prompt.ts should match name in toolConfig.ts. See example below.
- The button text for the form
- The type of model used
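A rough sketch of how such a configuration might look. The interface shapes and the property names aiModel, submitText, and fields are assumptions; the real definitions are in lib/types/toolconfig.ts. The missingFields helper is also hypothetical and is included only to make the name-matching rule concrete.

```typescript
// Hypothetical field definition; check lib/types/toolconfig.ts for the real one.
interface FormField {
  label: string;
  name: string; // must match the variable name used in prompt.ts
  type: "input" | "textarea" | "select";
  options?: string[];
  required?: boolean;
}

// Hypothetical config shape covering the three items listed above.
interface ToolConfig {
  type: string; // tool type, read by InputCapture
  aiModel: string; // the model used
  submitText: string; // the button text for the form
  fields: FormField[]; // the input variables defined in the prompt
}

const toolConfig: ToolConfig = {
  type: "vision",
  aiModel: "gpt-4o",
  submitText: "Generate description",
  fields: [
    {
      label: "Description type",
      name: "descriptionType",
      type: "select",
      options: ["short", "detailed"],
      required: true,
    },
  ],
};

// Returns the prompt variables that have no field with the same name.
function missingFields(config: ToolConfig, promptVariables: string[]): string[] {
  const names = new Set(config.fields.map((field) => field.name));
  return promptVariables.filter((variable) => !names.has(variable));
}
```

Calling missingFields(toolConfig, ["descriptionType"]) returns an empty array here. A non-empty result would point to a prompt variable that the form never collects.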
The InputCapture component will automatically include an upload form, upload the image to Cloudflare R2, return a URL, and send that URL to GPT-4o for analysis.
Ensure the type field in toolConfig.ts is specified correctly. InputCapture.tsx uses this to determine what to include (fields, file uploads) and which API endpoint to call.
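One way to picture this type-based switch is a simple lookup table. This is a sketch under assumptions, not the component's actual logic: the type values, the endpoint paths, and the names CapturePlan and planFor are all hypothetical.

```typescript
// Hypothetical tool types; the demo app may define different ones.
type ToolType = "vision" | "text";

// What the form should render and where it should submit.
interface CapturePlan {
  showFileUpload: boolean;
  endpoint: string; // illustrative paths, not the app's real routes
}

const plans: Record<ToolType, CapturePlan> = {
  vision: { showFileUpload: true, endpoint: "/api/vision" },
  text: { showFileUpload: false, endpoint: "/api/text" },
};

// Resolves the plan for a configured type and fails loudly on an unknown value.
function planFor(type: string): CapturePlan {
  if (!Object.prototype.hasOwnProperty.call(plans, type)) {
    throw new Error(`Unsupported tool type: ${type}`);
  }
  return plans[type as ToolType];
}
```

With a table like this, a misspelled type surfaces as an immediate error instead of a form that silently omits the upload control or posts to the wrong route.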
The page.tsx file in the /app folder of our demo app reads the data from toolConfig.ts and passes it to the InputCapture component, which automatically builds the form from it.
Automatic output rendering
The goal is to allow rapid app generation, input capture, and output display for fast prototyping. Then, you can refine the output display for a more polished presentation.
app/(apps)/vision/[id]/page.tsx will automatically fetch data from Supabase based on the uuid and render the JSON, no matter the structure.
The OutputLayout component handles all the heavy lifting, automatically fetching and displaying the JSON. Review it to understand how the rendering works.
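To illustrate how output can be displayed without knowing the JSON structure in advance, the sketch below walks any JSON value recursively and flattens it into labeled rows that a UI could render. It is not the OutputLayout implementation; the Json and Row types and the flattenJson function are hypothetical.

```typescript
// Any value that JSON.parse can return.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// One displayable line: where the value sits and what it is.
interface Row {
  path: string;
  value: string;
}

// Recursively walks the value and emits one row per primitive leaf.
// Empty arrays and empty objects produce no rows.
function flattenJson(value: Json, path: string = ""): Row[] {
  if (value === null || typeof value !== "object") {
    return [{ path, value: String(value) }];
  }

  const rows: Row[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      rows.push(...flattenJson(item, `${path}[${index}]`));
    });
  } else {
    for (const key of Object.keys(value)) {
      rows.push(...flattenJson(value[key], path ? `${path}.${key}` : key));
    }
  }
  return rows;
}

const rows = flattenJson({
  title: "Cat on a sofa",
  tags: ["cat", "indoor"],
  meta: { confidence: 0.9 },
});
```

For the sample object, rows holds four entries with the paths title, tags[0], tags[1], and meta.confidence. A new JSON schema changes the rows but not the rendering code.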