Start small, expand later
For most projects, this is all you need to get going:

- OpenAI: covers chat, vision, images, audio, and structured output
- Storage: for file uploads
- Better Auth + PostgreSQL: for auth and data
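
One way to confirm the starter set is configured before you boot the app is a small startup check. The sketch below is illustrative only: the environment variable names (`OPENAI_API_KEY`, `STORAGE_BUCKET`, `BETTER_AUTH_SECRET`, `DATABASE_URL`) and the helper `missingStarterVars` are assumptions, not names confirmed by this project, so substitute whatever your own environment file defines.

```typescript
// Hypothetical variable names: replace them with the ones your project actually uses.
const STARTER_VARS: Record<string, string[]> = {
  OpenAI: ["OPENAI_API_KEY"],
  Storage: ["STORAGE_BUCKET"], // placeholder name for a storage setting
  "Better Auth + PostgreSQL": ["BETTER_AUTH_SECRET", "DATABASE_URL"],
};

// Returns the names of starter variables that are unset or blank in `env`.
function missingStarterVars(env: Record<string, string | undefined>): string[] {
  const missing: string[] = [];
  for (const service of Object.keys(STARTER_VARS)) {
    for (const name of STARTER_VARS[service]) {
      const value = env[name];
      if (value === undefined || value.trim() === "") {
        missing.push(name);
      }
    }
  }
  return missing;
}
```

In a Node.js app you would typically pass `process.env` to the helper and log or throw if the returned list is not empty.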
OpenAI
Core chat, images, vision, audio, structured output. Your Swiss Army knife.
Anthropic
Claude models for chat. Great for reasoning and long-context tasks.
Google Gemini
Gemini models with search grounding and a strong default vision path.
Groq
Lightning-fast Llama models. Great for speed-sensitive features.
xAI
Grok models with reasoning capabilities.
DeepSeek
Additional reasoning and chat model options.
Replicate
Image, video, and transcription workflows.
ElevenLabs
Text-to-speech with 1000+ voices in 26+ languages.
Provider to feature mapping
Here’s a quick cheat sheet so you know exactly what each provider unlocks:

| Provider | What it powers |
|---|---|
| OpenAI | Core chat, structured generation, embeddings, image features |
| Anthropic | Claude chat support |
| Google Gemini | Gemini chat, search grounding, default vision path |
| Groq | Fast Llama models for chat and structured generation |
| xAI | Grok models for chat |
| DeepSeek | Additional reasoning/chat models |
| Replicate | Image generation, video generation, and audio transcription |
| ElevenLabs | Voice Studio (text-to-speech) |
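
If you want to query the mapping above from code, for example to decide which provider keys a given feature needs, a typed lookup is one option. This is a minimal sketch, not part of the project's API: the names `PROVIDER_FEATURES` and `providersFor` are hypothetical, and the feature labels are shorthand for the wording in the table.

```typescript
type Provider =
  | "OpenAI" | "Anthropic" | "Google Gemini" | "Groq"
  | "xAI" | "DeepSeek" | "Replicate" | "ElevenLabs";

type Feature =
  | "chat" | "structured generation" | "embeddings" | "image features"
  | "search grounding" | "vision" | "image generation"
  | "video generation" | "audio transcription" | "text-to-speech";

// Transcribed from the cheat sheet; the labels are illustrative shorthand.
const PROVIDER_FEATURES: Record<Provider, Feature[]> = {
  OpenAI: ["chat", "structured generation", "embeddings", "image features"],
  Anthropic: ["chat"],
  "Google Gemini": ["chat", "search grounding", "vision"],
  Groq: ["chat", "structured generation"],
  xAI: ["chat"],
  DeepSeek: ["chat"],
  Replicate: ["image generation", "video generation", "audio transcription"],
  ElevenLabs: ["text-to-speech"],
};

// Lists every provider that the table associates with the given feature.
function providersFor(feature: Feature): Provider[] {
  return (Object.keys(PROVIDER_FEATURES) as Provider[]).filter(
    (provider) => PROVIDER_FEATURES[provider].indexOf(feature) !== -1
  );
}
```

Calling `providersFor("structured generation")` returns the providers whose table row mentions structured generation, which is a quick way to see your options before adding another API key.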

