# Transcription Models
Local and cloud transcription models — Whisper, Parakeet, and cloud providers.
VivaDicta supports a wide range of transcription engines: local models that run entirely on your iPhone or iPad, and cloud providers that offer greater speed and accuracy at the cost of sending audio off-device.
## Local Models
Local models process audio directly on your device. No data leaves your iPhone, and they work without an internet connection.
- Whisper — OpenAI's Whisper model optimized for Apple hardware. Multiple model sizes available (we recommend Large Turbo). Best balance of accuracy and speed on modern iPhones.
- Parakeet — NVIDIA's speech recognition model running via FluidAudio. Fast and accurate, optimized for Apple Silicon.
## Cloud Providers
Cloud providers process audio on remote servers. They tend to be faster (especially on older devices) and often more accurate, but require an internet connection and an API key.
- Groq — free forever, ultra-fast Whisper on custom LPU hardware. Our #1 recommendation.
- Deepgram — Nova-3 model with excellent accuracy. $200 free credits for new accounts.
- ElevenLabs — Scribe v2 with support for 99 languages. Free tier available.
- Gemini — Google's multimodal model with speech transcription capabilities.
- Mistral — European AI provider with transcription support.
- Soniox — high-accuracy cloud transcription.
- OpenAI-compatible — connect any provider that supports the OpenAI Whisper API format.
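As a rough illustration of what "OpenAI-compatible" means, providers in this category accept the OpenAI Whisper API request shape: a multipart POST to an `/audio/transcriptions` path with a `file` and `model` field. The endpoint URL, API key, model name, and file name below are placeholders, not values for any specific provider:

```shell
# Sketch of the OpenAI Whisper API transcription request format.
# BASE_URL, API_KEY, the model name, and the file name are all
# placeholders — substitute your provider's actual values.
BASE_URL="https://api.example.com/v1"
API_KEY="sk-your-key-here"

# POST multipart/form-data to /audio/transcriptions with the audio
# file and a model name; the provider responds with JSON containing
# a "text" field holding the transcript.
REQUEST="curl -s $BASE_URL/audio/transcriptions \
  -H \"Authorization: Bearer $API_KEY\" \
  -F file=@recording.m4a \
  -F model=whisper-1"

# Print the assembled command rather than sending it, since running
# it requires a real provider URL and API key.
printf '%s\n' "$REQUEST"
```

Any provider whose endpoint accepts this request shape can be used via the OpenAI-compatible option.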
## Local vs Cloud
| | Local | Cloud |
|---|---|---|
| Cost | Free | Free tiers or pay-per-use |
| Privacy | Full — nothing leaves your device | Audio sent to provider servers |
| Internet | Not required | Required |
| Speed | Depends on device hardware | Consistently fast |
| Accuracy | Good to excellent | Excellent |
| Setup | Model download required | API key required |
## Language Support
Language support varies by model. Most models support 50-100+ languages. Whisper (local) and Groq (cloud, also using Whisper) support 100+ languages. ElevenLabs Scribe v2 supports 99 languages.
See Recommended Models for our top picks based on your use case.