MLX Models
Qwen 3.5, GPT OSS, Ministral — run open-source LLMs locally on Apple Silicon.
Apple MLX lets you run powerful open-source LLMs directly on your Mac for AI text processing — completely free, fully private, no internet required. Models are optimized for Apple Silicon with Metal GPU acceleration and managed directly inside VivaDicta with one-click download.
Qwen 3.5 — Latest Models
VivaDicta includes Alibaba's newest Qwen 3.5 models, one of the strongest open-source LLM families available. Qwen 3.5 delivers excellent quality for text processing, translation, summarization, and coding tasks.
| Model | Best For | Min RAM |
|---|---|---|
| Qwen 3.5 4B | Fast processing on any Apple Silicon Mac | 16 GB |
| Qwen 3.5 9B | Best balance of speed and quality | 32 GB |
| Qwen 3.5 27B | High-quality results, near cloud-level | 48 GB |
| Qwen 3.5 35B-A3B (MoE) | Large model with fast inference via Mixture of Experts | 48 GB |
Other Available Models
| Model | Best For | Min RAM |
|---|---|---|
| OpenAI GPT OSS 20B | OpenAI's open-source model, strong general performance | 32 GB |
| Ministral 3B | Lightweight, fast on 16 GB Macs | 16 GB |
| Ministral 8B | Mistral's compact model, good quality | 32 GB |
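The two tables above are effectively a lookup from your Mac's RAM to the models that will fit. As an illustrative sketch only (the model names and minimum-RAM figures come from the tables; the helper function itself is hypothetical and not part of VivaDicta):

```python
# Hypothetical helper: list which models fit a given amount of RAM.
# Names and minimum-RAM figures are taken from the tables above.
MODELS = [
    ("Qwen 3.5 4B", 16),
    ("Qwen 3.5 9B", 32),
    ("Qwen 3.5 27B", 48),
    ("Qwen 3.5 35B-A3B (MoE)", 48),
    ("OpenAI GPT OSS 20B", 32),
    ("Ministral 3B", 16),
    ("Ministral 8B", 32),
]

def models_for(ram_gb: int) -> list[str]:
    """Return the models whose minimum RAM requirement fits the machine."""
    return [name for name, min_ram in MODELS if ram_gb >= min_ram]

print(models_for(16))  # a 16 GB Mac gets only the small models
```

VivaDicta's model browser applies the same idea automatically by grouping models into RAM tiers.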
Quantization Variants
Most models offer three quantization levels — choose based on your available RAM and quality needs:
- Base (4-bit) — smallest download, fastest inference, slightly lower quality. Best if RAM is tight.
- Med (6-bit) — balanced quality and speed. Recommended for most users.
- High (8-bit) — best quality, closest to the original model. Requires more RAM.
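The trade-off between the three variants is essentially bits per weight: fewer bits per parameter means a smaller download and lower memory use, at some cost in quality. A rough back-of-the-envelope estimate (a hypothetical sketch, not VivaDicta code, and it ignores tokenizer files and quantization overhead):

```python
def approx_weight_size_gb(params_billion: float, bits: int) -> float:
    """Rough quantized weight size: parameters x bits per weight / 8 bits per byte.
    Ignores tokenizer files, metadata, and quantization overhead."""
    return params_billion * bits / 8

# The same 9B model at the three quantization levels:
for bits in (4, 6, 8):
    print(f"{bits}-bit: ~{approx_weight_size_gb(9, bits):.1f} GB")
```

This is why the 4-bit variant is roughly half the size of the 8-bit one for the same model.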
Getting Started
1. Open VivaDicta settings → AI Provider → Apple MLX.
2. Browse models organized by your Mac's RAM tier (16 GB / 32 GB / 48+ GB). A Recommended badge highlights the best model for your hardware.
3. Click Download — models are downloaded once and stored locally.
4. Select the downloaded model as your active AI provider.
5. Done — all AI processing now runs locally on your Mac.
Managing Models
- The model browser shows your Mac's memory, storage used by models, available disk space, and download count.
- Delete models you no longer need to free up disk space.
- Switch between downloaded models anytime — no re-download needed.
System Requirements
- Apple Silicon Mac (M1 or later) — required for MLX.
- RAM — varies by model (see tables above). The model browser shows which models fit your Mac.
- Disk space — models range from ~2 GB to ~20 GB depending on size and quantization.
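Before downloading a larger model, it can help to check free disk space against the model's download size. A minimal sketch using the standard library (the 5 GB safety margin is an arbitrary illustrative choice, not a VivaDicta setting):

```python
import shutil

def fits_on_disk(model_size_gb: float, path: str = "/") -> bool:
    """Check whether a download of the given size fits in free disk space,
    keeping a 5 GB safety margin (margin chosen for illustration only)."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return model_size_gb + 5 <= free_gb

print(fits_on_disk(2))   # small model, e.g. a ~2 GB 4-bit download
print(fits_on_disk(20))  # large model, e.g. a ~20 GB download
```

VivaDicta's model browser shows available disk space directly, so this check is only needed if you manage space yourself.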
MLX vs Cloud AI
| Feature | MLX (Local) | Cloud (Claude, GPT, etc.) |
|---|---|---|
| Cost | Free forever | API fees or subscription |
| Privacy | 100% on-device | Text sent to provider |
| Internet | Not needed (after download) | Required |
| Quality | Good to excellent (model-dependent) | Excellent |
| Speed | Depends on Mac hardware | Fast (server-side) |
For a comparison with Apple Intelligence and Ollama, see Local AI Processing.