## Minimum Requirements
| Component | Minimum | Recommended |
|---|---|---|
| CPU | Any modern x86_64 | AMD Ryzen 7 / Intel i7 |
| RAM | 16 GB | 24 GB |
| GPU | Vulkan-capable iGPU | Dedicated GPU (6+ GB VRAM) |
| Disk | 10 GB free | 20 GB (SSD recommended) |
| OS | Ubuntu 24.04+ | Ubuntu 24.04 LTS |
| Python | 3.12+ | 3.12 |
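A quick way to check a machine against these minimums is a small preflight script. This is a sketch: the checked mount point (`/`) is an assumption — point it at wherever you plan to install.

```shell
#!/bin/sh
# Preflight check against the minimums above. The mount point (/) is an
# illustrative assumption -- change it to your install location.

ram_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
py_ok=$(python3 -c 'import sys; print(int(sys.version_info >= (3, 12)))' 2>/dev/null || echo 0)

echo "RAM:       ${ram_gb} GB (minimum 16)"
echo "Free disk: ${disk_gb} GB on / (minimum 10)"
echo "Python:    ${py_ok} (1 = 3.12+)"
```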
## What Runs Where
| Component | Runs On | Resource Usage |
|---|---|---|
| LLM (Qwen 2.5 3B) | GPU (Vulkan) | 2.8 GB VRAM |
| All neural networks | CPU (PyTorch) | ~200 MB RAM total |
| 25 SQLite databases | Disk (SSD helps) | ~500 MB |
| 25+ systemd services | CPU | <5% idle, 30-50% active |
| Web UI (Three.js swarm) | Browser GPU | Minimal |
## Tested Hardware
**Development Machine** (what Frank runs on daily):
- AMD Ryzen 7 8845HS (Phoenix1)
- 16 GB DDR5
- AMD Radeon 780M iGPU (Vulkan)
- 512 GB NVMe SSD
- Ubuntu 24.04 LTS
**Performance:** 168.9 tok/s prompt processing, 12.7 tok/s generation, ~4-6 GB total RAM usage.
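Throughput numbers like these can be reproduced with llama.cpp's bundled `llama-bench` tool (`-p` sets the prompt-processing batch size, `-n` the number of generated tokens). The model filename below is an assumption — substitute the path to your Qwen 2.5 3B GGUF:

```shell
# Reproduce prompt/generation throughput with llama.cpp's llama-bench.
# The model path is a placeholder -- point it at your Qwen 2.5 3B GGUF.
MODEL="qwen2.5-3b-instruct-q4_k_m.gguf"

if command -v llama-bench >/dev/null 2>&1; then
    # -p: prompt-processing benchmark size; -n: tokens to generate
    llama-bench -m "$MODEL" -p 512 -n 128
else
    echo "llama-bench not on PATH; build llama.cpp first"
fi
```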
## GPU Options
- Integrated GPU (AMD iGPU): Works. ~12 tok/s generation. Free.
- NVIDIA (RTX 4060 or newer): Excellent. 80-100+ tok/s. CUDA or Vulkan backend.
- AMD discrete: Good. Vulkan backend. 50-80 tok/s estimated.
- CPU only: Possible. 6 tok/s. Functional but slow.
llama.cpp handles all of these backends; no driver configuration is needed beyond the standard Vulkan/CUDA setup.
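To confirm which backend llama.cpp will actually see, a simple check is to probe for the standard GPU tooling. This assumes `vulkaninfo` (from the `vulkan-tools` package on Ubuntu) or NVIDIA's `nvidia-smi`; package names vary by distro:

```shell
# Report the visible GPU, falling back through Vulkan -> CUDA -> CPU.
if command -v vulkaninfo >/dev/null 2>&1; then
    vulkaninfo --summary 2>/dev/null | grep -i deviceName \
        || echo "vulkaninfo present but reported no device"
elif command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
    echo "No GPU tooling found; llama.cpp will run CPU-only"
fi
```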