Experimental Pattern
This is an emerging architecture pattern. Example apps coming soon as we explore this approach.
Edge-Native Architecture
AI on the device—instant, private, offline
What is it?
AI models run directly on user devices (phones, laptops, IoT) using on-device runtimes and APIs such as WebGPU, Core ML, and ONNX Runtime. Inference needs no network calls and no API keys; everything runs locally on the device.
This pattern removes network latency from every interaction and keeps user data on the device. Because prompts and outputs never leave the device, it is well suited to sensitive applications, offline scenarios, and environments where connectivity is unreliable.
💡 Key Insight
"The fastest network request is the one you never make. Edge-native AI trades model size for zero latency and absolute privacy."
Tradeoffs
Advantages
- No network latency - inference needs no round-trip to a server
- Absolute privacy - data never leaves device
- Works offline completely
- No API costs after model download
- Easier compliance for regulated data (HIPAA, GDPR)
Disadvantages
- Model size constraints (storage, memory)
- Limited to smaller models (typically 1-7B parameters)
- Device capability variance (older devices struggle)
- Initial download size can be large
- Model updates require new downloads
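To make the size constraints concrete, here is a rough back-of-the-envelope sketch. It assumes weight storage is simply parameter count times bits per weight, and it ignores runtime overhead such as the KV cache, activations, and tokenizer files, so real footprints will be larger. The function names are illustrative, not part of any library.

```typescript
// Rough size of a quantized model's weights in bytes.
// Assumption: every parameter is stored at the same bit width.
function estimateWeightBytes(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8;
}

// Convert bytes to decimal gigabytes (1 GB = 1e9 bytes).
function toGigabytes(bytes: number): number {
  return bytes / 1e9;
}

// A 7B-parameter model at 4-bit quantization: about 3.5 GB of weights.
const sevenB4bit = toGigabytes(estimateWeightBytes(7e9, 4));

// A 1B-parameter model at 8-bit quantization: about 1 GB of weights.
const oneB8bit = toGigabytes(estimateWeightBytes(1e9, 8));

console.log(sevenB4bit, oneB8bit);
```

Even at the small end of the 1-7B range, the weights alone can reach the gigabyte scale, which is why storage, memory, and initial download size all appear in the list above.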
Technical Deep Dive
Architecture
Edge-native AI leverages device-specific inference engines to run quantized models locally. Modern browsers and mobile platforms now support hardware-accelerated AI inference.
- Models: Quantized/compressed LLMs (1-7B parameters, 4-bit or 8-bit)
- Inference: WebGPU, WebNN, TensorFlow Lite, Core ML, ONNX Runtime
- State: Local IndexedDB, SQLite on device
- Sync: Optional background sync when online (for updates only)
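The components above could be wired together roughly as follows. This is a minimal sketch under stated assumptions: the `DeviceCapabilities` flags, the `Backend` names, the preference order, and the `ModelManifest` shape are all hypothetical. In a browser, the capability flags would come from feature detection (for example, checking whether `navigator.gpu` is defined for WebGPU).

```typescript
// Hypothetical capability flags, filled in by feature detection elsewhere.
interface DeviceCapabilities {
  hasWebGPU: boolean;
  hasWebNN: boolean;
}

// Illustrative backend names; "wasm-cpu" assumes a CPU fallback is available.
type Backend = "webgpu" | "webnn" | "wasm-cpu";

// Prefer hardware-accelerated backends and fall back to CPU.
// The preference order here is an assumption, not a recommendation.
function pickBackend(caps: DeviceCapabilities): Backend {
  if (caps.hasWebGPU) return "webgpu";
  if (caps.hasWebNN) return "webnn";
  return "wasm-cpu";
}

// Hypothetical manifest describing a model stored locally or offered remotely.
interface ModelManifest {
  name: string;
  version: number;
}

// Sync is for updates only: inference never waits on the network.
// Returns true only when online and the remote model is newer or none is cached.
function shouldFetchUpdate(
  local: ModelManifest | null,
  remote: ModelManifest | null,
  online: boolean
): boolean {
  if (!online || remote === null) return false;
  return local === null || remote.version > local.version;
}
```

Keeping backend selection and update checks as pure functions makes them easy to test without a device, while the actual model loading and storage stay behind whichever runtime the app uses.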
When to Use This Pattern
- ✓ Privacy is non-negotiable (healthcare, finance, legal)
- ✓ Offline capability is critical (travel, remote work)
- ✓ Latency must be under 100ms
- ✓ User base willing to download larger apps (100MB+)
- ✓ Compliance with data residency requirements
When NOT to Use This Pattern
- ✗ Need latest/largest models (GPT-4, Claude-3 class)
- ✗ Frequent model updates required
- ✗ Users on low-end devices or limited storage
- ✗ Need server-side tool use or API integrations
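One way to apply the exclusion criteria during design review is to encode them as a small checklist. The sketch below is illustrative only: the `Requirements` fields and the `edgeNativeBlockers` function are hypothetical names that mirror the four bullets above.

```typescript
// Hypothetical project requirements, one flag per exclusion criterion.
interface Requirements {
  needsFrontierModel: boolean;        // latest/largest models
  needsFrequentModelUpdates: boolean; // frequent model updates
  targetsLowEndDevices: boolean;      // low-end devices or limited storage
  needsServerSideTools: boolean;      // server-side tool use or API integrations
}

// Returns the reasons this pattern is a poor fit; an empty list means no blockers.
function edgeNativeBlockers(req: Requirements): string[] {
  const blockers: string[] = [];
  if (req.needsFrontierModel) blockers.push("needs latest/largest models");
  if (req.needsFrequentModelUpdates) blockers.push("needs frequent model updates");
  if (req.targetsLowEndDevices) blockers.push("targets low-end devices or limited storage");
  if (req.needsServerSideTools) blockers.push("needs server-side tools or API integrations");
  return blockers;
}

// Example: an offline notes app with no server dependencies has no blockers.
const notesApp = edgeNativeBlockers({
  needsFrontierModel: false,
  needsFrequentModelUpdates: false,
  targetsLowEndDevices: false,
  needsServerSideTools: false,
});
console.log(notesApp.length === 0 ? "edge-native is viable" : notesApp.join("; "));
```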
Example App Concepts
NotesWithOtto
Coming Soon: Local note-taking with AI suggestions (offline-first)
TranslateWithOtto
Coming Soon: Real-time translation using on-device models
HealthWithOtto
Coming Soon: Medical symptom checker with complete privacy