
Run AI Models On-Device — Zero Config, Five Minutes

CLI, Rust, Flutter, Swift, Kotlin, Unity — run 25+ ML models on-device with one command. No tensor shapes, no preprocessing scripts.

Glenn Sonna
3 min read
Tags: on-device-ai, run-ml-locally, rust-ml, edge-inference, open-source-ai

You already know why on-device AI matters. Privacy, latency, cost. You’ve read the guides.

Now you want to actually do it. Here’s what that looks like with Xybrid — no tensor shapes, no preprocessing scripts, no ML expertise.


Install

cargo install xybrid-cli
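
`cargo install` places binaries in Cargo's bin directory (`~/.cargo/bin` by default), so the command is only usable if that directory is on your `PATH`. Here is a small sanity check, assuming the installed binary is named `xybrid` as in the commands in this post. The helper names `have` and `check_xybrid_setup` are made up for illustration:

```shell
# Report whether a command is reachable from the current shell.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Check the two tools this walkthrough relies on.
check_xybrid_setup() {
  if ! have cargo; then
    echo "cargo not found: install the Rust toolchain first"
    return 1
  fi
  if ! have xybrid; then
    # cargo installs binaries into $CARGO_HOME/bin (~/.cargo/bin by default)
    echo "xybrid not found: add ${CARGO_HOME:-$HOME/.cargo}/bin to PATH"
    return 1
  fi
  echo "ok"
}

# Usage: check_xybrid_setup
```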

Text-to-Speech

xybrid run --model kokoro-82m --input "Hello from the edge" --output hello.wav

That’s it. Xybrid resolved the model from the registry, downloaded it, ran inference, and saved a WAV file. You configured nothing.

Kokoro is an 82M-parameter TTS model with 24 voices. The first run downloads about 80MB and caches it locally; subsequent runs skip the download and load the model from the cache.
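
Because only the first call pays the download cost, the command is easy to script. Here is a minimal sketch of batch synthesis that uses only the flags shown above (`--model`, `--input`, `--output`). The function name `tts_batch` and the `line-001.wav` naming scheme are assumptions of this example, not part of Xybrid:

```shell
# Synthesize one WAV file per non-empty line of a text file.
# Prints the number of files written.
tts_batch() {
  script_file=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  n=0
  # The `|| [ -n "$line" ]` part keeps a final line that lacks a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue          # skip blank lines
    n=$((n + 1))
    out=$(printf '%s/line-%03d.wav' "$out_dir" "$n")
    # </dev/null stops the command from consuming the loop's input.
    xybrid run --model kokoro-82m --input "$line" --output "$out" </dev/null || return 1
  done < "$script_file"
  echo "$n"
}

# Usage: tts_batch lines.txt out/
```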

Speech Recognition

xybrid run --model whisper-tiny --input recording.wav

Whisper Tiny transcribes audio in real time on any modern laptop and outputs plain text.
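
Plain-text output makes the command easy to redirect. The sketch below transcribes every `.wav` file in a directory into a matching `.txt` file. It assumes the transcript is written to standard output with nothing else mixed in, which this post does not spell out; `transcribe_dir` is a hypothetical helper name:

```shell
# Transcribe every .wav in a directory to a sibling .txt file.
# Prints the path of each transcript it writes.
# Assumption: the transcript is the only thing written to standard output.
transcribe_dir() {
  dir=$1
  for wav in "$dir"/*.wav; do
    [ -e "$wav" ] || continue           # no matches: the glob stays literal
    txt="${wav%.wav}.txt"
    xybrid run --model whisper-tiny --input "$wav" > "$txt" || return 1
    echo "$txt"
  done
}

# Usage: transcribe_dir recordings/
```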

Text Generation

xybrid run --model qwen3.5-0.8b --input "Explain quantum computing in one sentence"

Qwen 3.5 0.8B runs locally via llama.cpp. It supports 201 languages and fits in 500MB when quantized.
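
The three commands so far can be chained into a speech-in, speech-out loop: transcribe a recording, generate a reply, then synthesize it. This is a rough sketch rather than a documented Xybrid workflow. It assumes that `whisper-tiny` and `qwen3.5-0.8b` print only their result to standard output, and `voice_reply` is a hypothetical helper name:

```shell
# Transcribe a question, generate an answer, and speak it.
# Assumption: each text-producing model prints only its result to stdout.
voice_reply() {
  in_wav=$1
  out_wav=$2
  question=$(xybrid run --model whisper-tiny --input "$in_wav") || return 1
  if [ -z "$question" ]; then
    echo "empty transcript" >&2
    return 1
  fi
  answer=$(xybrid run --model qwen3.5-0.8b --input "$question") || return 1
  xybrid run --model kokoro-82m --input "$answer" --output "$out_wav"
}

# Usage: voice_reply question.wav answer.wav
```

Each step starts a separate process and loads its model again, so this is suited to evaluation rather than to an interactive app.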

Browse the Registry

xybrid models list

The registry lists 25+ models, all hosted on Hugging Face, downloaded on demand, and cached locally:

| Model | Task | Size (parameters) | Notes |
|---|---|---|---|
| kokoro-82m | Text-to-Speech | 82M | 24 voices, high quality |
| kitten-tts-nano-0.8 | Text-to-Speech | 15M | Ultra-lightweight |
| qwen3-tts-0.6b | Text-to-Speech | 600M | Multilingual |
| whisper-tiny | Speech Recognition | 39M | Real-time, multilingual |
| wav2vec2-base-960h | Speech Recognition | 95M | CTC-based |
| lfm2.5-350m | Text Generation | 354M | 9 languages, edge-optimized |
| smollm2-360m | Text Generation | 360M | Best tiny LLM |
| qwen3.5-0.8b | Text Generation | 800M | 201 languages |
| gemma-4-e2b | Text Generation | 5.1B | Multimodal |
| mistral-7b | Text Generation | 7B | Function calling |
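
In a script, it can help to confirm that a model ID appears in the registry before calling `xybrid run`. The sketch below assumes that `xybrid models list` prints each model ID as plain text somewhere in its output; the exact output format is not described in this post, so treat the matching as approximate. `model_available` is a hypothetical helper name:

```shell
# Succeed if the given model ID appears in the registry listing.
# Assumption: `xybrid models list` prints model IDs as plain text.
model_available() {
  # -F: match the ID literally; -w: require word boundaries around it.
  xybrid models list | grep -F -w -- "$1" >/dev/null
}

# Usage:
#   if model_available kokoro-82m; then
#     xybrid run --model kokoro-82m --input "Hello" --output hello.wav
#   fi
```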

Beyond the CLI

The CLI is the fastest way to evaluate. When you’re ready to integrate into an app, Xybrid has SDKs for Flutter, Swift, Kotlin, Unity, and Rust — same models, same behavior, every platform.


Xybrid is in beta (v0.1.0-beta9), open-source under Apache 2.0.

GitHub: github.com/xybrid-ai/xybrid

