We had a rule: every platform gets the same inference behavior. Not “similar.” Not “equivalent.” The same.
Same model. Same preprocessing. Same output bytes. Whether you’re on an iPhone, an Android tablet, a Unity game, or a terminal.
The only way to guarantee that? One implementation, many bindings.
Here’s how we ship Xybrid to five platform SDKs from a single Rust core — and the FFI playbook we developed along the way.
The Problem
You want to add TTS to your app. You find a great model. Now:
- iOS team writes inference code in Swift
- Android team writes it in Kotlin
- Flutter team writes it in Dart
- Game team writes it in C#
- Backend team writes it in… Python? Rust? Node?
Five implementations. Five sets of bugs. Five preprocessing behaviors that should be identical but inevitably drift. Someone hard-codes a sample rate. Someone normalizes differently. Users get different audio quality on different platforms.
This is the N-platform problem. And FFI is the solution.
The Architecture
┌──────────────────────────────────────────┐
│ Platform SDKs │
│ │
│ ┌─────────┐ ┌───────┐ ┌────────┐ │
│ │ Flutter │ │ Swift │ │ Kotlin │ │
│ │ (Dart) │ │ │ │ │ │
│ └────┬─────┘ └───┬───┘ └───┬────┘ │
│ │ FRB │ UniFFI │ UniFFI │
│ ┌────┴───┐ ┌────┴────────┴────┐ │
│ │ Unity │ │ │ │
│ │ (C#) │ │ │ │
│ └────┬────┘ │ xybrid-sdk │ │
│ │ C FFI │ xybrid-core │ │
│ │ │ (Rust) │ │
│ └───────┤ │ │
│ └─────────────────┘ │
└──────────────────────────────────────────┘
One Rust core. Three FFI strategies. Five platform SDKs.
Strategy 1: flutter_rust_bridge (Flutter)
flutter_rust_bridge (FRB) auto-generates Dart bindings from Rust code. It’s the most sophisticated of our three FFI tools.
What It Handles
- Async functions → Dart Futures
- Streaming callbacks → Dart Streams
- Complex types → Dart classes
- Error types → Dart exceptions
- Memory management → automatic
The Rust API
// This is the FRB-visible API
pub fn run_model(model_id: String, input: ApiEnvelope) -> Result<ApiEnvelope> {
    let model = load_model(&model_id)?;
    let output = model.execute(&input.into())?;
    Ok(output.into())
}

pub fn run_streaming(
    model_id: String,
    input: ApiEnvelope,
    sink: StreamSink<ApiPartialToken>,
) -> Result<ApiEnvelope> {
    // Forward each partial token to the Dart stream; the bool reports whether the send succeeded
    let callback = move |token: PartialToken| {
        sink.add(token.into()).is_ok()
    };
    // ...
}
The Generated Dart API
// Auto-generated; we don't write this
Future<ApiEnvelope> runModel({
  required String modelId,
  required ApiEnvelope input,
}) async { ... }

Stream<ApiPartialToken> runStreaming({
  required String modelId,
  required ApiEnvelope input,
}) { ... }
FRB handles the entire async dance: running the Rust code off the main Dart isolate, marshalling data across FFI, and surfacing results on the Dart side.
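The callback in run_streaming can be tried out without FRB. The sketch below is a minimal stand-in, assuming a channel-backed sink: `ChannelSink` and `generate_tokens` are hypothetical names for illustration, not part of FRB or Xybrid. It shows why the callback returns a bool: once the receiving side is gone, `add` fails and the generator can stop early instead of producing tokens nobody reads.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Stand-in for FRB's `StreamSink` (hypothetical type for illustration).
/// It wraps the sending half of a standard library channel.
pub struct ChannelSink<T> {
    tx: Sender<T>,
}

impl<T> ChannelSink<T> {
    /// Creates a sink together with the receiver that plays the Dart stream's role.
    pub fn pair() -> (Self, Receiver<T>) {
        let (tx, rx) = channel();
        (Self { tx }, rx)
    }

    /// Returns `Err` (handing the item back) once the receiver has been dropped.
    pub fn add(&self, item: T) -> Result<(), T> {
        self.tx.send(item).map_err(|e| e.0)
    }
}

/// Hypothetical token generator: calls `on_token` for each token and stops
/// as soon as the callback returns `false`. Returns the number of tokens delivered.
pub fn generate_tokens(words: &[&str], mut on_token: impl FnMut(String) -> bool) -> usize {
    let mut delivered = 0;
    for word in words {
        if !on_token(word.to_string()) {
            break;
        }
        delivered += 1;
    }
    delivered
}
```

Calling `generate_tokens(&["Hello", "world"], |t| sink.add(t).is_ok())` mirrors the closure in run_streaming: every token goes through the sink, and dropping the receiver ends generation on the next token.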
The High-Level Dart Wrapper
We add a thin, idiomatic Dart layer on top:
class Xybrid {
  static Future<void> init() async { ... }
  static XybridModelBuilder model({required String modelId}) => ...;
  static XybridPipelineBuilder pipeline({required String yamlContent}) => ...;
}

// Usage
final model = await Xybrid.model(modelId: 'kokoro-82m').load();
final result = await model.run(envelope: Envelope.text(text: 'Hello'));
Total Dart code: ~800 lines (wrapper + types). All inference logic is in Rust.
Strategy 2: UniFFI (Swift + Kotlin)
UniFFI (Mozilla’s tool) generates Swift and Kotlin bindings from a single interface definition, here a UDL file. It’s less magical than FRB but covers two platforms at once.
The UDL Definition
// xybrid.udl: defines the FFI surface
namespace xybrid {
    [Throws=XybridError]
    XybridModelHandle load_model(string model_id);
};

interface XybridModelHandle {
    [Throws=XybridError]
    XybridEnvelope run(XybridEnvelope input);
    sequence<XybridVoiceInfo> voices();
};

dictionary XybridEnvelope {
    string kind;
    bytes data;
};

[Error]
enum XybridError {
    "ModelNotFound",
    "InferenceFailed",
    "InvalidInput",
};
Generated Swift
// Auto-generated
let model = try loadModel(modelId: "kokoro-82m")
let result = try model.run(input: XybridEnvelope(kind: "text", data: textData))
let voices = model.voices() // [XybridVoiceInfo]
Generated Kotlin
// Auto-generated
val model = loadModel(modelId = "kokoro-82m")
val result = model.run(input = XybridEnvelope(kind = "text", data = textData))
val voices = model.voices() // List<XybridVoiceInfo>
One UDL file → two platform SDKs. The generated code is idiomatic: Swift gets optionals and throwing functions, Kotlin gets nullable types and exceptions.
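On the Rust side, each UDL item needs a matching Rust definition. The sketch below shows roughly what those definitions could look like in plain Rust, without the UniFFI scaffolding: `[Throws=XybridError]` corresponds to a `Result<_, XybridError>` return type, `string` and `bytes` map to `String` and `Vec<u8>`, and interface objects are handed out behind an `Arc`. The function bodies are placeholders (the empty-id check and the echo "inference" are assumptions for illustration only), and `voices()` is left out to keep the example short.

```rust
use std::fmt;
use std::sync::Arc;

/// Mirrors `[Error] enum XybridError` from the UDL.
#[derive(Debug, PartialEq)]
pub enum XybridError {
    ModelNotFound,
    InferenceFailed,
    InvalidInput,
}

impl fmt::Display for XybridError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let msg = match self {
            XybridError::ModelNotFound => "model not found",
            XybridError::InferenceFailed => "inference failed",
            XybridError::InvalidInput => "invalid input",
        };
        write!(f, "{}", msg)
    }
}

impl std::error::Error for XybridError {}

/// Mirrors `dictionary XybridEnvelope`.
#[derive(Debug, Clone, PartialEq)]
pub struct XybridEnvelope {
    pub kind: String,
    pub data: Vec<u8>,
}

/// Mirrors `interface XybridModelHandle`.
pub struct XybridModelHandle {
    pub model_id: String,
}

impl XybridModelHandle {
    /// Placeholder inference: accepts text and echoes the bytes back as "audio".
    pub fn run(&self, input: XybridEnvelope) -> Result<XybridEnvelope, XybridError> {
        if input.kind != "text" {
            return Err(XybridError::InvalidInput);
        }
        Ok(XybridEnvelope { kind: "audio".to_string(), data: input.data })
    }
}

/// Placeholder loader: treats an empty id as unknown (an assumption for this sketch).
pub fn load_model(model_id: String) -> Result<Arc<XybridModelHandle>, XybridError> {
    if model_id.is_empty() {
        return Err(XybridError::ModelNotFound);
    }
    Ok(Arc::new(XybridModelHandle { model_id }))
}
```

Each `Err` variant returned here is what surfaces as a thrown error in the generated Swift and as an exception in the generated Kotlin.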
Adding a Higher-Level Wrapper
Like Flutter, we wrap the raw bindings:
// Kotlin SDK wrapper
object XybridModelLoader {
    fun fromRegistry(modelId: String): XybridModel {
        val handle = loadModel(modelId = modelId)
        return XybridModel(handle)
    }
}

class XybridModel(private val handle: XybridModelHandle) {
    fun run(input: Envelope): Envelope = handle.run(input.toFFI()).fromFFI()
    fun voices(): List<VoiceInfo> = handle.voices().map { it.fromFFI() }
}
Total wrapper code per platform: ~150 lines.
Strategy 3: C FFI + cbindgen (Unity)
Unity’s C# interop goes through P/Invoke, which binds to a plain C ABI. No fancy codegen: just extern "C" functions and DllImport.
The Rust C API
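// xybrid-ffi/src/lib.rs
use std::ffi::CStr;
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn xybrid_load_model(
    model_id: *const c_char,
    out_handle: *mut *mut XybridHandle,
) -> i32 {
    // Reject null pointers instead of dereferencing them
    if model_id.is_null() || out_handle.is_null() {
        return -1;
    }
    // Report invalid UTF-8 as an error; a panic must not unwind across the C boundary
    let model_id = match unsafe { CStr::from_ptr(model_id) }.to_str() {
        Ok(id) => id,
        Err(_) => return -1,
    };
    match load_model(model_id) {
        Ok(handle) => {
            // Ownership moves to the caller, who must call xybrid_free_model
            unsafe { *out_handle = Box::into_raw(Box::new(handle)) };
            0 // success
        }
        Err(_) => -1, // error
    }
}

#[no_mangle]
pub extern "C" fn xybrid_free_model(handle: *mut XybridHandle) {
    if !handle.is_null() {
        unsafe { drop(Box::from_raw(handle)) };
    }
}
cbindgen generates the C header: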
// xybrid.h (auto-generated)
int32_t xybrid_load_model(const char *model_id, XybridHandle **out_handle);
void xybrid_free_model(XybridHandle *handle);
The C# Wrapper (Unity)
using System;
using System.Runtime.InteropServices;

public class Xybrid {
    [DllImport("xybrid_ffi")]
    private static extern int xybrid_load_model(string modelId, out IntPtr handle);

    [DllImport("xybrid_ffi")]
    private static extern void xybrid_free_model(IntPtr handle);

    public static XybridModel LoadModel(string modelId) {
        IntPtr handle;
        int result = xybrid_load_model(modelId, out handle);
        if (result != 0) throw new Exception("Failed to load model");
        return new XybridModel(handle);
    }
}
More boilerplate than UniFFI, but it’s the only option for Unity. The total C# wrapper is ~300 lines.
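Because a C caller only sees an integer, the Rust side of this boundary has to turn every failure, including a panic, into a status code. The following is one possible hardening sketch, not the Xybrid implementation: the `XYBRID_ERR_*` constants, the `_checked` suffix, and the placeholder `load_model` are all assumptions made for illustration. It uses `std::panic::catch_unwind` so a panic inside the core is reported as a code rather than unwinding into C#.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::panic::catch_unwind;

// Hypothetical status codes; the entry points above only return 0 or -1.
pub const XYBRID_OK: i32 = 0;
pub const XYBRID_ERR_NULL_ARG: i32 = -1;
pub const XYBRID_ERR_INVALID_UTF8: i32 = -2;
pub const XYBRID_ERR_LOAD_FAILED: i32 = -3;
pub const XYBRID_ERR_PANIC: i32 = -4;

/// Placeholder for the real handle type.
pub struct XybridHandle {
    pub model_id: String,
}

/// Placeholder loader standing in for the core's `load_model`.
fn load_model(model_id: &str) -> Result<XybridHandle, String> {
    if model_id.is_empty() {
        return Err("empty model id".to_string());
    }
    Ok(XybridHandle { model_id: model_id.to_string() })
}

// `#[unsafe(no_mangle)]` is the Rust 1.82+ spelling of `#[no_mangle]`.
#[unsafe(no_mangle)]
pub extern "C" fn xybrid_load_model_checked(
    model_id: *const c_char,
    out_handle: *mut *mut XybridHandle,
) -> i32 {
    if model_id.is_null() || out_handle.is_null() {
        return XYBRID_ERR_NULL_ARG;
    }
    // Run the fallible work inside catch_unwind so a panic becomes a status code.
    let result = catch_unwind(|| {
        // SAFETY: the caller promises a valid, NUL-terminated string.
        let id = match unsafe { CStr::from_ptr(model_id) }.to_str() {
            Ok(s) => s,
            Err(_) => return Err(XYBRID_ERR_INVALID_UTF8),
        };
        load_model(id).map_err(|_| XYBRID_ERR_LOAD_FAILED)
    });
    match result {
        Ok(Ok(handle)) => {
            // SAFETY: out_handle was checked for null above.
            unsafe { *out_handle = Box::into_raw(Box::new(handle)) };
            XYBRID_OK
        }
        Ok(Err(code)) => code,
        Err(_) => XYBRID_ERR_PANIC,
    }
}

#[unsafe(no_mangle)]
pub extern "C" fn xybrid_free_model(handle: *mut XybridHandle) {
    if !handle.is_null() {
        // SAFETY: the pointer came from Box::into_raw in the load function.
        unsafe { drop(Box::from_raw(handle)) };
    }
}
```

With distinct codes, the C# wrapper could map each value to a specific exception type instead of a generic "Failed to load model".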
The FFI Decision Matrix
| | FRB | UniFFI | C FFI |
|---|---|---|---|
| Languages | Dart | Swift + Kotlin | Any (C ABI) |
| Async support | Native (Futures, Streams) | Manual | Manual |
| Type richness | High (enums, generics) | Medium (UDL types) | Low (C types) |
| Code generation | Automatic from Rust | From UDL definition | From Rust (cbindgen) |
| Memory management | Automatic | Automatic | Manual (alloc/free) |
| Best for | Flutter | Mobile native SDKs | Game engines, C interop |
If we were starting over and only needed one tool, UniFFI covers the most ground. But FRB’s async/streaming support is essential for Flutter, and C FFI is the only option for Unity.
Lessons Learned
1. Keep the FFI Surface Small
Every function in the FFI layer is a maintenance point across all platforms. We minimize it:
FFI surface:
- init()
- load_model(id) → handle
- run(handle, input) → output
- run_streaming(handle, input) → stream
- voices(handle) → list
- warmup(handle)
- free(handle)
These seven functions serve the entire SDK. All complexity is inside Rust.
2. Use Wrapper Types, Not Domain Types
Don’t expose internal types across FFI. Use flat, serializable wrapper types:
// Internal (rich, nested)
pub struct Envelope {
    pub kind: EnvelopeKind,
    pub metadata: HashMap<String, String>,
}

// FFI (flat, simple)
pub struct ApiEnvelope {
    pub kind: String,   // "text", "audio", "embedding"
    pub data: Vec<u8>,  // serialized payload
}
Conversion happens at the boundary. Internal types can evolve without breaking FFI.
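The boundary conversion can be sketched with the standard `From` and `TryFrom` traits. The variants of `EnvelopeKind` below are assumptions (only the `kind: EnvelopeKind` field appears above), and the sketch makes the inbound direction fallible with a plain `String` error, since data arriving over FFI may carry an unknown kind or invalid bytes. Metadata is dropped on the way out because the flat type shown above has no field for it.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Internal payload kinds. These variants are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum EnvelopeKind {
    Text(String),
    Audio(Vec<u8>),
}

/// Internal (rich) type.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub kind: EnvelopeKind,
    pub metadata: HashMap<String, String>,
}

/// FFI (flat) type.
#[derive(Debug, Clone, PartialEq)]
pub struct ApiEnvelope {
    pub kind: String,
    pub data: Vec<u8>,
}

impl From<Envelope> for ApiEnvelope {
    /// Outbound: flatten the enum into a tag string plus raw bytes.
    fn from(env: Envelope) -> Self {
        match env.kind {
            EnvelopeKind::Text(text) => ApiEnvelope {
                kind: "text".to_string(),
                data: text.into_bytes(),
            },
            EnvelopeKind::Audio(bytes) => ApiEnvelope {
                kind: "audio".to_string(),
                data: bytes,
            },
        }
    }
}

impl TryFrom<ApiEnvelope> for Envelope {
    type Error = String;

    /// Inbound: validate the tag and payload before building the internal type.
    fn try_from(api: ApiEnvelope) -> Result<Self, Self::Error> {
        let kind = match api.kind.as_str() {
            "text" => {
                let text = String::from_utf8(api.data)
                    .map_err(|e| format!("invalid UTF-8 in text payload: {}", e))?;
                EnvelopeKind::Text(text)
            }
            "audio" => EnvelopeKind::Audio(api.data),
            other => return Err(format!("unknown envelope kind: {}", other)),
        };
        Ok(Envelope { kind, metadata: HashMap::new() })
    }
}
```

Adding a new internal variant then means touching only these two impls; the FFI struct and every generated binding stay unchanged.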
3. Test Across Platforms in CI
We run platform-specific CI for each binding:
# .github/workflows/
build-flutter.yml # Flutter build + Dart analysis
build-apple.yml # XCFramework + Swift package
build-android.yml # AAR + Kotlin compilation + ORT verification
build-unity.yml # Native lib + C header generation
A Rust change that breaks any platform is caught before merge.
4. API Contract as Source of Truth
We maintain api-surface.yaml that defines the public API across all SDKs:
methods:
  - name: init
    platforms: [flutter, swift, kotlin, unity, cli]
    status: stable
  - name: model.run
    platforms: [flutter, swift, kotlin, unity, cli]
    args:
      - name: envelope
        type: Envelope
      - name: voiceId
        type: string?
      - name: generationConfig
        type: GenerationConfig?
    status: stable
SDK implementations are validated against this contract. No platform can drift.
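How the validation works is not shown here, so the following is only a guess at its core check: for every method in the contract, each listed platform must expose it. The sketch keeps the contract in memory rather than parsing YAML, and `ContractMethod` and `find_drift` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One contract entry: a method name and the platforms that must expose it.
pub struct ContractMethod {
    pub name: &'static str,
    pub platforms: &'static [&'static str],
}

/// Returns every (platform, method) pair that the contract requires but the
/// SDK does not expose. `implemented` maps a platform name to the set of
/// method names found in that platform's SDK.
pub fn find_drift(
    contract: &[ContractMethod],
    implemented: &BTreeMap<String, BTreeSet<String>>,
) -> Vec<(String, String)> {
    let mut missing = Vec::new();
    for method in contract {
        for platform in method.platforms {
            // A platform absent from the map counts as exposing nothing.
            let has_it = implemented
                .get(*platform)
                .map_or(false, |methods| methods.contains(method.name));
            if !has_it {
                missing.push((platform.to_string(), method.name.to_string()));
            }
        }
    }
    missing
}
```

A CI step could fail the build whenever `find_drift` returns a non-empty list, which is one way to make drift visible before merge.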
The Payoff
From one Rust codebase:
| Platform | SDK | Binding Lines | Full Feature Parity |
|---|---|---|---|
| Flutter | xybrid_flutter | ~800 | Yes |
| Swift | Xybrid (SPM) | ~400 | Yes |
| Kotlin | xybrid-kotlin (Maven) | ~400 | Yes |
| Unity | Xybrid.cs | ~300 | Yes |
| CLI | xybrid-cli | ~500 | Yes |
~2,400 lines of binding code for five platforms. The Rust core is ~15,000 lines. That’s roughly a 6:1 ratio of shared to platform-specific code (15,000 / 2,400 ≈ 6.25).
One model, one metadata file, one preprocessing pipeline, one set of tests — five platforms that behave identically.
Explore the SDKs: github.com/xybrid-ai/xybrid
Building cross-platform AI features? What’s your FFI strategy? Share in the comments.