
One Codebase, Five Platforms: Shipping ML to Flutter, Swift, Kotlin, Unity & CLI

How UniFFI, flutter_rust_bridge, and C FFI let us write ML inference once in Rust and bind it to five platform SDKs. The cross-platform FFI playbook.

Glenn Sonna

· June 5, 2026 · 8 min read

rust · flutter · kotlin · gamedev

We had a rule: every platform gets the same inference behavior. Not “similar.” Not “equivalent.” The same.

Same model. Same preprocessing. Same output bytes. Whether you’re on an iPhone, an Android tablet, a Unity game, or a terminal.

The only way to guarantee that? One implementation, many bindings.

Here’s how we ship Xybrid to five platform SDKs from a single Rust core — and the FFI playbook we developed along the way.

The Problem

You want to add TTS to your app. You find a great model. Now:

iOS team writes inference code in Swift
Android team writes it in Kotlin
Flutter team writes it in Dart
Game team writes it in C#
Backend team writes it in… Python? Rust? Node?

Five implementations. Five sets of bugs. Five preprocessing behaviors that should be identical but inevitably drift. Someone hard-codes a sample rate. Someone normalizes differently. Users get different audio quality on different platforms.

This is the N-platform problem. And FFI is the solution.

The Architecture

┌───────────────────────────────────────────────┐
│                 Platform SDKs                 │
│                                               │
│  ┌─────────┐ ┌───────┐ ┌────────┐ ┌───────┐   │
│  │ Flutter │ │ Swift │ │ Kotlin │ │ Unity │   │
│  │ (Dart)  │ │       │ │        │ │ (C#)  │   │
│  └────┬────┘ └───┬───┘ └───┬────┘ └───┬───┘   │
│       │ FRB      │ UniFFI  │ UniFFI   │ C FFI │
│  ┌────┴──────────┴─────────┴──────────┴───┐   │
│  │               xybrid-sdk               │   │
│  │              xybrid-core               │   │
│  │                 (Rust)                 │   │
│  └────────────────────────────────────────┘   │
└───────────────────────────────────────────────┘

One Rust core. Three FFI strategies. Five platform SDKs.

Strategy 1: flutter_rust_bridge (Flutter)

flutter_rust_bridge (FRB) auto-generates Dart bindings from Rust code. It’s the most sophisticated of our three FFI tools.

What It Handles

Async functions → Dart Futures
Streaming callbacks → Dart Streams
Complex types → Dart classes
Error types → Dart exceptions
Memory management → automatic

The Rust API

// This is the FRB-visible API
pub fn run_model(model_id: String, input: ApiEnvelope) -> Result<ApiEnvelope> {
    let model = load_model(&model_id)?;
    let output = model.execute(&input.into())?;
    Ok(output.into())
}

pub fn run_streaming(
    model_id: String,
    input: ApiEnvelope,
    sink: StreamSink<ApiPartialToken>,
) -> Result<ApiEnvelope> {
    let callback = move |token: PartialToken| {
        sink.add(token.into()).is_ok()
    };
    // ...
}
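The callback in run_streaming returns a bool, which lets the engine stop generating once the Dart side stops listening. A minimal sketch of that pattern without FRB, assuming a hypothetical PartialToken type and a stream_tokens helper that are not part of the Xybrid API:

```rust
/// Hypothetical stand-in for the engine's partial-token type.
#[derive(Debug, Clone, PartialEq)]
pub struct PartialToken {
    pub index: usize,
    pub text: String,
}

/// Feeds tokens to `on_token` one at a time and stops as soon as the
/// callback returns `false` (for example, because the receiver went away).
/// Returns how many tokens were delivered successfully.
pub fn stream_tokens<F>(tokens: &[&str], mut on_token: F) -> usize
where
    F: FnMut(PartialToken) -> bool,
{
    let mut delivered = 0;
    for (index, text) in tokens.iter().enumerate() {
        let token = PartialToken { index, text: text.to_string() };
        if !on_token(token) {
            break; // consumer is gone or asked to stop
        }
        delivered += 1;
    }
    delivered
}
```

With a std::sync::mpsc channel standing in for the StreamSink, the closure `|t| tx.send(t).is_ok()` plays the same role as `sink.add(token.into()).is_ok()` above: once the receiver is dropped, the send fails and generation stops.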

The Generated Dart API

// Auto-generated — we don't write this
Future<ApiEnvelope> runModel({
  required String modelId,
  required ApiEnvelope input,
}) async { ... }

Stream<ApiPartialToken> runStreaming({
  required String modelId,
  required ApiEnvelope input,
}) { ... }

FRB handles the entire async dance: running the Rust call off the Dart UI isolate (on a Rust-side worker thread by default), marshalling data across FFI, and surfacing results on the Dart side.

The High-Level Dart Wrapper

We add a thin, idiomatic Dart layer on top:

class Xybrid {
  static Future<void> init() async { ... }
  static XybridModelBuilder model({required String modelId}) => ...;
  static XybridPipelineBuilder pipeline({required String yamlContent}) => ...;
}

// Usage
final model = await Xybrid.model(modelId: 'kokoro-82m').load();
final result = await model.run(envelope: Envelope.text(text: 'Hello'));

Total Dart code: ~800 lines (wrapper + types). All inference logic is in Rust.

Strategy 2: UniFFI (Swift + Kotlin)

UniFFI (Mozilla’s tool) generates Swift and Kotlin bindings from a single interface definition of the Rust API. It’s less magical than FRB but covers two platforms at once.

The UDL Definition

// xybrid.udl — defines the FFI surface
namespace xybrid {
    [Throws=XybridError]
    XybridModelHandle load_model(string model_id);
};

interface XybridModelHandle {
    [Throws=XybridError]
    XybridEnvelope run(XybridEnvelope input);

    sequence<XybridVoiceInfo> voices();
};

dictionary XybridEnvelope {
    string kind;
    bytes data;
};

[Error]
enum XybridError {
    "ModelNotFound",
    "InferenceFailed",
    "InvalidInput",
};
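On the Rust side, each UDL item maps to an ordinary Rust item with the same name. The sketch below shows roughly what that could look like using only the standard library. The function bodies, the single known model ID, and the echo "inference" are placeholders, and voices() is omitted because XybridVoiceInfo is not defined in the excerpt. UniFFI hands interface objects to foreign code as Arc values, which is why load_model returns one.

```rust
use std::fmt;
use std::sync::Arc;

/// Mirrors `[Error] enum XybridError` in the UDL.
#[derive(Debug, PartialEq)]
pub enum XybridError {
    ModelNotFound,
    InferenceFailed,
    InvalidInput,
}

impl fmt::Display for XybridError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let msg = match self {
            XybridError::ModelNotFound => "model not found",
            XybridError::InferenceFailed => "inference failed",
            XybridError::InvalidInput => "invalid input",
        };
        f.write_str(msg)
    }
}

impl std::error::Error for XybridError {}

/// Mirrors `dictionary XybridEnvelope`.
#[derive(Debug, Clone, PartialEq)]
pub struct XybridEnvelope {
    pub kind: String,
    pub data: Vec<u8>,
}

/// Mirrors `interface XybridModelHandle`.
pub struct XybridModelHandle {
    model_id: String,
}

impl XybridModelHandle {
    pub fn model_id(&self) -> &str {
        &self.model_id
    }

    /// Placeholder inference: accepts text and echoes the bytes back as "audio".
    pub fn run(&self, input: XybridEnvelope) -> Result<XybridEnvelope, XybridError> {
        if input.kind != "text" {
            return Err(XybridError::InvalidInput);
        }
        Ok(XybridEnvelope { kind: "audio".to_string(), data: input.data })
    }
}

/// Mirrors the namespace function `load_model`. The lookup is a placeholder.
pub fn load_model(model_id: String) -> Result<Arc<XybridModelHandle>, XybridError> {
    if model_id == "kokoro-82m" {
        Ok(Arc::new(XybridModelHandle { model_id }))
    } else {
        Err(XybridError::ModelNotFound)
    }
}
```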

Generated Swift

// Calling the auto-generated bindings
let model = try loadModel(modelId: "kokoro-82m")
let result = try model.run(input: XybridEnvelope(kind: "text", data: textData))

let voices = model.voices() // [XybridVoiceInfo]

Generated Kotlin

// Calling the auto-generated bindings
val model = loadModel(modelId = "kokoro-82m")
val result = model.run(input = XybridEnvelope(kind = "text", data = textData))

val voices = model.voices() // List<XybridVoiceInfo>

One UDL file → two platform SDKs. The generated code is idiomatic — Swift gets optionals and throwing functions, Kotlin gets nullable types and exceptions.

Adding a Higher-Level Wrapper

Like Flutter, we wrap the raw bindings:

// Kotlin SDK wrapper
object XybridModelLoader {
    fun fromRegistry(modelId: String): XybridModel {
        val handle = loadModel(modelId = modelId)
        return XybridModel(handle)
    }
}

class XybridModel(private val handle: XybridModelHandle) {
    fun run(input: Envelope): Envelope = handle.run(input.toFFI()).fromFFI()
    fun voices(): List<VoiceInfo> = handle.voices().map { it.fromFFI() }
}

Total wrapper code per platform: ~150 lines.

Strategy 3: C FFI + cbindgen (Unity)

Unity’s C# interop (P/Invoke) calls plain C-ABI functions. No fancy codegen: just extern "C" functions on the Rust side and DllImport on the C# side.

The Rust C API

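// xybrid-ffi/src/lib.rs
use std::ffi::CStr;
use std::os::raw::c_char;

/// Loads a model and writes an owned handle to `out_handle`.
/// Returns 0 on success and -1 on any error.
#[no_mangle]
pub extern "C" fn xybrid_load_model(
    model_id: *const c_char,
    out_handle: *mut *mut XybridHandle,
) -> i32 {
    // Reject null pointers instead of dereferencing them.
    if model_id.is_null() || out_handle.is_null() {
        return -1;
    }
    // Invalid UTF-8 is reported as an error; a panic (unwrap) must not cross the FFI boundary.
    let model_id = match unsafe { CStr::from_ptr(model_id) }.to_str() {
        Ok(id) => id,
        Err(_) => return -1,
    };
    match load_model(model_id) {
        Ok(handle) => {
            unsafe { *out_handle = Box::into_raw(Box::new(handle)) };
            0 // success
        }
        Err(_) => -1, // error
    }
}

/// Frees a handle returned by `xybrid_load_model`. Passing null is a no-op.
#[no_mangle]
pub extern "C" fn xybrid_free_model(handle: *mut XybridHandle) {
    if !handle.is_null() {
        unsafe { drop(Box::from_raw(handle)) };
    }
}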

cbindgen generates the C header:

// xybrid.h (auto-generated)
int32_t xybrid_load_model(const char *model_id, XybridHandle **out_handle);
void xybrid_free_model(XybridHandle *handle);
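The same ownership rule (whoever allocates also frees) applies to inference output, which is usually a byte buffer and not a handle. A minimal sketch of one way to return bytes over the C ABI. XybridBuffer, buffer_from_vec, and xybrid_free_buffer are hypothetical names for illustration, not part of the header above. The export attribute is left off so the snippet compiles on any edition; a real export would add #[no_mangle] (or #[unsafe(no_mangle)] on the 2024 edition).

```rust
/// Hypothetical C-compatible view of a heap-allocated byte buffer.
#[repr(C)]
pub struct XybridBuffer {
    pub data: *mut u8,
    pub len: usize,
}

/// Moves `bytes` into a boxed slice and hands ownership to the caller.
/// The caller must give the buffer back to `xybrid_free_buffer` exactly once.
pub fn buffer_from_vec(bytes: Vec<u8>) -> XybridBuffer {
    let boxed: Box<[u8]> = bytes.into_boxed_slice();
    let len = boxed.len();
    let data = Box::into_raw(boxed) as *mut u8;
    XybridBuffer { data, len }
}

/// Releases a buffer produced by `buffer_from_vec`. A null `data` is a no-op.
pub extern "C" fn xybrid_free_buffer(buf: XybridBuffer) {
    if buf.data.is_null() {
        return;
    }
    // SAFETY: `data` and `len` came from `Box::into_raw` on a `Box<[u8]>`
    // of exactly this length, and ownership is returned here only once.
    unsafe {
        let slice = std::ptr::slice_from_raw_parts_mut(buf.data, buf.len);
        drop(Box::from_raw(slice));
    }
}
```

On the C# side the bytes would be copied out (for example with Marshal.Copy) before calling the free function, so managed code never holds a pointer into memory that Rust has already released.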

The C# Wrapper (Unity)

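using System;
using System.Runtime.InteropServices;

public class Xybrid {
    // Native entry points exported by the Rust xybrid_ffi library.
    // Default string marshalling is ANSI; model IDs are assumed to be ASCII.
    [DllImport("xybrid_ffi")]
    private static extern int xybrid_load_model(string modelId, out IntPtr handle);

    [DllImport("xybrid_ffi")]
    private static extern void xybrid_free_model(IntPtr handle);

    // Loads a model and wraps the native handle; throws if Rust returns non-zero.
    public static XybridModel LoadModel(string modelId) {
        IntPtr handle;
        int result = xybrid_load_model(modelId, out handle);
        if (result != 0) throw new Exception("Failed to load model");
        return new XybridModel(handle);
    }
}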

More boilerplate than UniFFI, but it’s the only option for Unity. The total C# wrapper is ~300 lines.

The FFI Decision Matrix

	FRB	UniFFI	C FFI
Languages	Dart	Swift + Kotlin	Any (C ABI)
Async support	Native (Futures, Streams)	Manual	Manual
Type richness	High (enums, generics)	Medium (UDL types)	Low (C types)
Code generation	Automatic from Rust	From UDL definition	From Rust (cbindgen)
Memory management	Automatic	Automatic	Manual (alloc/free)
Best for	Flutter	Mobile native SDKs	Game engines, C interop

If we were starting over and only needed one tool, UniFFI covers the most ground. But FRB’s async/streaming support is essential for Flutter, and C FFI is the only option for Unity.

Lessons Learned

1. Keep the FFI Surface Small

Every function in the FFI layer is a maintenance point across all platforms. We minimize it:

FFI surface:
  - init()
  - load_model(id) → handle
  - run(handle, input) → output
  - run_streaming(handle, input) → stream
  - voices(handle) → list
  - warmup(handle)
  - free(handle)

Seven functions serve the entire SDK. All complexity is inside Rust.
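To make the shape of such a surface concrete, here is a minimal sketch of a handle-based facade. The Sdk type, the integer handles, and the placeholder "inference" are assumptions for illustration; the real SDK's internals are not shown in this post.

```rust
use std::collections::HashMap;

/// Hypothetical facade: the only type the binding layers would call into.
#[derive(Default)]
pub struct Sdk {
    next_handle: u64,
    /// handle -> model id (a stand-in for a loaded model)
    models: HashMap<u64, String>,
}

impl Sdk {
    /// Registers a model and returns an opaque handle for later calls.
    pub fn load_model(&mut self, model_id: &str) -> u64 {
        self.next_handle += 1;
        self.models.insert(self.next_handle, model_id.to_string());
        self.next_handle
    }

    /// Placeholder inference: reports which model saw how many input bytes.
    /// Returns `None` for an unknown or already freed handle.
    pub fn run(&self, handle: u64, input: &[u8]) -> Option<String> {
        self.models
            .get(&handle)
            .map(|id| format!("{}:{}", id, input.len()))
    }

    /// Releases a handle. Returns `true` if it existed.
    pub fn free(&mut self, handle: u64) -> bool {
        self.models.remove(&handle).is_some()
    }
}
```

Every binding then only has to translate a handle, a byte slice, and a result, whichever FFI tool generates the glue.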

2. Use Wrapper Types, Not Domain Types

Don’t expose internal types across FFI. Use flat, serializable wrapper types:

// Internal (rich, nested)
pub struct Envelope {
    pub kind: EnvelopeKind,
    pub metadata: HashMap<String, String>,
}

// FFI (flat, simple)
pub struct ApiEnvelope {
    pub kind: String,    // "text", "audio", "embedding"
    pub data: Vec<u8>,   // serialized payload
}

Conversion happens at the boundary. Internal types can evolve without breaking FFI.
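A minimal sketch of that boundary conversion, assuming a hypothetical EnvelopeKind with two variants (the real enum is not shown here) and dropping metadata for brevity:

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical internal payload kinds.
#[derive(Debug, Clone, PartialEq)]
pub enum EnvelopeKind {
    Text(String),
    Audio(Vec<u8>),
}

/// Internal type: rich and free to evolve.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub kind: EnvelopeKind,
    pub metadata: HashMap<String, String>,
}

/// FFI type: flat, with a string tag and raw bytes.
#[derive(Debug, Clone, PartialEq)]
pub struct ApiEnvelope {
    pub kind: String,
    pub data: Vec<u8>,
}

impl From<Envelope> for ApiEnvelope {
    fn from(env: Envelope) -> Self {
        // Metadata is intentionally not carried across in this sketch.
        match env.kind {
            EnvelopeKind::Text(s) => ApiEnvelope { kind: "text".to_string(), data: s.into_bytes() },
            EnvelopeKind::Audio(b) => ApiEnvelope { kind: "audio".to_string(), data: b },
        }
    }
}

impl TryFrom<ApiEnvelope> for Envelope {
    type Error = String;

    /// Incoming data is untrusted, so the tag and the UTF-8 are validated.
    fn try_from(api: ApiEnvelope) -> Result<Self, Self::Error> {
        let kind = match api.kind.as_str() {
            "text" => EnvelopeKind::Text(String::from_utf8(api.data).map_err(|e| e.to_string())?),
            "audio" => EnvelopeKind::Audio(api.data),
            other => return Err(format!("unknown envelope kind: {}", other)),
        };
        Ok(Envelope { kind, metadata: HashMap::new() })
    }
}
```

The outbound direction is infallible (From) while the inbound direction can fail (TryFrom), which keeps validation of foreign input in one place.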

3. Test Across Platforms in CI

We run platform-specific CI for each binding:

# .github/workflows/
build-flutter.yml    # Flutter build + Dart analysis
build-apple.yml      # XCFramework + Swift package
build-android.yml    # AAR + Kotlin compilation + ORT verification
build-unity.yml      # Native lib + C header generation

A Rust change that breaks any platform is caught before merge.
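One way such jobs could check the "same output bytes" rule is to compare a digest of the output on every platform against a single golden value. The sketch below uses 64-bit FNV-1a purely because it fits in a few lines. The choice of hash and the output_matches_golden helper are assumptions, not a description of the actual workflows.

```rust
/// 64-bit FNV-1a over a byte slice (non-cryptographic, easy to port).
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical parity check: does this platform's output match the
/// digest recorded from the reference run?
pub fn output_matches_golden(output: &[u8], golden_digest: u64) -> bool {
    fnv1a64(output) == golden_digest
}
```

Each binding's test would run the same input through its own SDK and pass the resulting bytes to a check like this, so a single changed byte fails the build.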

4. API Contract as Source of Truth

We maintain an api-surface.yaml file that defines the public API across all SDKs:

methods:
  - name: init
    platforms: [flutter, swift, kotlin, unity, cli]
    status: stable

  - name: model.run
    platforms: [flutter, swift, kotlin, unity, cli]
    args:
      - name: envelope
        type: Envelope
      - name: voiceId
        type: string?
      - name: generationConfig
        type: GenerationConfig?
    status: stable

SDK implementations are validated against this contract. No platform can drift.
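The validation step can be as simple as a set difference between what the contract requires and what each SDK exposes. A minimal sketch, assuming the YAML has already been parsed into a hypothetical MethodSpec list and each SDK's exposed method names have been collected by some other means:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One entry from the contract: a method and the platforms that must expose it.
pub struct MethodSpec {
    pub name: &'static str,
    pub platforms: Vec<&'static str>,
}

/// Returns every (platform, method) pair the contract requires but the
/// platform's SDK does not expose. An empty result means no drift.
pub fn find_drift(
    contract: &[MethodSpec],
    exposed: &BTreeMap<&str, BTreeSet<&str>>,
) -> Vec<(String, String)> {
    let mut missing = Vec::new();
    for method in contract {
        for platform in &method.platforms {
            let has_it = exposed
                .get(*platform)
                .map_or(false, |methods| methods.contains(method.name));
            if !has_it {
                missing.push((platform.to_string(), method.name.to_string()));
            }
        }
    }
    missing
}
```

A CI job would fail whenever the returned list is non-empty and print the missing pairs.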

The Payoff

From one Rust codebase:

Platform	SDK	Binding Lines	Full Feature Parity
Flutter	xybrid_flutter	~800	Yes
Swift	Xybrid (SPM)	~400	Yes
Kotlin	xybrid-kotlin (Maven)	~400	Yes
Unity	Xybrid.cs	~300	Yes
CLI	xybrid-cli	~500	Yes

~2,400 lines of binding code for five platforms. The Rust core is ~15,000 lines. That’s roughly a 6:1 ratio of shared to platform-specific code (15,000 / 2,400 ≈ 6.25).

One model, one metadata file, one preprocessing pipeline, one set of tests — five platforms that behave identically.

Explore the SDKs: github.com/xybrid-ai/xybrid

Building cross-platform AI features? What’s your FFI strategy? Share in the comments.
