SDK Quickstart

Start local first. Add managed mode when you need device registration, rollouts, and telemetry.

1. Install

Add the dependencies to your build.gradle.kts.

build.gradle.kts

dependencies {
    implementation("dev.deviceai:core:0.0.1")
    implementation("dev.deviceai:speech:0.0.1")
    implementation("dev.deviceai:llm:0.0.1")
}

2. Initialize

The SDK auto-detects device hardware. No manual configuration needed.

Initialize (local mode — no API key needed)

// Application.onCreate()
PlatformStorage.initialize(this)
DeviceAI.initialize(context = this)

3. Speech-to-Text

Powered by whisper.cpp. 7x faster than real-time on mid-range hardware.

Speech-to-Text

SpeechBridge.initStt(modelPath, SttConfig(language = "en", useGpu = true))
val text = SpeechBridge.transcribeAudio(samples)  // FloatArray, 16kHz mono
SpeechBridge.shutdownStt()
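
Audio recorders commonly deliver 16-bit PCM, while the call above takes a FloatArray at 16 kHz mono. The following is a minimal sketch of that conversion, not part of the SDK. It assumes the samples should be normalized to the range [-1.0, 1.0], which the source does not state, and both function names are hypothetical.

```kotlin
// Sketch only: these helpers are hypothetical and not part of the DeviceAI SDK.
// Assumption: the transcriber expects floats normalized to [-1.0, 1.0].

// Average interleaved stereo frames (L, R, L, R, ...) into a mono signal.
fun downmixStereo(interleaved: ShortArray): ShortArray {
    require(interleaved.size % 2 == 0) { "Stereo input needs an even sample count" }
    return ShortArray(interleaved.size / 2) { i ->
        ((interleaved[2 * i] + interleaved[2 * i + 1]) / 2).toShort()
    }
}

// Scale signed 16-bit samples to floats by dividing by 32768.
fun pcm16ToFloat(pcm: ShortArray): FloatArray =
    FloatArray(pcm.size) { i -> pcm[i] / 32768.0f }
```

Resampling is not covered here. If the recording is not already at 16 kHz, it would need to be resampled before transcription.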

4. Text-to-Speech

Powered by sherpa-onnx. Supports VITS and Kokoro voice models.

Text-to-Speech

SpeechBridge.initTts(modelPath, tokensPath, TtsConfig(speechRate = 1.0f))
val pcm: ShortArray = SpeechBridge.synthesize("Hello from DeviceAI.")
SpeechBridge.shutdownTts()
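
The synthesized ShortArray is raw PCM with no container, so most players cannot open it directly. As a rough sketch, the samples can be wrapped in a minimal WAV header. The function name is hypothetical, and the sample rate is an assumption you must supply: it depends on the voice model and is not given in the source.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Sketch only: hypothetical helper, not part of the DeviceAI SDK.
// Assumptions: the PCM is mono, and sampleRate matches the voice model.
fun pcmToWav(pcm: ShortArray, sampleRate: Int): ByteArray {
    val dataSize = pcm.size * 2                  // 2 bytes per 16-bit sample
    val buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray(Charsets.US_ASCII))
    buf.putInt(36 + dataSize)                    // RIFF chunk size = file size - 8
    buf.put("WAVE".toByteArray(Charsets.US_ASCII))
    buf.put("fmt ".toByteArray(Charsets.US_ASCII))
    buf.putInt(16)                               // fmt chunk size for plain PCM
    buf.putShort(1)                              // audio format 1 = PCM
    buf.putShort(1)                              // channel count: mono
    buf.putInt(sampleRate)
    buf.putInt(sampleRate * 2)                   // byte rate = rate * channels * 2
    buf.putShort(2)                              // block align = channels * 2
    buf.putShort(16)                             // bits per sample
    buf.put("data".toByteArray(Charsets.US_ASCII))
    buf.putInt(dataSize)
    for (sample in pcm) buf.putShort(sample)     // little-endian sample data
    return buf.array()
}
```

The returned bytes can be written to disk with `java.io.File(path).writeBytes(...)`.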

5. LLM Chat

Powered by llama.cpp. Works with any GGUF model and uses Vulkan for GPU acceleration.

LLM Chat (streaming)

val session = DeviceAI.llm.chat("/path/to/model.gguf") {
    systemPrompt = "You are a helpful assistant."
    temperature = 0.7f
}

session.send("What is Kotlin?").collect { token ->
    print(token)
}

session.close()

6. Offline RAG

BM25 keyword retrieval — no embedding model needed. Runs entirely on-device.

Offline RAG

val store = BM25RagStore(rawChunks = listOf(
    "DeviceAI runs inference on-device.",
    "Uses llama.cpp with Vulkan GPU acceleration."
))
val session = DeviceAI.llm.chat(modelPath) { ragStore = store }
session.send("What GPU does DeviceAI use?").collect { print(it) }
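
To show how BM25 keyword retrieval ranks chunks in general, here is a small standalone scorer. It is an illustration of the textbook formula, not the implementation behind BM25RagStore. The class name, the tokenizer, and the parameters `k1 = 1.5` and `b = 0.75` are all assumptions.

```kotlin
import kotlin.math.ln

// Illustrative BM25 scorer; NOT the SDK's BM25RagStore implementation.
// k1 and b are common textbook defaults, chosen here as assumptions.
class Bm25Sketch(
    chunks: List<String>,
    private val k1: Double = 1.5,
    private val b: Double = 0.75
) {
    private val docs: List<List<String>> = chunks.map { tokenize(it) }
    private val avgLen: Double = docs.map { it.size }.average()

    // Document frequency: the number of chunks that contain each term.
    private val df: Map<String, Int> =
        docs.flatMap { it.distinct() }.groupingBy { it }.eachCount()

    // BM25 score of one chunk for the query; 0.0 when no query term matches.
    fun score(query: String, docIndex: Int): Double {
        val doc = docs[docIndex]
        val tf = doc.groupingBy { it }.eachCount()
        var total = 0.0
        for (term in tokenize(query).distinct()) {
            val f = tf[term] ?: continue
            val n = df.getValue(term)
            val idf = ln((docs.size - n + 0.5) / (n + 0.5) + 1.0)
            total += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * doc.size / avgLen))
        }
        return total
    }

    // Indices of up to k chunks with a non-zero score, best first.
    fun topK(query: String, k: Int): List<Int> =
        docs.indices.map { it to score(query, it) }
            .filter { it.second > 0.0 }
            .sortedByDescending { it.second }
            .take(k)
            .map { it.first }

    // Lowercase and split on anything that is not a letter or digit.
    private fun tokenize(text: String): List<String> =
        text.lowercase().split(Regex("[^a-z0-9]+")).filter { it.isNotEmpty() }
}
```

Because this sketch matches exact tokens only, "use" in a query does not match "Uses" in a chunk. A production store may apply stemming or other normalization.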

Scale across devices

Add managed mode

The SDK is production-ready without a backend. When you need orchestration across your device fleet, add an API key to unlock:

Device registration: automatic hardware profiling and capability tier assignment.

Model manifest: assign the right model per device tier, synced every 6 hours.

OTA rollouts: canary → rollout → full, or instant rollback.

Telemetry: latency, TTFT (time to first token), and tokens/sec. No prompts or audio are collected.
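
Staged rollouts such as canary → rollout → full are often implemented by mapping each device to a stable bucket and widening the percentage over time. The sketch below shows that general technique only. It is an assumption about how such a scheme could work, not a description of the DeviceAI backend, and both function names are hypothetical.

```kotlin
// Hypothetical sketch of staged-rollout cohort selection.
// This is NOT the DeviceAI backend's logic; names and hashing scheme are assumptions.

// Map a device ID to a stable bucket in 0..99 using the FNV-1a 32-bit hash,
// so the same device lands in the same bucket on every run.
fun rolloutBucket(deviceId: String): Int {
    var hash = 0x811C9DC5.toInt()                 // FNV offset basis
    for (byte in deviceId.toByteArray(Charsets.UTF_8)) {
        hash = hash xor (byte.toInt() and 0xFF)
        hash *= 16777619                          // FNV prime; Int overflow wraps
    }
    return ((hash.toLong() and 0xFFFFFFFFL) % 100).toInt()
}

// A device is included once the rollout percentage exceeds its bucket.
fun isInRollout(deviceId: String, percent: Int): Boolean {
    require(percent in 0..100) { "percent must be between 0 and 100" }
    return rolloutBucket(deviceId) < percent
}
```

With this design, raising the percentage only ever adds devices, and setting it back to 0 excludes every device, which is one way to model an instant rollback.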

Kotlin — one line change

DeviceAI.initialize(context = this, apiKey = "<YOUR_API_KEY>") {
    telemetry = TelemetryLevel.Minimal
    appVersion = BuildConfig.VERSION_NAME
}
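
The TTFT and tokens/sec metrics named under Telemetry can be derived from token arrival times. The helper below is a hypothetical sketch for measuring them in your own code. The SDK's internal definitions are not given in the source, and this version assumes tokens/sec is measured over the whole request, including the wait for the first token.

```kotlin
// Hypothetical helper; the SDK's telemetry internals are not described in the source.
data class GenerationStats(val ttftMs: Long, val tokensPerSec: Double)

// requestMs: when the prompt was sent.
// tokenTimesMs: arrival time of each token, in ascending order.
// Returns null when no tokens were produced.
fun computeStats(requestMs: Long, tokenTimesMs: List<Long>): GenerationStats? {
    if (tokenTimesMs.isEmpty()) return null
    val ttftMs = tokenTimesMs.first() - requestMs    // time to first token
    val totalMs = tokenTimesMs.last() - requestMs    // full request duration
    // Assumption: throughput covers the whole request, not just the decode phase.
    val tokensPerSec = if (totalMs > 0) tokenTimesMs.size * 1000.0 / totalMs else 0.0
    return GenerationStats(ttftMs, tokensPerSec)
}
```

For example, four tokens arriving 200, 300, 400, and 500 ms after the request give a TTFT of 200 ms and 8 tokens/sec under this definition.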

Invite-only during alpha. We review every request.