
Mobile AI Engineer - Swift & Kotlin - India

Skarvo

Remote (Global) | Full-time | Posted today

About the role

Location and Contract

• You will work in a remote-first work environment from anywhere in India.

• One-year full-time contract with the possibility of conversion to permanent full-time employment.

• We offer a competitive salary.

• Skarvo's founders are MIT graduates who previously worked at Apple as senior engineers and designers.

• The founders are based in Silicon Valley, California.

The Role

• We're looking for a Mobile AI Engineer expert in Swift and Kotlin to build the runtime infrastructure that makes our on-device LLM an agent — the skill execution engine, native-to-WebView bridges, tool dispatch pipeline, and the chat UI that renders rich interactive AI outputs.

• You'll integrate pre-trained and pre-quantized models (Gemma family via LiteRT-LM on Android and MLX Swift on iOS), wire them to a modular skill system, and build the full agentic loop — all running locally on the device with zero cloud dependency.

• This is a live product with real users in 175 countries. You'll work directly with the founding team and ship features into existing Swift and Kotlin native codebases.
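To make the shape of that agentic loop concrete, here is a minimal sketch in JavaScript. All names (`runAgentLoop`, `model.generate`, the message shapes) are illustrative assumptions, not Skarvo's actual runtime; in the real product, generation would run through on-device inference (MLX Swift / LiteRT-LM) behind a native bridge.

```javascript
// Minimal sketch of an agentic tool-dispatch loop (names hypothetical).
// The model either asks for a tool or produces a final answer; tool
// results are appended to the transcript and fed back into inference.

function runAgentLoop(model, tools, userMessage, maxSteps = 5) {
  const transcript = [{ role: "user", content: userMessage }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model.generate(transcript);
    if (reply.toolCall) {
      // Model requested a tool: dispatch it and feed the result back.
      const tool = tools[reply.toolCall.name];
      const result = tool(reply.toolCall.args);
      transcript.push({ role: "tool", name: reply.toolCall.name, content: result });
      continue;
    }
    // No tool call: this is the final answer.
    transcript.push({ role: "assistant", content: reply.text });
    return reply.text;
  }
  throw new Error("agent exceeded max steps");
}
```

The bound on `maxSteps` is the usual guard against a model that keeps requesting tools without converging on an answer.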

Key Requirements

• 5+ years of professional mobile engineering experience

• Strong experience with Swift/SwiftUI and Kotlin/Jetpack Compose, and the ability to work across both platforms

• Experience with async/concurrent programming — Kotlin Coroutines and Swift Concurrency (async/await, actors)

• Hands-on experience with on-device ML inference SDKs — Apple MLX / MLX Swift, Core ML, LiteRT-LM, LiteRT (TensorFlow Lite), MediaPipe, or ExecuTorch

• Experience with small language models for on-device inference — Gemma, Qwen, Bonsai, LFM, Ministral 3, and similar

• Experience building native-to-WebView bridges — WKWebView on iOS or WebView + JavascriptInterface on Android

• Strong JavaScript proficiency — async/await, DOM APIs, fetch, Canvas, Web Audio

• Understanding of LLM function calling, tool use, and skills, and of how tool schemas interact with model inference

• Familiarity with model quantization tradeoffs and on-device memory/latency constraints (you select and benchmark models, not train them)
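As a rough illustration of the bridge requirement above, here is the JavaScript side of a bidirectional native-to-WebView bridge. The channel names follow the standard WKWebView script-message-handler and Android `@JavascriptInterface` patterns, but the handler name (`native`), the `AndroidBridge` object, and the message shape are assumptions, not Skarvo's actual protocol.

```javascript
// Hypothetical JS side of a native <-> WebView bridge. Requests carry an
// id so that async native responses can resolve the matching promise.

function postToNative(message) {
  const payload = JSON.stringify(message);
  if (window.webkit?.messageHandlers?.native) {
    window.webkit.messageHandlers.native.postMessage(payload); // iOS WKWebView
  } else if (window.AndroidBridge) {
    window.AndroidBridge.postMessage(payload); // Android @JavascriptInterface
  } else {
    throw new Error("no native bridge available");
  }
}

const pending = new Map();
let nextId = 0;

// Page -> native: returns a promise resolved when native replies.
function callNative(method, params) {
  return new Promise((resolve) => {
    const id = nextId++;
    pending.set(id, resolve);
    postToNative({ id, method, params });
  });
}

// Native -> page: native code invokes this via evaluateJavaScript (iOS)
// or evaluateJavascript (Android) to deliver a response.
function onNativeResponse(id, result) {
  const resolve = pending.get(id);
  pending.delete(id);
  if (resolve) resolve(result);
}
```

The id-keyed pending map is what lets one WebView issue many concurrent native calls without responses getting crossed.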

Preferred

• Familiarity with agentic AI concepts — planning loops, function calling, tool use, multi-step reasoning

• Understanding of text embeddings, vector search, and semantic retrieval — ideally on-device using SQLite with vector extensions (sqlite-vec, SQLite-Vector), FAISS, or similar

• Experience designing RAG (Retrieval-Augmented Generation) pipelines — combining embedding models, vector indexing, and language model inference, ideally in an on-device or resource-constrained environment

• Familiarity with AI agent orchestration and infrastructure — systems that wire agents together, manage tool dispatch, memory, and multi-agent coordination, such as NullClaw, LangGraph, CrewAI, AutoGen, or similar — and understanding of how these patterns apply to on-device, resource-constrained environments

• Awareness of agentic protocols — MCP, A2A, AP2

• Startup or high-growth company experience
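For the retrieval side of the RAG items above, the core operation is nearest-neighbor search over embeddings. A brute-force cosine-similarity sketch is below; in practice an indexed store such as sqlite-vec would replace the linear scan, and the 3-d vectors here are toy stand-ins for real embedding output.

```javascript
// Brute-force semantic retrieval over a tiny in-memory index
// (a stand-in for sqlite-vec; embeddings here are toy 3-d vectors).

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every document against the query vector, return the k best.
function topK(index, queryVec, k = 3) {
  return index
    .map((doc) => ({ ...doc, score: cosine(doc.vec, queryVec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

On-device, the interesting constraints are exactly the ones the posting names: index size versus memory budget, and scan latency versus an approximate index.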

Key Skills

• Track record of shipping on-device ML features on real devices — not just demos

• Track record of shipping complex mobile architectures involving embedded web engines, native bridges, dynamic UI rendering, and local model inference

• Ability to design systems that span three async boundaries: model inference, native UI, and WebView JavaScript execution

• Systems thinking for agent architecture — tool registries, planning loops, state persistence, error recovery

• Performance intuition for on-device constraints — memory budgets, battery impact, context window limits, WebView lifecycle

• Ability to work autonomously on ambiguous problems

• Clear communicator who can explain architecture decisions to engineers and non-engineers

• Experience with AI development tools — Claude Code, Codex, Copilot — integrated into daily workflow

Responsibilities

• Own on-device AI architecture — drive model selection, inference pipeline design, and build-vs-buy decisions for all AI features

• Own the agentic runtime end-to-end — model integration, function-calling pipeline, skill execution engine, and rich chat UI on both iOS and Android

• Integrate pre-trained models using Apple MLX / MLX Swift on iOS and LiteRT-LM on Android

• Design and build the AI skill system — a modular architecture where the LLM discovers, loads, and executes skills that extend its capabilities with JavaScript logic, interactive UI, and native device actions

• Build the native-to-JavaScript bridge — a sandboxed WebView execution environment with bidirectional communication on both platforms

• Implement the chat rendering layer — heterogeneous message types including embedded interactive WebViews, images, progress panels, and native action confirmations

• Architect the async orchestration pipeline that coordinates LLM inference, tool execution, WebView JS, and UI updates

• Ship on-device AI features — local AI chat, tool calling, semantic search, smart suggestions, and agentic capabilities on the product roadmap

• Benchmark and select models for Skarvo's on-device requirements — evaluate new Gemma releases, quantization variants, and inference configurations

• Build agentic AI systems — design and implement on-device agents that can plan, call functions, use tools, and act on behalf of users — all locally with no cloud calls

• Design on-device RAG pipelines — build local embedding, vector indexing, and retrieval systems that power semantic search and context-aware AI features

• Collaborate with iOS and Android engineers — integrate ML pipelines into the native Swift codebase using Core ML, MLX Swift, and Swift Concurrency on iOS, and into the native Kotlin codebase using LiteRT-LM (TensorFlow Lite) and Kotlin Coroutines on Android

• Stay current — evaluate new models, frameworks, and techniques as the on-device AI landscape evolves rapidly
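The skill-system responsibility above can be sketched as a registry in which each skill declares a name, a JSON-schema-style parameter description (surfaced to the model as its tool schema), and an implementation. Every name and shape here, including the `set_timer` example, is illustrative rather than Skarvo's actual skill format.

```javascript
// Hypothetical modular skill registry: skills are discovered via their
// declared schemas and executed when the model emits a tool call.

const skills = new Map();

function registerSkill(skill) {
  skills.set(skill.name, skill);
}

// What gets injected into the model's context as its available tools.
function toolSchemas() {
  return [...skills.values()].map(({ name, description, parameters }) => ({
    name, description, parameters,
  }));
}

// Invoked by the runtime when the model requests a tool.
function executeSkill(name, args) {
  const skill = skills.get(name);
  if (!skill) throw new Error(`unknown skill: ${name}`);
  return skill.run(args);
}

registerSkill({
  name: "set_timer",
  description: "Start a countdown timer",
  parameters: { type: "object", properties: { seconds: { type: "number" } } },
  run: ({ seconds }) => `timer set for ${seconds}s`,
});
```

Keeping schema and implementation in one declaration is what makes the system modular: adding a skill extends both what the model can see and what the runtime can dispatch.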

Tech Stack

• On-device inference (iOS): Apple MLX, MLX Swift, Core ML

• On-device inference (Android): LiteRT-LM, MediaPipe

• Models: Gemma family (Gemma 4 E2B/E4B, FunctionGemma, EmbeddingGemma)

• iOS: Swift 6, SwiftUI, WKWebView, Swift Concurrency

• Android: Kotlin, Jetpack Compose, WebView, Kotlin Coroutines

• JavaScript: Vanilla JS, Web APIs, CDN library integration

• Agentic protocols: MCP, function-calling schemas

• Embeddings & search: sqlite-vec, on-device embedding models

• CI/CD: GitHub Actions

• Backend: Firebase (ephemeral relay only — zero server persistence)
