Mobile AI Engineer - Swift & Kotlin - India
Skarvo
About the role
Your Location and Job
• You will work in a remote-first work environment from anywhere in India.
• One-year full-time contract with the possibility of converting to a full-time employee role.
• We offer a competitive salary.
• Skarvo's founders previously worked at Apple as senior engineers and designers, and are MIT graduates.
• Skarvo founders are located in Silicon Valley, California.
The Role
• We're looking for a Mobile AI Engineer with expertise in Swift and Kotlin to build the runtime infrastructure that turns our on-device LLM into an agent: the skill execution engine, native-to-WebView bridges, tool dispatch pipeline, and the chat UI that renders rich interactive AI outputs.
• You'll integrate pre-trained and pre-quantized models (Gemma family via LiteRT-LM on Android and MLX Swift on iOS), wire them to a modular skill system, and build the full agentic loop — all running locally on the device with zero cloud dependency.
• This is a live product with real users in 175 countries. You'll work directly with the founding team and ship features into existing Swift and Kotlin native codebases.
Key Requirements
• 5+ years of professional mobile engineering experience
• Strong experience with Swift/SwiftUI or Kotlin/Jetpack Compose, with the ability to work across both platforms
• Experience with async/concurrent programming — Kotlin Coroutines and Swift Concurrency (async/await, actors)
• Hands-on experience with on-device ML inference SDKs — Apple MLX / MLX Swift, Core ML, LiteRT-LM, LiteRT (TensorFlow Lite), MediaPipe, or ExecuTorch
• Experience with small language models for on-device inference — Gemma, Qwen, Bonsai, LFM, Ministral 3, and similar
• Experience building native-to-WebView bridges — WKWebView on iOS or WebView + JavascriptInterface on Android
• Strong JavaScript proficiency — async/await, DOM APIs, fetch, Canvas, Web Audio
• Understanding of LLM function calling, tool use, and skills, and of how tool schemas interact with model inference
• Familiarity with model quantization tradeoffs and on-device memory/latency constraints (you select and benchmark models, not train them)
Preferred
• Familiarity with agentic AI concepts — planning loops, function calling, tool use, multi-step reasoning
• Understanding of text embeddings, vector search, and semantic retrieval — ideally on-device using SQLite with vector extensions (sqlite-vec, SQLite-Vector), FAISS, or similar
• Experience designing RAG (Retrieval-Augmented Generation) pipelines — combining embedding models, vector indexing, and language model inference, ideally in an on-device or resource-constrained environment
• Familiarity with AI agent orchestration and infrastructure — systems that wire agents together, manage tool dispatch, memory, and multi-agent coordination, such as NullClaw, LangGraph, CrewAI, AutoGen, or similar — and understanding of how these patterns apply to on-device, resource-constrained environments
• Awareness of agentic protocols — MCP, A2A, AP2
• Startup or high-growth company experience
Key Skills
• Track record of shipping on-device ML features on real devices — not just demos
• Track record of shipping complex mobile architectures involving embedded web engines, native bridges, dynamic UI rendering, and local model inference
• Ability to design systems that span three async boundaries: model inference, native UI, and WebView JavaScript execution
• Systems thinking for agent architecture — tool registries, planning loops, state persistence, error recovery
• Performance intuition for on-device constraints — memory budgets, battery impact, context window limits, WebView lifecycle
• Ability to work autonomously on ambiguous problems
• Clear communicator who can explain architecture decisions to engineers and non-engineers
• Experience with AI development tools — Claude Code, Codex, Copilot — integrated into daily workflow
Responsibilities
• Own on-device AI architecture — drive model selection, inference pipeline design, and build-vs-buy decisions for all AI features
• Own the agentic runtime end-to-end — model integration, function-calling pipeline, skill execution engine, and rich chat UI on both iOS and Android
• Integrate pre-trained models using Apple MLX / MLX Swift on iOS and LiteRT-LM on Android
• Design and build the AI skill system — a modular architecture where the LLM discovers, loads, and executes skills that extend its capabilities with JavaScript logic, interactive UI, and native device actions
• Build the native-to-JavaScript bridge — a sandboxed WebView execution environment with bidirectional communication on both platforms
• Implement the chat rendering layer — heterogeneous message types including embedded interactive WebViews, images, progress panels, and native action confirmations
• Architect the async orchestration pipeline that coordinates LLM inference, tool execution, WebView JS, and UI updates
• Ship on-device AI features — local AI chat, tool calling, semantic search, smart suggestions, and agentic capabilities on the product roadmap
• Benchmark and select models for Skarvo's on-device requirements — evaluate new Gemma releases, quantization variants, and inference configurations
• Build agentic AI systems — design and implement on-device agents that can plan, call functions, use tools, and act on behalf of users — all locally with no cloud calls
• Design on-device RAG pipelines — build local embedding, vector indexing, and retrieval systems that power semantic search and context-aware AI features
• Collaborate with iOS and Android engineers: integrate ML pipelines into the native Swift codebase using Core ML, MLX Swift, and Swift Concurrency on iOS, and into the native Kotlin codebase using LiteRT-LM, LiteRT (TensorFlow Lite), and Kotlin Coroutines on Android
• Stay current — evaluate new models, frameworks, and techniques as the on-device AI landscape evolves rapidly
Tech Stack
• On-device inference (iOS): Apple MLX, MLX Swift, Core ML
• On-device inference (Android): LiteRT-LM, MediaPipe
• Models (Gemma family): Gemma 4 (E2B, E4B), FunctionGemma, EmbeddingGemma
• iOS: Swift 6, SwiftUI, WKWebView, Swift Concurrency
• Android: Kotlin, Jetpack Compose, WebView, Kotlin Coroutines
• JavaScript: Vanilla JS, Web APIs, CDN library integration
• Agentic protocols: MCP, function-calling schemas
• Embeddings & search: sqlite-vec, on-device embedding models
• CI/CD: GitHub Actions
• Backend: Firebase (ephemeral relay only — zero server persistence)