AI Solution Architect - REMOTE

Sierra Solutions

Bridgewater · On-site Contract Lead 3d ago

About the role

Job Summary

We are seeking a highly experienced AI Solutions Architect to lead the design and delivery of end-to-end Generative AI solutions that are scalable, secure, and cost‑efficient. This role is responsible for translating complex business requirements into robust technical architectures across the full lifecycle—from problem definition and data strategy to model selection, deployment, and ongoing optimization. The ideal candidate brings deep expertise in GenAI approaches such as RAG, fine‑tuning, and agentic workflows, along with strong experience in LLM evaluation, prompt architecture, and high‑performance serving patterns. You will play a key role in defining LLMOps best practices, implementing governance and safety frameworks, and integrating AI solutions into enterprise ecosystems. As a technical leader, you will drive architecture standards, mentor teams, and guide build‑versus‑buy decisions while continuously improving solution quality, reliability, and performance in production environments.

Primary Responsibilities

  • Architect and design end-to-end Generative AI solutions by translating business requirements into scalable, secure, and cost-effective technical architectures
  • Own solution design across the full lifecycle, including problem definition, architectural choices, data strategy, model selection, integration, deployment, and operations
  • Define and document solution architectures using architecture diagrams, sequence flows, deployment topologies, and technical design documents
  • Select appropriate GenAI approaches (RAG, fine-tuning, prompt engineering, agentic workflows, routing, multimodal) based on accuracy, latency, cost, governance, and data constraints
  • Architect Retrieval-Augmented Generation (RAG) solutions, including document ingestion, chunking strategies, embedding models, hybrid search, re-ranking, caching, and freshness management
  • Design and implement agentic systems using tool/function calling, planner-executor and multi-agent patterns, with reliability, observability, and failure handling
  • Define prompt architecture patterns such as ReAct, Chain-of-Thought alternatives, structured output prompting, skill routing, and template versioning
  • Evaluate and integrate LLMs from both closed providers and open-weight models, including model routing, fallback logic, and token cost optimization
  • Design fine-tuning strategies (LoRA/QLoRA, PEFT, adapters) when prompting or RAG alone is insufficient, and define evaluation criteria for fine-tuned models
  • Architect low-latency, high-throughput serving patterns using batching, caching, speculative decoding, quantization, and GPU/CPU routing
  • Design APIs, SDKs, and reusable platform components to enable consistent AI adoption across multiple teams and applications
  • Integrate GenAI solutions with enterprise systems including identity, authorization, data platforms, event streams, and front-end applications
  • Define LLMOps practices covering prompt versioning, dataset management, experiment tracking, evaluation pipelines, regression testing, and observability
  • Establish monitoring and tracing for quality, hallucinations, safety violations, latency, throughput, and cost across AI workflows
  • Design safety, governance, and guardrail mechanisms including PII redaction, content filtering, prompt isolation, jailbreak defense, and audit logging
  • Conduct feasibility and risk assessments covering data readiness, compliance, security exposure, vendor lock‑in, performance, and operational risk
  • Lead technical discovery workshops and architecture reviews with product, engineering, data, and security stakeholders
  • Mentor engineers and teams on GenAI patterns, solution design tradeoffs, and architectural best practices
  • Evaluate tools, frameworks, and vendors, and drive build-vs-buy decisions based on technical and business constraints
  • Maintain reference architectures, design standards, architectural decision records (ADRs), and reusable blueprints
  • Drive continuous improvement of AI solution quality, reliability, scalability, and cost efficiency in production environments

Education and Experience

  • Minimum 10 years of experience designing in-house systems (integrating with different ERP systems, connecting data across systems, and designing data models), including 5 years as a solution architect and 4-5 years of machine learning and AI experience
  • GenAI frameworks: LangChain, LangGraph, AutoGen, Microsoft Agent Framework
  • Model providers: Azure OpenAI, Anthropic, Google, AWS Bedrock, Snowflake Cortex AI, AgentCore
  • Retrieval & Vector DBs: OpenSearch, Pinecone, pgvector, Weaviate, GraphDBs (Neo4j, Neptune)
  • Serving Stack: vLLM, TGI, Ray Serve
  • LLMOps: Dataiku, MLflow, LangSmith, Weights & Biases
  • Safety: Azure AI Content Safety, Guardrails, NeMo Guardrails
  • Agents: LangGraph, AutoGen, CrewAI

Skills

AutoGen, AWS Bedrock, Azure AI Content Safety, Azure OpenAI, CrewAI, Dataiku, Google, GraphDBs, Guardrails, LangChain, LangGraph, LangSmith, LLMOps, MLflow, Microsoft Agent Framework, Neo4j, Neptune, OpenSearch, Pinecone, pgvector, RAG, Ray Serve, Snowflake Cortex AI, TGI, vLLM, Weaviate, Weights & Biases
