Platform Design & Engineering

IBM

San Jose · On-site · Full-time · Senior · 1mo ago

About the role

About IBM Software

At IBM Software, we transform client challenges into solutions, building the world’s leading AI-powered, cloud-native products that shape the future of business and society. Our legacy of innovation creates endless opportunities for IBMers to learn, grow, and make an impact on a global scale. Working in Software means joining a team fueled by curiosity and collaboration. You’ll work with diverse technologies, partners, and industries to design, develop, and deliver solutions that power digital transformation. With a culture that values innovation, growth, and continuous learning, IBM Software places you at the heart of IBM’s product and technology landscape. Here, you’ll have the tools and opportunities to advance your career while creating software that changes the world.

Your Role And Responsibilities

Platform Design & Engineering

Design, build, and maintain a scalable AI Platform that supports multiple engineering teams in delivering natural language conversation, RAG-based retrieval, and AI-driven data solutions.
Develop core platform services including LLM routing, model abstraction layers, prompt management, and inference orchestration across cloud and on-premise infrastructure.
Architect and implement RAG pipelines — including vector store integration, document ingestion, chunking strategies, and retrieval optimization — enabling teams to ground AI responses in enterprise data.
Build secure, governed data access patterns that allow AI agents and models to query complex structured and unstructured data sources safely and efficiently.

AI Agent & Agentic Framework Development

Engineer agentic capabilities including multi-step reasoning, tool use, and agent-to-agent (A2A) coordination patterns that empower downstream teams to deliver autonomous AI workflows.
Implement and maintain MCP (Model Context Protocol) server registrations, enabling standardized tool discovery and invocation across the platform ecosystem.
Contribute to the design of circuit breaking, retry logic, and guardrail mechanisms that ensure safe and reliable agentic behavior in production environments.

Platform Enablement & Developer Experience

Partner with engineering teams across the organization to understand their AI delivery needs and translate them into platform capabilities, SDKs, and reusable components.
Develop and maintain self-service tooling, APIs, and documentation that enable product engineers to integrate AI capabilities without deep platform expertise.
Establish and enforce platform engineering standards around security, observability, cost management, and AI governance to ensure responsible AI delivery at scale.

Data & AI Intelligence

Build and maintain AI-driven pipelines that process complex customer data to identify, surface, and deliver actionable business value through intelligent automation and insight generation.
Collaborate with data scientists to productionize models and analytical workflows, ensuring seamless integration with platform data infrastructure including data lakes, warehouses, and streaming systems.
Instrument platform telemetry and evaluation frameworks to measure AI system quality, latency, cost, and business impact across consuming teams.

Technical Leadership & Collaboration

Serve as a technical leader and trusted partner across principal engineers, staff engineers, and data science disciplines — driving alignment on platform architecture and engineering standards.
Participate in design reviews, threat modeling, and architectural decision-making, advocating for scalable, maintainable, and secure platform patterns.
Mentor mid-level engineers through code reviews, pairing sessions, and technical guidance, raising the engineering bar across the broader platform team.

Preferred Education

Master's Degree

Required Technical And Professional Expertise

5+ years of professional software development experience, with demonstrated depth in backend platform or infrastructure engineering, and proven experience designing and building distributed systems or platform-level services that serve multiple internal engineering teams.
Hands-on experience with large language model (LLM) integration, including prompt engineering, model API consumption, and managing inference pipelines in production.
Strong proficiency in Python and/or Java/Go, with demonstrated ability to engineer production-quality, maintainable, and well-tested code, and a solid understanding of RESTful API design, event-driven architecture, and asynchronous processing patterns as they apply to AI platform services.
Experience with major cloud platforms (AWS preferred) and the services relevant to AI/ML workloads — including managed compute, storage, and model serving infrastructure.
Experience working with AI orchestration frameworks such as LangChain, LangGraph, LlamaIndex, or equivalent agentic tooling.

Preferred Technical And Professional Experience

Experience with MCP (Model Context Protocol) or A2A (Agent-to-Agent) protocol design and implementation within multi-agent AI systems.
Hands-on experience with AWS Bedrock, Azure AI Foundry, or watsonx as a managed AI platform for model hosting, fine-tuning, or inference routing.
Familiarity with LiteLLM, OpenRouter, or similar LLM proxy/routing layers for abstracting multi-model inference across providers.
Experience with Snowflake, including Snowpark, Cortex AI features, or Time Travel, as part of a data platform or AI analytics workflow.
Background in IBM enterprise platforms including Apptio, Cloudability, or IBM ContextForge, with awareness of how AI augments financial and cloud cost management use cases.
Knowledge of AI governance, responsible AI practices, and security controls for AI systems — including data privacy, access control, and output guardrails.
Experience with observability tooling applied to AI systems — including LLM evaluation frameworks, token cost tracking, latency profiling, and quality metrics pipelines.
Exposure to AI compliance requirements (e.g., FIPS, SOC 2, FedRAMP) and how they shape platform architecture decisions in regulated enterprise environments.
Contributions to open-source AI tooling, published technical writing, or demonstrated thought leadership in the generative AI or ML platform space.
Experience building internal developer platforms (IDPs) or platform-as-product models where the primary customer is an internal engineering audience.

Skills

AWS, Go, Java, LangGraph, LangChain, LLM, LlamaIndex, Python, RAG
