Staff Software Engineer
Refinitiv
Overview of the Role
Advanced Content Engineering (ACE) is seeking a Staff Software Engineer to lead the design and delivery of the search platform’s control-plane API and cloud infrastructure. The platform’s core promise is self-service: internal client teams must be able to create a search system, configure an ingestion topology, promote a new index to production, and monitor system health entirely through APIs, without requiring direct involvement from the platform team. Building, operating, and continuously improving that self-service experience is the heart of this role.
This is a high-ownership, high-leverage position at the intersection of platform engineering, API design, and cloud infrastructure. Staff Engineers on this team define, build, test, deploy, scale, and operate what they ship; full-stack ownership is the baseline, not a bonus. Delivery friction is treated as an urgent engineering problem: the team ships to production constantly, AI-assisted development is the norm, and removing obstacles to fast, safe delivery is everyone’s responsibility.
The successful candidate brings enterprise-grade security instincts, deep AWS expertise, and a product-minded approach to developer experience, treating the platform’s API as a product in its own right.
About the Role
In this position, you will focus on:
Platform Control-Plane API
- Plan, design, develop, and own the platform’s management API: the self-service interface through which client teams create and configure search systems, manage ingestion topologies, register reusable components, promote index versions, and monitor system health. Expect to resolve problems of diverse scope, often with little or no precedent to guide the solution
- Architect the platform’s multi-tenant access model: implement strict data isolation between client tenants, integrate with enterprise identity providers, establish role-based access control across all API endpoints, and define the governance framework that ensures the platform can make credible security commitments to enterprise customers
- Establish API strategy and cross-system integration patterns — designing versioned, backward-compatible interfaces with clear contracts, comprehensive documentation, and developer-experience patterns drawn from best-in-class search platform providers — and set governance standards that the team follows for all future API surface
- Design and expose the API surface required to support the platform’s evaluation and experimentation workflows — including endpoints that enable the search grading tool to consume experiment run outputs, query/result pairs, and relevance judgments, and that allow client teams to configure and trigger A/B search experiments through self-service interfaces
- Design the configuration data model and persistence layer (DynamoDB and related services) that stores search system definitions, component registry entries, index lifecycle state, and audit logs — applying architectural patterns that scale to the platform’s multi-tenant and multi-region ambitions
- Break down complex business requirements into functional and technical requirements with consideration for security, ethical AI implementation, and operational efficiency; contribute to recommendations where technology transformation can spark business growth
Cloud Infrastructure & DevOps
- Own the platform’s AWS infrastructure as code — defining, provisioning, and maintaining ECS services, MSK clusters, OpenSearch/Vespa deployments, DynamoDB tables, networking (VPC, security groups, NAT), and IAM roles using Terraform or AWS CDK — establishing infrastructure governance standards and a cloud strategy for multi-environment and eventual multi-region operation
- Design and own the CI/CD pipeline for platform services — establishing DevOps culture and toolchain strategy for the team, with a clear mandate to eliminate delivery friction: the team ships to production constantly, and any obstacle to doing so safely is an engineering problem to be solved, not a process to be accepted
- Drive adoption of AI-assisted development practices across the team’s infrastructure and API work — establishing the tooling, patterns, and norms that enable engineers to leverage AI to move faster while maintaining the quality and reliability bar the platform demands
- Own infrastructure cost management: monitor AWS spend across platform components, evaluate architectural trade-offs at the system level, and implement an enterprise performance and optimization framework that keeps the platform’s economics sustainable as it scales — including compute cost governance for inference workloads as custom model serving is introduced
- Implement and operate customer-managed encryption key (CMK) support, applying security strategy, risk assessment frameworks, and security governance to give enterprise clients control over their encryption keys while preserving multi-tenant reliability
Reliability Engineering
- Define and own platform-level SLOs covering API availability, query latency, ingestion throughput, and end-to-end document freshness — and build the monitoring infrastructure (CloudWatch, distributed tracing, alerting) that makes SLO compliance continuously visible to the team and to client teams
- Design the observability infrastructure for agentic retrieval paths, where standard request/response logging is insufficient. Implement trace-level instrumentation that captures tool invocation sequences, per-hop latency, and retrieval inputs, enabling reliable diagnosis of failures and quality regressions in non-deterministic agent workflows
- Take full operational responsibility for platform API and infrastructure — you built it, you own it, you run it: triage and resolve incidents, write thorough post-mortems, and drive systematic improvements that prevent recurrence
- Design enterprise performance strategy for the platform’s API layer: load testing, capacity planning, performance profiling, and system-level optimization — ensuring the platform can handle planned growth in tenants, content volumes, and query traffic
- Embed security architecture throughout the platform’s infrastructure: least-privilege IAM, secrets management, encryption at rest and in transit, audit logging, and compliance implementation aligned with TR’s enterprise security requirements
Technical Leadership
- Establish architectural principles and cross-system design patterns for the platform’s control plane and infrastructure — functioning as the technical authority that other engineers and teams turn to for API and infrastructure guidance
- Lead significant projects and business initiatives that span multiple engineers and interact with partner teams; determine work priorities and make adjustments to short-term priorities while maintaining strategic focus; provide specialist advice to senior management on complex infrastructure and security issues
- Mentor and develop Senior and mid-level engineers — providing coaching, technical direction, and educational opportunities in cloud infrastructure, platform API design, reliability engineering, and AI-assisted development practices
- Engage with client teams as a technical partner — understanding their integration experience and pain points, feeding structured requirements back into the platform API roadmap, and proactively reducing time-to-value for new platform adopters
- Deliver effective presentations on complex infrastructure and security concepts to technical and non-technical stakeholders; champion ethical AI practices and responsible technology deployment across the team’s work
About You
You’re an ideal fit if you have:
Required Experience
- Bachelor’s or Master’s degree