Cloud Engineer
Neurons Lab
About the role
About the project
Join Neurons Lab as a Cloud Engineer on a delivery engagement with a regulated EU BFSI enterprise (German-speaking client). The product is an AI / RAG-based enterprise productivity tool running in production across the client's internal teams. You will pick up a CDK-based codebase already deployed inside the client's AWS account, take over from the outgoing engineer, and own cloud delivery end-to-end: production hardening, security findings remediation, RAG infrastructure stability, and SSO/RBAC integration with the client's identity stack.
This is a pure delivery role on a live, customer-managed AWS environment. Data protection is the single most important constraint on every architectural and operational decision.
You will report to the AI Architect on the engagement and collaborate day to day with the AI Delivery Manager and the ML Engineer.
Areas of Responsibility
- Own and extend the existing AWS CDK codebase deployed inside the client's AWS account.
- Operate the production stack: ECS Fargate, ECR, ALB (public + internal), VPC, CDN, S3, AWS Bedrock.
- Run the data layer: Postgres, Redis, vector database (Qdrant or similar), LLM observability (Langfuse or similar).
- Triage and remediate AWS Security Hub / Health Dashboard findings independently — the client expects us to handle this end-to-end.
- Integrate SSO and RBAC with the client's identity stack.
- Keep the RAG stack reliable as additional pilot teams onboard; partner with the ML Engineer on retrieval-quality incidents.
- Own cost tracking and capacity planning for the client's Bedrock + ECS spend.
- Document CDK constructs, runbooks, and incident playbooks so handover to the next engineer takes days, not weeks.
Skills
- Advanced AWS CDK (primary) — must be able to extend an existing CDK codebase from day one, not just author from scratch.
- AWS Bedrock hands-on experience — model invocation patterns, IAM scoping, cost monitoring.
- ECS Fargate in production: task definitions, service auto-scaling, ALB target groups, blue/green or rolling deploys.
- Networking: VPC design, public/private ALB patterns, CloudFront, private subnet egress.
- RAG-stack ops: deploying and operating a vector database, Postgres (RDS/Aurora), Redis (ElastiCache), and an LLM observability layer on AWS.
- AWS Security Hub / Inspector / Health Dashboard — finding triage and remediation in restricted client environments.
- Python — FastAPI backends, MLOps automation, deployment glue.
- Identity & access: SSO (Okta / Azure AD / Cognito), RBAC, IAM least-privilege design.
- Terraform — secondary; useful for modules supplied by the client's IT team.
- Working in restricted client AWS accounts — limited permissions, async approvals, wiki/docs-portal handovers.
- Communication: clear written and verbal English. German is a strong plus, not required.
Knowledge
- AWS Certified Solutions Architect (Associate or Professional) or AWS Certified DevOps Engineer (Professional); one of these certifications is required.
- Working knowledge of AWS Well-Architected framework, especially Security and Reliability pillars applied to BFSI.
- Familiarity with EU AI Act obligations relevant to RAG / GenAI products.
- GDPR fundamentals as they apply to credentials, logs, and EU data residency.
Experience
- 5+ years in cloud / DevOps engineering, with 2+ years of hands-on AWS CDK in production.
- 2+ years operating AI/ML or GenAI workloads on AWS (Bedrock, SageMaker, or comparable).
- Direct experience deploying inside a regulated client's AWS account (BFSI, healthcare, government, or similar) — not just internal sandbox environments.
- Track record of stepping into an existing codebase mid-project and shipping within 1–2 weeks.
- Comfortable being the only Cloud Engineer on a small (3–4 person) delivery team.