Cloud Engineer – AWS, Security, CI/CD & Infrastructure Ownership
Upwork
Type: Full-time, Long-term, Remote
No agencies — individual applicants only
In your application, please include your actual bid (monthly salary).
About the Role
We’re hiring a Cloud Engineer to own our cloud infrastructure end-to-end.
- This is not a support role and not a “just deploy things” position.
- You will be responsible for architecture, reliability, security, scalability, and cost efficiency across our entire infrastructure.
- You will also act as a technical bridge between internal systems and external partners, ensuring secure and reliable integrations.
- We are moving toward a more structured security posture aligned with ISO 27001, so systems, processes, and decisions should reflect auditability, least privilege, and operational discipline.
If systems go down, are misconfigured, insecure, or too expensive — you own it.
If infrastructure does not scale — you redesign it.
If alerting is noisy or missing — you fix it.
If integrations are fragile or unsafe — you standardize and secure them.
This is a fast‑moving startup environment with high expectations and real ownership.
Responsibilities
Cloud Ownership & Architecture
- Own and evolve the entire AWS infrastructure
- Design systems that are:
  - Scalable
  - Fault‑tolerant
  - Cost‑efficient
- Make and document infrastructure decisions and tradeoffs
- Continuously improve system design as scale increases
Infrastructure & Deployment
- Manage and operate EC2, RDS, S3, CloudFront, and VPC
- Implement and maintain Infrastructure as Code (Terraform preferred)
- Ensure environments are:
  - Reproducible
  - Consistent across staging / production
- Own deployment strategy and infrastructure lifecycle
CI/CD & Automation
- Design and maintain CI/CD pipelines
- Automate:
  - Deployments
  - Rollbacks
  - Environment provisioning
- Improve deployment reliability and speed
- Integrate testing, validation, and safeguards into pipelines
Monitoring, Alerting & Reliability
- Own system observability:
  - Metrics
  - Logs
  - Alerts
- Work with tools such as:
  - CloudWatch
  - Prometheus
  - Grafana
- Build dashboards that are:
  - Actionable (not vanity metrics)
  - Tied to real system health and business impact
- Ensure:
  - Fast detection of issues
  - Clear, high‑signal alerts (minimal noise)
- Lead incident response and post‑mortems
Networking
- Own AWS networking setup:
  - VPCs, subnets, routing
  - Load balancers
  - Security groups
- Ensure secure and efficient communication between services
- Debug networking issues in production environments
Security (Critical Responsibility)
- Enforce best‑in‑class security practices across infrastructure and development workflows
- Define and enforce:
  - IAM roles and policies (least privilege)
  - Secrets management standards
  - Encryption (at rest and in transit)
- Establish secure development patterns:
  - Credential handling
  - Access boundaries between services
  - Safe CI/CD practices
- Design systems with auditability and traceability in mind
- Align infrastructure and processes toward ISO 27001‑style practices:
  - Access control discipline
  - Change tracking
  - Clear ownership and documentation
- Identify and mitigate vulnerabilities proactively
- Continuously improve overall security posture
External Integrations & Technical Coordination
- Act as the technical point of contact for external partners and systems
- Work with external technical teams to:
  - Enable secure system‑to‑system communication
  - Define integration patterns (APIs, networking, auth)
  - Troubleshoot cross‑system issues
- Ensure integrations are:
  - Reliable
  - Secure
  - Observable
- Standardize how external systems connect to internal infrastructure
Cost Optimization
- Monitor and optimize AWS costs
- Identify inefficiencies and reduce waste
- Balance performance against cost
- Build systems that scale efficiently, not just functionally
Troubleshooting & Ownership
- Debug and resolve production issues
- Identify root causes rather than just patching symptoms
- Implement long‑term fixes and improvements
Required Skills (Non‑Negotiable)
- Strong hands‑on experience with AWS in production environments
- Proven experience owning infrastructure end‑to‑end
- Real‑world experience with Terraform (or equivalent IaC)
- Strong experience with:
  - CI/CD systems (GitHub Actions, GitLab CI, etc.)
  - Linux systems
- Solid understanding of:
  - Networking (VPC, routing, load balancing)
  - Security best practices (IAM, secrets, encryption)
- Experience with:
  - Monitoring / alerting systems
  - Production debugging
  - Working with external systems / APIs / integrations
- Python (or scripting) for automation and tooling
- Experience with Docker (Kubernetes/EKS is a plus)
- Strong ownership mindset
- Clear communication with internal and external technical stakeholders
- Comfortable in fast‑paced startup environments
Nice‑to‑Have
- Experience with:
  - Grafana (dashboards, alerting, observability design)
  - Multi‑account AWS setups
  - High‑scale or real‑time systems
  - Kubernetes / EKS
- Exposure to ISO 27001 or similar security/compliance frameworks
- Experience improving:
  - Infrastructure cost efficiency at scale
  - Security posture in production systems
🚀 This Role Is (and Isn’t)
✅ You own cloud infrastructure
✅ You design systems, not just deploy them
✅ You improve reliability, security, and cost efficiency
✅ You handle real‑world integrations with external systems
🚫 Not a junior role
🚫 Not a DevOps “ticket executor”
🚫 Not a role with constant hand‑holding
🕐 Availability
- 6 AM – 10 AM EST required
- Often extended until 12 PM EST
- Flexibility is essential
💰 Compensation
- Monthly salary (based on experience and ownership level)
- Performance‑ and profitability‑based bonuses possible
💬 If you’ve owned infrastructure, enforced strong security practices, and can operate with the discipline required for compliant systems, this role is for you.