SRE Architect (Splunk | Linux | Python)

Data Nexus AI

New York · On-site · Contract · Senior · 3w ago

About the role

Key Responsibilities

  • Design, implement, and maintain enterprise observability solutions using Splunk Enterprise, including dashboards, alerts, and data ingestion pipelines
  • Develop and enhance monitoring frameworks for infrastructure, applications, and web platforms
  • Automate operational processes using Linux shell scripting and Python
  • Implement intelligent alerting strategies to reduce noise and improve incident response efficiency
  • Provide L3 production support for business-critical applications and infrastructure
  • Support cloud and containerized deployments across AWS and Kubernetes environments
  • Collaborate with engineering teams to standardize logging and telemetry practices
  • Drive root cause analysis, post-incident reviews, and continuous reliability improvements
  • Build operational runbooks, disaster recovery procedures, and service continuity plans
  • Integrate monitoring and deployment workflows with CI/CD tools such as Jenkins, Git, and TeamCity
  • Support database monitoring and performance analysis across SQL Server, Oracle, DB2, and MySQL platforms
  • Participate in ITIL-based change, incident, and problem management processes

Required Skills

  • Strong hands-on expertise in Splunk engineering, administration, and architecture
  • Advanced experience in Linux / Unix environments
  • Proficiency in Python, shell scripting, and automation frameworks
  • Experience with AWS cloud services and Kubernetes / Docker platforms
  • Knowledge of monitoring tools such as Nagios and custom observability solutions
  • Experience supporting high-availability web platforms and distributed systems
  • Strong troubleshooting and production incident management skills
  • Understanding of CI/CD pipelines and deployment automation
  • Familiarity with ITIL processes and service management tools like ServiceNow

Preferred Qualifications

  • Splunk certifications (Power User / Admin / Architect)
  • Experience building large-scale telemetry platforms
  • Background in financial services or high-transaction enterprise environments
  • Experience designing intelligent alerting and automated incident workflows

Experience Level

  • 15+ years in production engineering / SRE / observability roles
  • Prior experience supporting mission-critical enterprise systems

Skills

AWS, DB2, Docker, Git, Jenkins, Kubernetes, Linux, MySQL, Nagios, Oracle, Python, SQL Server, Splunk, TeamCity, Unix