Core L3 Engineer
AceStack
Montreal · Hybrid · Full-time · Senior · 1mo ago
About the role
Role - Core L3 Engineer
Location - Montreal, QC (3 days in office)
Experience - 8+ Years
Type - Full Time Permanent
Requirements
- Advanced Linux / Unix support experience required.
- Strong shell scripting and Python programming skills for SRE-related activities required.
- Experience using Splunk or the Grafana/Prometheus/Loki stack required, preferably both.
- General understanding of Veritas Cluster Server, load balancers, and VMware required.
- Knowledge of ITIL principles required.
- Effective oral and written communication skills, and interpersonal skills to work well in a team environment required.
- Strong organizational and coordination skills with the ability to manage multiple tasks and high-pressure situations for outage handling, management, or resolution.
- Availability for weekend work required.
Highly Desired:
- Experience in application support, code releases, and liaison with development teams highly desired.
- Experience with automation using Ansible playbooks highly desired.
- Experience with Ansible Automation Platform administration highly desired.
- Experience with Terraform, especially Terraform Enterprise, highly desired.
- Knowledge of Docker and Kubernetes/OpenShift highly desired.
Preferred:
- Experience with development toolchains such as Git, Bitbucket, and CI/CD tools preferred.
- Experience in Agile methodologies preferred.
- Good knowledge of JVMs and their garbage collection mechanisms preferred.
- Experience with relational databases preferred.
Experience Required
- 8+ years of experience as a Site Reliability Engineer or in a similar role, with hands-on experience supporting BI platforms and with VMware and load balancer engineering knowledge.
Roles & Responsibilities
- Manage and support a variety of applications developed in-house for purposes such as application management and coordination using Apache ZooKeeper, an API proxy, an automation platform built on Ansible Automation Platform, and Infrastructure as Code using Terraform.
- Act as the highest level of escalation and actively engage the engineering teams who develop the products and tooling to maintain service stability.
- This is a Level 3 support and SRE role with global responsibility for managing and supporting these middleware products, including on-call coverage to handle production escalations.
- Take part in day-to-day management of the infrastructure environment: troubleshooting with users and handling changes, incidents, escalations, and problem management.
- Work routinely with the engineering teams who developed these products to resolve problems and proactively automate operational and user processes to reduce toil and time to market.
Skills
Ansible, Ansible Automation Platform, Apache ZooKeeper, API Proxy, CI/CD, Docker, Git, Grafana, Kubernetes, Linux, Loki, OpenShift, Prometheus, Python, Splunk, Terraform, Unix, VMware