Backend / Platform Engineer (Linux & Security)

Tekskills Inc.

New York · On-site · Full-time · Posted 2d ago

About the role

Role

Backend / Platform Engineer (Linux & Security)

Location

NYC, NY

Duration

12+ months

Project Overview

The Compute Access Platform team is responsible for securing access to a wide array of Bloomberg's compute resources. This includes building and operating solutions for both interactive and non-interactive engineer access, as well as managing secure inter-service communication. These solutions span diverse environments such as Unix, Windows, Appliances, and Network Routers.
This project encompasses maintenance, enhancement, support, validation, testing, continuous integration, monitoring, configuration, and bug‑fix activities for the assigned application areas throughout the defined project period.

Key initiatives include

  • Hardware refresh, including refactoring of the services running on the refreshed hardware
  • Refactoring monolithic applications into functional services deployed to dedicated clusters
  • Internal audit remediation
  • Enhanced restricted interactive shell containers

Project Scope

Secure SUDO Rules Delivery

  • Technologies: Python, Linux
  • Secure the sudo rules endpoint with an OAuth token or an equivalent mechanism.
  • Build usage monitoring and self‑service deletion capabilities.
  • Timeframe: Q2 2026 – Q4 2026
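
For illustration, the token gate in front of the sudo rules endpoint could look roughly like the sketch below. It is a minimal example under stated assumptions, not the team's implementation: the function names and the `valid_tokens` set are hypothetical, and a real OAuth deployment would validate the token's signature, expiry, audience, and scope rather than compare against a static set.

```python
import hmac
from typing import Iterable, Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, else None."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.strip().partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def is_authorized(authorization_header: Optional[str],
                  valid_tokens: Iterable[str]) -> bool:
    """Accept the request only if it carries a token from the allowed set.

    valid_tokens is a hypothetical stand-in for real OAuth validation.
    """
    token = extract_bearer_token(authorization_header)
    if token is None:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return any(
        hmac.compare_digest(token.encode(), valid.encode())
        for valid in valid_tokens
    )
```

A request with a missing header, a non-Bearer scheme, or an unknown token is rejected before any sudo rules are served.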

SUDO Rule Migration and Recertification

  • Technologies: Python, Linux, INIT
  • Migrate existing rules into the target system with recertification.
  • Deliverable: Existing rules migrated to the target system.
  • Timeframe: Q1 2027 – Q2 2027

BSHELL Hardware Refresh

  • Technologies: Python, Load Balancer Concepts, REST Services, MySQL
  • Refactor code and develop services.
  • Build enhanced gateway shell for BSHELL.
  • Set up new infrastructure.
  • Open required connectivity and deploy using staged rollout via Chef.
  • Timeframe: Q2 2026 – Q4 2026

BAMGW Hardware Refresh and Gateway Shell Rearchitecture

  • BAMGW is used as a jump server by PRQS PW, CP, and CT to access the appliance fleet. It is tagged for ESX migration and serves as critical infrastructure for appliance access.
  • Activities:
    • Infrastructure setup
    • Connectivity establishment
    • New gateway shell development
    • Traffic enablement
  • Timeframe: Q2 2026 – Q2 2027

NMSGW Hardware Refresh and Rearchitecture (getrouterwin)

  • The NMSGW gateway is currently unmanaged and powers the getrouterwin functionality used by Network Engineering.
  • Activities:
    • Infrastructure setup
    • Connectivity establishment
    • New gateway shell development
    • Traffic enablement
  • Timeframe: Q1 2027 – Q3 2027

SUDO Rule Migration and Recertification (Service Rewrite)

  • Technologies: Python
  • Rewrite the existing Python service that creates compute objects in Active Directory when cluster changes occur (new or modified clusters).
  • Timeframe: Q4 2026 – Q2 2027

OP1 Containers

  • OP1 containers are configured to block write access to the host file system. They also allow safe deployment of additional debugging tools that are restricted or unsafe in the current rbash OP1 environment.
  • Activities:
    • Enhance the proof‑of‑concept into a production‑grade restricted shell/container.
    • Gradual rollout with monitoring and user feedback.
    • Fleet‑wide rollout.
  • Timeframe: Q3 2026 – Q2 2027

Internal Audit Remediation

  • Technologies: Python, Go, Chef (Ruby), INIT, Campaign
  • Objectives:
    • Remove persistent access to production infrastructure and replace it with on‑demand access.
    • Design and build monitoring and certification for high‑risk production access.
    • Enforce default cluster restrictions for production Windows.
    • Build reporting capabilities with clear certification paths.
  • Timeframe: Q2 2026 – Q4 2027
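
As a rough illustration of the difference between persistent and on‑demand access, the sketch below models a grant that expires on its own. All names (`AccessGrant`, `grant_access`, the `ttl` parameter) are hypothetical and chosen for the example; the posting does not describe the actual design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """A time-boxed grant; all field names are hypothetical."""
    user: str
    cluster: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # Access lapses on its own once the expiry time is reached.
        return now < self.expires_at


def grant_access(user: str, cluster: str, now: datetime,
                 ttl: timedelta) -> AccessGrant:
    """Issue a grant that is valid for ttl, starting at now."""
    if ttl <= timedelta(0):
        raise ValueError("ttl must be positive")
    return AccessGrant(user=user, cluster=cluster, expires_at=now + ttl)


# Example: a two-hour grant requested at a fixed point in time.
requested_at = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
grant = grant_access("alice", "prod-win-01", requested_at, timedelta(hours=2))
```

Because every grant carries an expiry, there is no standing entitlement to recertify later; monitoring and reporting can work from the list of issued grants.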

Internal Audit Remediation – PRQS PW Support for Windows

  • Technologies: Python, Teleport
  • Timeframe: Q3 2026 – Q1 2027

PRQS PW Migration to OPA

  • Technology: Go
  • Timeframe: Q3 2026 – Q2 2027

Teleport Expansion for Public Cloud Resources

  • Technologies: Python, Go
  • Enable cloud compute resource access through Teleport. Begin with AWS and design for extensibility across other cloud providers.
  • Activities:
    • Design
    • Proof of Concept
    • Implementation
  • Timeframe: Q1 2026 – Q4 2027

System Security Chef Recipe Refactoring

  • Technology: Chef (Ruby)
  • Activities:
    • Remove obsolete code
    • Move to MSE/applications cluster‑specific configurations
    • Enhance logging, monitoring, alerting, and dashboards for core Chef client services such as:
      • appssh
      • sudo
      • sshd
  • Timeframe: Q1 2026 – Q4 2026

INFR Integration for Post‑Decommission Cleanup

  • Technologies: Python, Go, Unix, Kafka
  • When a machine or cluster is decommissioned, remove its compute access artifacts, including appssh, sudo, and GPO. This prevents issues if another host is later created with the same name.
  • Activities:
    • Design and POC
    • Implementation
  • Timeframe: Q1 2027 – Q4 2027
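
The cleanup flow described above can be sketched as a function that turns a decommission event into a list of cleanup actions. This is a minimal sketch under assumptions: the event shape, the `plan_cleanup` name, and the action strings are all hypothetical, and in practice the event might arrive as a Kafka message rather than a plain dictionary.

```python
from typing import Dict, List

# Artifact types named in the posting; the action strings are hypothetical.
ARTIFACT_TYPES = ("appssh", "sudo", "GPO")


def plan_cleanup(event: Dict[str, str]) -> List[str]:
    """Turn a decommission event into an ordered list of cleanup actions.

    The event shape ({"type": ..., "name": ...}) is an assumption.
    """
    if event.get("type") != "decommissioned":
        return []  # Ignore unrelated lifecycle events.
    name = event.get("name", "").strip()
    if not name:
        raise ValueError("decommission event is missing a host or cluster name")
    return [f"remove {artifact} artifacts for {name}" for artifact in ARTIFACT_TYPES]
```

Keeping the planning step separate from execution makes it easy to log or dry-run the actions before anything is deleted.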

SOR for System Security Data

  • The goal of this initiative is to push system security machine attributes into SOR.
  • Attributes include:
    • AD domain membership status
    • Active Directory domain
    • SSSD or VASD version
    • Crypto policy
    • SSSD version
    • OpenSSH version
  • Activities:
    • Design
    • Implementation in partnership with SOR
  • Timeframe: Q2 2027 – Q4 2027

Teleport Resiliency and Stability Improvements

  • Develop automation to reduce complexity and improve the stability of Teleport deployments and configuration changes.
  • Activities:
    • Build Teleport beta cluster
    • Introduce additional labels to improve user experience
    • Develop a client utility for end compute devices to enhance user experience
    • Implementation will occur in phases.
  • Timeframe: Q3 2026 – Q4 2027

Required Skills

The project requires the following technical skills:

  • Middleware development (Go, Python, Nginx)
  • Web services development (SOAP, REST)
  • Programming across multiple stacks (Python, Go, Ruby preferred)
  • Distributed systems
  • Database systems (MySQL)
  • Distributed caching (Redis, etcd)
  • Automation, CI systems, and scripting
  • Linux OS
  • Troubleshooting, debugging, performance evaluation, and issue resolution

Prioritization

Project work will be prioritized and assigned to staff in 2‑week or 4‑week sprints.
Assignments will be based on:

  • Bloomberg DRQS tickets
  • JIRA cards
  • Technical specifications
  • Design specifications

Staff members are expected to submit effort estimates for each assignment.
Performance will be evaluated at the end of each sprint based on:

  • Timeliness
  • Code quality
  • Completeness

Skills

Chef, etcd, Go, INIT, Kafka, Linux, Load Balancer Concepts, MySQL, Nginx, OAuth, OpenSSH, OP1, Python, Redis, REST, Ruby, SOAP, SSSD, Teleport, Unix, VASD
