Backend / Platform Engineer (Linux & Security)
Tekskills Inc.
About the role
Role
Backend / Platform Engineer (Linux & Security)
Location
NYC, NY
Duration
12+ months
Project Overview
The Compute Access Platform team is responsible for securing access to a wide array of Bloomberg's compute resources. This includes building and operating solutions for both interactive and non-interactive engineer access, as well as managing secure inter-service communication. These solutions span diverse environments such as Unix, Windows, Appliances, and Network Routers.
This project encompasses maintenance, enhancement, support, validation, testing, continuous integration, monitoring, configuration, and bug‑fix activities for the assigned application areas throughout the defined project period.
Key initiatives include:
- Hardware refresh, including refactoring of the services running on the refreshed hosts
- Refactoring monolithic applications into functional services deployed to dedicated clusters
- Internal audit remediation
- Enhanced restricted interactive shell containers
Project Scope
Secure SUDO Rules Delivery
- Technologies: Python, Linux
- Secure and protect sudo rules endpoint with OAuth token or equivalent.
- Build capabilities for monitoring, self‑service deletion, and usage tracking.
- Timeframe: Q2 2026 – Q4 2026
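As an illustration of the token-protected endpoint described above, here is a minimal sketch of a bearer-token check placed in front of a rules request. The function names, the header mapping, and the `validate_token` callback are hypothetical; a real deployment would delegate validation to the organization's OAuth provider.

```python
from typing import Callable, Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, else None.

    Assumes the header mapping uses the canonical 'Authorization' key.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def authorize_rules_request(
    headers: Mapping[str, str],
    validate_token: Callable[[str], bool],
) -> int:
    """Return an HTTP status code: 401 if the token is missing or rejected, else 200.

    validate_token is a hypothetical hook; it stands in for whatever
    introspection or signature check the OAuth provider offers.
    """
    token = extract_bearer_token(headers)
    if token is None or not validate_token(token):
        return 401
    return 200
```

Passing the validator in as a callback keeps the sketch independent of any particular token format.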
SUDO Rule Migration and Recertification
- Technologies: Python, Linux INIT
- Migrate existing rules into the target system with recertification.
- Deliverable: Existing rules migrated to the target system.
- Timeframe: Q1 2027 – Q2 2027
BSHELL Hardware Refresh
- Technologies: Python, Load Balancer Concepts, REST Services, MySQL
- Refactor code and develop services.
- Build enhanced gateway shell for BSHELL.
- Set up new infrastructure.
- Open required connectivity and deploy using staged rollout via Chef.
- Timeframe: Q2 2026 – Q4 2026
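A staged rollout like the one mentioned above can be planned by splitting the fleet into successively larger batches. The sketch below only computes the batches; it does not call Chef, and the function name and default fractions are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def plan_stages(
    hosts: Sequence[str],
    fractions: Sequence[float] = (0.1, 0.5, 1.0),
) -> List[List[str]]:
    """Split hosts into batches so each stage reaches a cumulative fraction.

    Hosts are sorted so the plan is deterministic. If the last fraction is
    below 1.0, the remaining hosts are left out of the plan.
    """
    ordered = sorted(hosts)
    stages: List[List[str]] = []
    done = 0
    for fraction in fractions:
        # Cumulative number of hosts that should be covered after this stage.
        target = min(len(ordered), math.ceil(len(ordered) * fraction))
        if target > done:
            stages.append(ordered[done:target])
            done = target
    return stages
```

Each batch would be deployed and observed before the next one starts, which is what limits the impact of a bad change.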
BAMGW Hardware Refresh and Gateway Shell Rearchitecture
- BAMGW is used as a jump server by PRQS PW, CP, and CT to access the appliance fleet. It is tagged for ESX migration and serves as critical infrastructure for appliance access.
- Activities:
  - Infrastructure setup
  - Connectivity establishment
  - New gateway shell development
  - Traffic enablement
- Timeframe: Q2 2026 – Q2 2027
NMSGW Hardware Refresh and Rearchitecture (getrouterwin)
- The NMSGW gateway is currently unmanaged and powers the getrouterwin functionality used by Network Engineering.
- Activities:
  - Infrastructure setup
  - Connectivity establishment
  - New gateway shell development
  - Traffic enablement
- Timeframe: Q1 2027 – Q3 2027
Sudo Rule Migration and Recertification (Service Rewrite)
- Technologies: Python
- Rewrite the existing Python service responsible for creating compute objects in Active Directory when cluster changes occur (new or modified clusters).
- Timeframe: Q4 2026 – Q2 2027
OP1 Containers
- OP1 containers are configured to block write access to the host file system. They also allow safe deployment of additional debugging tools that are restricted or unsafe in the current rbash OP1 environment.
- Activities:
  - Enhance the proof‑of‑concept into a production‑grade restricted shell/container.
  - Gradual rollout with monitoring and user feedback.
  - Fleet‑wide rollout.
- Timeframe: Q3 2026 – Q2 2027
Internal Audit Remediation
- Technologies: Python, Go, Chef (Ruby), INIT, Campaign
- Objectives:
  - Remove persistent access to production infrastructure and replace it with on‑demand access.
  - Design and build monitoring and certification for high‑risk production access.
  - Enforce default cluster restrictions for production Windows.
  - Build reporting capabilities with clear certification paths.
- Timeframe: Q2 2026 – Q4 2027
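The shift from persistent to on‑demand access in the objectives above can be pictured as a grant that carries its own expiry. This is a minimal sketch only; the `AccessGrant` type, the `grant_on_demand` helper, and the four-hour default are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """A time-bound grant: access lapses on its own instead of persisting."""

    user: str
    cluster: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # The grant is valid strictly before its expiry time.
        return now < self.expires_at


def grant_on_demand(
    user: str,
    cluster: str,
    now: datetime,
    ttl: timedelta = timedelta(hours=4),  # hypothetical default lifetime
) -> AccessGrant:
    """Create a grant that expires ttl after the time it was requested."""
    return AccessGrant(user=user, cluster=cluster, expires_at=now + ttl)
```

Because every grant records who asked for which cluster and until when, the same records could feed the monitoring and certification reports.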
Internal Audit Remediation – PRQS PW Support for Windows
- Technologies: Python, Teleport
- Timeframe: Q3 2026 – Q1 2027
PRQS PW Migration to OPA
- Technology: Go
- Timeframe: Q3 2026 – Q2 2027
Teleport Expansion for Public Cloud Resources
- Technologies: Python, Go
- Enable cloud compute resource access through Teleport. Begin with AWS and design for extensibility across other cloud providers.
- Activities:
  - Design
  - Proof of Concept
  - Implementation
- Timeframe: Q1 2026 – Q4 2027
System Security Chef Recipe Refactoring
- Technology: Chef (Ruby)
- Activities:
  - Remove obsolete code
  - Move to MSE/applications cluster‑specific configurations
  - Enhance logging, monitoring, alerting, and dashboards for core Chef client services such as:
    - appssh
    - sudo
    - sshd
- Timeframe: Q1 2026 – Q4 2026
INFR Integration for Post‑Decommission Cleanup
- Technologies: Python, Go, Unix, Kafka
- When a machine or cluster is decommissioned, remove its compute access artifacts, including appssh, sudo, and GPO entries. This prevents issues if another host is later created with the same name.
- Activities:
  - Design and POC
  - Implementation
- Timeframe: Q1 2027 – Q4 2027
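The cleanup flow described above could take the shape of a handler that receives a decommission event and removes each artifact type for the affected host. In this sketch the JSON event layout, the `hostname` field, and the `removers` mapping are all assumptions; no Kafka client is used, and the message is simply treated as bytes.

```python
import json
from typing import Callable, Dict, List

# Artifact types named in the posting; the lowercase keys are illustrative.
ARTIFACT_TYPES = ("appssh", "sudo", "gpo")


def handle_decommission(
    message: bytes,
    removers: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Remove every access artifact for the host named in a decommission event.

    message is assumed to be a JSON object with a 'hostname' field.
    removers maps each artifact type to a hypothetical cleanup callable.
    Returns the artifact types that were processed, in order.
    """
    event = json.loads(message)
    host = event["hostname"]
    removed: List[str] = []
    for artifact in ARTIFACT_TYPES:
        removers[artifact](host)
        removed.append(artifact)
    return removed
```

Running all removals from one handler ensures that a hostname reused later starts with no inherited access.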
SOR for System Security Data
- The goal of this initiative is to push system security machine attributes into SOR.
- Attributes include:
  - AD domain membership status
  - Active Directory domain
  - SSSD or VASD version
  - Crypto policy
  - SSSD version
  - OpenSSH version
- Activities:
  - Design
  - Implementation in partnership with SOR
- Timeframe: Q2 2027 – Q4 2027
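To make the attribute list above concrete, here is a sketch of a per-machine record plus a parser for one of its fields. The field names and the record layout are hypothetical; the parser assumes a version banner of the form printed by `ssh -V`, such as `OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class MachineSecurityAttributes:
    """Hypothetical shape of the record pushed into SOR for one machine."""

    hostname: str
    ad_joined: bool
    ad_domain: Optional[str]
    auth_daemon: str            # "sssd" or "vasd"
    auth_daemon_version: str
    crypto_policy: str
    openssh_version: Optional[str]


def parse_openssh_version(banner: str) -> Optional[str]:
    """Extract the version token that follows 'OpenSSH_' in a version banner."""
    match = re.search(r"OpenSSH_([\w.]+)", banner)
    return match.group(1) if match else None
```

`dataclasses.asdict` would turn such a record into a plain dictionary ready for serialization.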
Teleport Resiliency and Stability Improvements
- Develop automation to reduce complexity and improve the stability of Teleport deployments and configuration changes.
- Activities:
  - Build Teleport beta cluster
  - Introduce additional labels to improve user experience
  - Develop a client utility for end compute devices to enhance user experience
- Implementation will occur in phases.
- Timeframe: Q3 2026 – Q4 2027
Required Skills
The project requires the following technical skills:
- Middleware development (Go, Python, Nginx)
- Web services development (SOAP, REST)
- Programming across multiple stacks (Python, Go, Ruby preferred)
- Distributed systems
- Database systems (MySQL)
- Distributed caching (Redis, etcd)
- Automation, CI systems, and scripting
- Linux OS
- Troubleshooting, debugging, performance evaluation, and issue resolution
Prioritization
Project work will be prioritized and assigned to staff in 2‑week or 4‑week sprints.
Assignments will be based on:
- Bloomberg DRQS tickets
- JIRA cards
- Technical specifications
- Design specifications
Staff members are expected to submit effort estimates for each assignment.
Performance will be evaluated at the end of each sprint based on:
- Timeliness
- Code quality
- Completeness