Field Service Delivery Engineer – AI Infrastructure
SHI International Corp.
About Us
Since 1989, SHI International Corp. has helped organizations change the world through technology. We’ve grown every year since, and today we’re proud to be a $16 billion global provider of IT solutions and services. Over 17,000 organizations worldwide rely on SHI’s concierge approach to help them solve what’s next. The heartbeat of SHI is our employees – all 7,000 of them. If you join our team, you’ll enjoy:
- Our commitment to diversity, as the largest minority- and woman-owned enterprise in the U.S.
- Continuous professional growth and leadership opportunities.
- Health, wellness, and financial benefits to offer peace of mind to you and your family.
- World‑class facilities and the technology you need to thrive – in our offices or yours.
Job Summary
The Service Delivery Engineer – AI Infrastructure is a senior, hands‑on engineering role within SHI’s Integration Data Center Solutions (IDCS) team. This position focuses on the deployment, integration, and validation of enterprise‑grade AI infrastructure—including NVIDIA Base Pods and OEM systems from Dell, HPE, and Lenovo—across both the SHI Data Center Factory and customer data centers.
This engineer will work directly with servers, storage, networking, Kubernetes clusters, and NVIDIA AI systems, ensuring that racks are fully integrated, burned in, configured, and ready for production. The role requires deep experience with enterprise hardware, Linux systems, Python, and data center networking, combined with strong customer‑facing communication and the ability to operate independently in dynamic environments. When not traveling for onsite deployments, the engineer will report to the SHI Data Center Factory in Piscataway, NJ.
Role Description
- Lead the end‑to‑end deployment of AI infrastructure, including rack/stack, cabling validation, system bring‑up, firmware updates, burn‑in, and cluster configuration.
- Integrate and validate NVIDIA Base Pods and OEM AI systems, including Base Command Manager, Kubernetes installation, and GPU node readiness.
- Perform advanced system engineering tasks, such as BIOS/firmware updates, redundancy testing, stress testing, and network configuration.
- Collaborate with SHI build technicians to ensure factory‑level integration is complete before systems ship to customer sites.
- Stand up and validate systems onsite at customer data centers, ensuring a seamless handoff into their environment.
- Troubleshoot complex hardware, Linux, and networking issues across compute, storage, and switching layers.
- Document configurations, runbooks, and deployment steps, contributing to repeatable and scalable delivery processes.
- Engage directly with customers, providing clear communication, technical guidance, and a professional onsite presence.
- Manage priorities and timelines independently, especially during onsite deployments where ambiguity and rapid problem solving are common.
- Travel as needed to customer locations for hands‑on deployment and integration work.
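To make the factory-integration work above concrete, here is a minimal, hypothetical sketch of the kind of pre-ship validation scripting such a role involves: comparing inventoried component firmware versions against a target baseline before a rack leaves the factory. Component names and version strings are invented for illustration and are not from this posting.

```python
# Hypothetical sketch: flag firmware drift against a target baseline
# before a rack ships. Component names and versions are invented.

TARGET_BASELINE = {
    "bios": "2.19.1",
    "bmc": "6.10.30",
    "nic": "22.36.1010",
}

def find_firmware_drift(inventory: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return components whose installed version differs from the baseline.

    Maps component name -> (installed, expected).
    """
    drift = {}
    for component, expected in TARGET_BASELINE.items():
        installed = inventory.get(component, "missing")
        if installed != expected:
            drift[component] = (installed, expected)
    return drift

# One node's inventory, with an out-of-date BMC:
node = {"bios": "2.19.1", "bmc": "6.09.12", "nic": "22.36.1010"}
print(find_firmware_drift(node))  # {'bmc': ('6.09.12', '6.10.30')}
```

In practice the inventory would come from a management interface (e.g., a BMC query) rather than a literal dict, but the validate-against-baseline pattern is the same.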
Behaviors and Competencies
- Initiative: Can proactively seek out challenges, initiate projects, and contribute to innovative ideas.
- Critical Thinking: Can apply critical thinking skills to complex problems, identifying logical and illogical reasoning, and making strategic decisions.
- Problem‑Solving: Can proactively identify potential problems, initiate preventive measures, and propose and contribute to innovative solutions.
- Teamwork: Can lead a team effectively, facilitating cooperation, sharing information, and ensuring that all team members are able to contribute to their full potential.
- Self‑Motivation: Can proactively seek out challenges, initiate self‑development projects, and contribute to personal or professional innovative ideas.
- Communication: Can effectively communicate complex ideas and information to diverse audiences and can facilitate effective communication between others.
- Technical Troubleshooting: Can proactively seek out potential technical problems, initiate preventive measures, and contribute to innovative solutions.
- Documentation: Can develop comprehensive documentation standards, implement best practices, and ensure documentation supports operational efficiency.
- Time‑Management: Can consistently use time effectively, balance multiple tasks, and meet deadlines.
- Organization: Can effectively coordinate multiple projects, delegate tasks where appropriate, and employ advanced organizational tools and methods.
Skill Level Requirements
- The ability to write, debug, and maintain code in programming languages such as Python, R, or Java to support AI initiatives – Intermediate
- The ability to effectively utilize applications like Word, Excel, PowerPoint, and Outlook to enhance productivity and perform various tasks efficiently – Intermediate
- Ability to simplify and effectively communicate complex problems to stakeholders across various functions and levels – Intermediate
- Expertise in installing, maintaining, and troubleshooting BIOS firmware and device drivers to ensure system functionality and performance – Intermediate
- Understanding of deploying and managing mobile devices to ensure seamless operation and integration within an organization – Preferred, Intermediate
- Experience with installing, configuring, and maintaining Linux‑based operating systems – Preferred, Intermediate
Other Requirements
- A bachelor’s degree in Computer Science, Engineering, Data Science, or a related field.
- 5+ years’ experience with enterprise data center hardware (servers, storage, networking)
- Strong Linux administration (install, configure, troubleshoot)
- Proficient in Python scripting for automation and system validation
- In‑depth knowledge of L2/L3 networking, VLANs, routing, switches, and cabling
- Experience with firmware/BIOS updates, device drivers, and system bring‑up
- Hands‑on experience deploying Kubernetes or containerized environments
- Familiarity with NVIDIA GPU systems, orchestration tools (e.g., Base Command Manager)
- Excellent written and verbal communication skills for customer interactions
- Self‑driven, proactive, and able to work independently
- Quick learner, able to adapt to evolving AI infrastructure technologies
- Ability to lift/move up to 50 lbs and work in data center environments
- Ability to travel and work flexible hours, including weekends, as needed
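As a hypothetical illustration of the "Python scripting for automation and system validation" requirement above, the sketch below scans per-GPU temperature samples from a burn-in run and flags devices that exceeded a thermal threshold. The sample format and threshold are invented for the example.

```python
# Hypothetical burn-in check: flag GPUs whose temperature exceeded a
# threshold during a stress run. Sample format and limit are invented.

THRESHOLD_C = 85

def flag_hot_gpus(samples: list[tuple[str, int]], limit: int = THRESHOLD_C) -> list[str]:
    """Return sorted, de-duplicated GPU IDs with any sample above limit."""
    hot = {gpu for gpu, temp in samples if temp > limit}
    return sorted(hot)

# (gpu_id, temperature_celsius) samples collected during burn-in:
burn_in = [
    ("gpu0", 74), ("gpu1", 88), ("gpu2", 79),
    ("gpu0", 76), ("gpu1", 91), ("gpu3", 86),
]
print(flag_hot_gpus(burn_in))  # ['gpu1', 'gpu3']
```

A real validation script would pull these readings from a tool such as `nvidia-smi` instead of a hard-coded list, then gate the rack's sign-off on the result.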
Preferred Skills
- Experience deploying AI or HPC infrastructure in enterprise environments.
- Strong understanding of data center networking and security best practices.
- Experience with OEM enterprise hardware (Dell, HPE, Lenovo, Supermicro).
- Certifications such as Linux+, CCNA/CCNP, NVIDIA certifications, or data center‑focused credentials.
- Experience working with diverse teams and adapting to different customer cultures and environments.
Compensation and Benefits
The estimated annual pay range for this position is $90,000 – $150,000, which includes a base salary. The compensation for this position is dependent on job‑related knowledge, skills, experience, and market location and, therefore, will vary from individual to individual. Benefits may include, but are not limited to, medical, vision, dental, 401(k), and flexible spending.
Equal Employment Opportunity
M/F/Disability/Protected Veteran Status.