Storage Solutions Engineer

IBM

Bengaluru · On‑site · Full‑time · Senior · Posted today

About the role

Introduction

IBM Infrastructure is a catalyst that makes the world work better because our clients demand it. Heterogeneous environments, the explosion of data, digital automation, and cybersecurity threats require hybrid cloud infrastructure that only IBM can provide. Your ability to be creative, to think ahead, and to focus on innovation that matters is supported by our growth‑minded culture as we continue to drive career development across our teams. Collaboration is key to IBM Infrastructure's success, as we bring together different business units and teams that balance their priorities in the way that best serves our clients' needs. IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

Mission

The mission of the IBM Storage Solutions team is to engage with strategic ISV and OEM partners that may advance the IBM Storage strategy to expand the marketability of the portfolio, create short- and long-term alignment of solutions roadmaps, and engage in collaborative GTM. The team will:

  • Engage with the entire IBM Storage brand (product managers, engineers, technologists, researchers, marketers, and sellers) and with their counterparts at our technology partners worldwide
  • Prioritize and lead strategic client engagements with solutions engineering expertise
  • Design, build, and operate a world‑class strategically‑aligned ISV program

Our primary intention is to realize measurable, material, incremental business for the Storage product team that is attributable to our joint in‑market solutions with technology partners.

This group consists of technology and business professionals who are intent on bringing to market impactful customer solutions based on the IBM Storage assets of today and the future. It will influence our product roadmaps and will evangelize everything we have to offer to our partners, our clients, and the worldwide market.

Your Role And Responsibilities

Storage Solutions Engineer - AI Infrastructure is a specialized role focused on architecting and deploying high‑performance data environments for large‑scale AI training and inference. You will design, build, and maintain high‑performance storage systems for AI/ML workloads, bridging storage technology with AI needs such as GPU clusters and data pipelines. The role calls for hands‑on skills in infrastructure automation, performance tuning, and cloud platforms, and for collaboration with data scientists to ensure scalable, secure, and reliable data flow for complex models.

Core Responsibilities

  • Architecture & Design: Lead end‑to‑end system design for distributed storage platforms tailored to AI/HPC workloads.
  • Performance Optimization: Maximize IOPS and throughput for multi‑node GPU clusters, ensuring storage systems can keep up with the demands of deep learning frameworks.
  • Infrastructure Automation: Build CI/CD and automation pipelines for provisioning and monitoring AI infrastructure using tools like Terraform, Ansible, and Kubernetes.
  • Solution Selection, Validation and Publishing: Evaluate and select next‑generation storage technologies, such as NVMe‑oF (IBM FlashSystem) and Ceph, to support petabyte‑scale data.
  • AIOps Integration: Build CI/CD pipelines, monitoring and orchestration (Kubernetes) for AI/ML workflows.
  • Pre‑Sales & Strategy: Collaborate with customers, partners and field teams to translate customer business requirements into technical Bill of Materials (BOMs) and reference architectures.
  • Collaboration: Work with data scientists, ML engineers, and cloud teams to gather requirements and deliver solutions.
  • Troubleshooting: Resolve complex issues hands‑on in distributed storage and AI environments, ensuring high availability.
  • Security & Compliance: Implement security best practices, access controls, and data encryption.

Preferred Education

  • Master's Degree

Required Technical And Professional Expertise

  • Strong expertise with NVIDIA DGX/HGX systems, GPUs, and DPUs (e.g., NVIDIA BlueField) for storage offloading.
  • Understanding of, and hands‑on experience with, SAN, NAS, and parallel file systems (IBM Storage preferred), alongside protocols like NFS, SMB, and S3.
  • Proficiency in Linux, networking, and storage systems.
  • Sound understanding of data modelling, governance, and security frameworks.
  • Experience deploying hybrid or multi‑cloud AI solutions on HCI, AWS, Azure, or GCP, and with virtualization technologies such as Kubernetes, RHOS, and VMware.
  • Automation/Scripting: Proficiency in Python, Bash, or Go for developing custom monitoring and management tools.
  • Excellent problem‑solving and communication skills.
  • Ability to apply AI‑driven tools for rapid problem‑solving, data‑based decisions, and productivity improvement.
  • Experience: Typically 8‑10+ years in systems engineering and/or storage architecture.

Preferred Technical And Professional Experience

  • Technical Asset Creation: Exposure to creating and delivering technical assets to demonstrate user experience and related offering value props, with a strong understanding of domain and business expertise.
  • Market Trend Analysis: Experience working with market trend analysis, identifying target markets and opportunities based on user problems, trends, and market dynamics.
  • Product Technology Benchmarking: Exposure to performing robust product technology assessments and benchmarks to ensure differentiation, with the ability to determine an acceptable scope and schedule for releases in the context of other business requirements and technical debt.

Skills

Ansible, AWS, Bash, Ceph, Cloud, Docker, GCP, Go, HCI, IBM FlashSystem, Kubernetes, Linux, ML, NAS, NFS, NVMe‑oF, NVIDIA BlueField, NVIDIA DGX, NVIDIA HGX, Python, Parallel File Systems, RHEL, SAN, SMB, S3, Terraform, VMware
