
Cloudera Data Engineer

Midis Group

On-site Today

About the role

Job Title

Cloudera Data Engineer

Job Scope

We are seeking an experienced Cloudera Data Engineer with strong hands‑on expertise in Cloudera CDP, Hadoop, Spark, Hive, and HDFS. The ideal candidate will be responsible for building and maintaining big data pipelines, optimizing data workflows, and supporting production environments to ensure reliability and performance.

Main Duties and Responsibilities

  • Design, build, and maintain big data pipelines using Cloudera CDP components.
  • Develop and optimize ETL/ELT workflows using Spark, Hive, and Hadoop ecosystems.
  • Manage and maintain HDFS storage, ensuring high availability and performance.
  • Perform production support activities, including issue resolution, monitoring, and performance tuning.
  • Work with data scientists, analysts, and other engineering teams to support data-driven initiatives.
  • Ensure data governance, security, and compliance across big data environments.
  • Troubleshoot cluster-level issues related to resource utilization, scheduling, and job failures.
  • Participate in solution design, architecture discussions, and technical reviews.
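
As a purely illustrative sketch of the pipeline work described above: a common step is filtering bad records and aggregating per key. Plain Python stands in here for what would normally be a PySpark `filter(...).groupBy(...).sum(...)` job, and the field names (`user`, `bytes`, `status`) are invented for the example.

```python
from collections import defaultdict

# Hypothetical raw events; in a real CDP pipeline these would be rows
# read from HDFS via Spark, not an in-memory list.
events = [
    {"user": "a", "bytes": 100, "status": "ok"},
    {"user": "b", "bytes": 250, "status": "error"},
    {"user": "a", "bytes": 50,  "status": "ok"},
]

def transform(rows):
    """Drop failed events and total bytes per user, mirroring a
    Spark-style filter + groupBy + sum aggregation step."""
    totals = defaultdict(int)
    for row in rows:
        if row["status"] == "ok":  # discard bad records
            totals[row["user"]] += row["bytes"]
    return dict(totals)

print(transform(events))  # {'a': 150}
```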

Technical Requirements

  • Strong hands‑on experience with Cloudera CDP.
  • Proficiency in Hadoop ecosystem: HDFS, YARN, MapReduce, Hive, Impala.
  • Strong experience working with Apache Spark (PySpark or Scala preferred).
  • Solid understanding of distributed systems and big data architecture.
  • Experience with production support and troubleshooting in big data environments.
  • Strong SQL skills and experience optimizing large‑scale queries.
  • Familiarity with Linux, shell scripting, and automation tools.
  • Experience with version control (Git) and CI/CD workflows is a plus.
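
The SQL-tuning requirement above can be illustrated with a minimal, hypothetical example: checking that a selective filter hits an index rather than a full scan. SQLite's `EXPLAIN QUERY PLAN` is used here only because it is self-contained; in practice the same reasoning would be applied to Hive or Impala query plans, and the schema below is invented.

```python
import sqlite3

# Toy table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, ts TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(i % 100, f"2024-01-{i % 28 + 1:02d}", i) for i in range(1000)],
)

query = "SELECT SUM(bytes) FROM events WHERE user_id = 7"

# Before indexing: the planner has no choice but a full table scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# After indexing user_id, the planner can seek straight to matching rows.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

print(plan_before[0][-1])  # e.g. "SCAN events"
print(plan_after[0][-1])   # e.g. "SEARCH events USING INDEX idx_events_user ..."
```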

Education

  • Bachelor's degree in Computer Science or a related field.

Preferred Experience

  • Experience with cloud platforms (AWS, Azure, GCP) for big data workloads.
  • Exposure to Kafka, NiFi, Oozie, or similar data orchestration tools.
  • Knowledge of Docker, Kubernetes, or containerized environments.
  • Understanding of security, encryption, and data governance concepts.

Skills

Cloudera CDP, Docker, Git, GCP, Hadoop, HDFS, Hive, Impala, Kafka, Kubernetes, Linux, MapReduce, NiFi, Oozie, Spark, SQL, YARN
