Cloudera Data Engineer
Midis Group
On-site
About the role
Job Title
Cloudera Data Engineer
Job Scope
We are seeking an experienced Cloudera Data Engineer with strong hands‑on expertise in Cloudera CDP, Hadoop, Spark, Hive, and HDFS. The ideal candidate will be responsible for building and maintaining big data pipelines, optimizing data workflows, and supporting production environments to ensure reliability and performance.
Main Duties and Responsibilities
- Design, build, and maintain big data pipelines using Cloudera CDP components.
- Develop and optimize ETL/ELT workflows using Spark, Hive, and Hadoop ecosystems.
- Manage and maintain HDFS storage, ensuring high availability and performance.
- Perform production support activities, including issue resolution, monitoring, and performance tuning.
- Work with data scientists, analysts, and other engineering teams to support data-driven initiatives.
- Support data governance, security, and compliance across big data environments.
- Troubleshoot cluster-level issues related to resource utilization and scheduling.
- Participate in solution design, architecture discussions, and technical reviews.
Technical Requirements
- Strong hands‑on experience with Cloudera CDP.
- Proficiency in Hadoop ecosystem: HDFS, YARN, MapReduce, Hive, Impala.
- Strong experience working with Apache Spark (PySpark or Scala preferred).
- Solid understanding of distributed systems and big data architecture.
- Experience with production support and troubleshooting in big data environments.
- Strong SQL skills and experience optimizing large‑scale queries.
- Familiarity with Linux, shell scripting, and automation tools.
- Experience with version control (Git) and CI/CD workflows is a plus.
Education
- Bachelor's degree in Computer Science or a related field.
Preferred Experience
- Experience with cloud platforms (AWS, Azure, GCP) for big data workloads.
- Exposure to Kafka, NiFi, Oozie, or similar data orchestration tools.
- Knowledge of Docker, Kubernetes, or containerized environments.
- Understanding of security, encryption, and data governance concepts.
Skills
Cloudera CDP, Docker, Git, GCP, Hadoop, HDFS, Hive, Impala, Kafka, Kubernetes, Linux, MapReduce, NiFi, Oozie, Spark, SQL, YARN