Lead Data Engineer - PySpark, Redshift, Airflow, AWS

Zortech Solutions

Valhalla · On-site · Contract · Lead · Today

About the role

Candidates should have 12+ years of experience in data engineering and strong experience working in an onshore-offshore model.

Responsibilities

  • Designing, creating, testing and maintaining the complete data management & processing systems.
  • In-depth understanding of how data pipelines are built.
  • Familiarity with the typical challenges of fetching data from various sources.
  • Understanding of how incremental/CDC data flows are handled.
  • Ensuring data quality.
  • Performing data profiling.
  • Designing and documenting data models at various levels.
  • Working closely with the stakeholders.
  • Building highly scalable, robust & fault-tolerant systems.
  • Discovering data acquisition opportunities.
  • Finding ways to extract value from existing data.
  • Improving data quality, reliability & efficiency of the individual components & the complete system.

Requirements

  • Hands-on experience with PySpark, Redshift (SQL) and Airflow, at minimum
  • Strong hands-on command of the required technical skills, flexibility, and the right attitude to play the lead role
  • Knowledge of the Hadoop ecosystem and its frameworks: HDFS, YARN, MapReduce, Apache Pig, Hive, Flume, Sqoop, ZooKeeper, Oozie, Impala and Kafka
  • Experience with SQL-based technologies (e.g. MySQL, Oracle DB) and NoSQL technologies (e.g. Cassandra and MongoDB)
  • Python, Scala or Java programming skills
  • Problem-solving mindset and experience working in an agile environment

Skills

Airflow, Apache Pig, Cassandra, Flume, HDFS, Hive, Hadoop, Impala, Java, Kafka, MapReduce, MongoDB, MySQL, NoSQL, Oracle DB, Oozie, Python, PySpark, Redshift, Scala, SQL, Sqoop, YARN, ZooKeeper
