Freelance Data Engineer / ML Engineer (Public Health Analytics)

LinkedIn

Remote · India · Full-time · Senior · Posted today

About the role

About

As a Freelance Data Engineer / Machine Learning Engineer, you will build an end-to-end data pipeline and predictive analytics system that models life expectancy from public health and socio-economic data. The role requires strong experience in data engineering, big data processing, and machine learning, applied to real-world datasets to derive actionable insights.

Key Responsibilities

  • Data Engineering
    • Build scalable data pipelines using Python, SQL, and Apache Spark
    • Ingest data from APIs and public datasets (Census, healthcare, etc.)
    • Design multi-layer architecture:
      • Bronze (raw data)
      • Silver (cleaned data)
      • Gold (aggregated/feature-ready data)
    • Perform data transformation, joins, and aggregation at regional/community level
  • Feature Engineering
    • Develop key health indicators such as:
      • Mortality rates
      • Poverty & unemployment rates
      • Healthcare provider density
      • Food accessibility metrics
    • Build composite indices like:
      • Economic Hardship Index
      • Health Access Index
  • Machine Learning
    • Develop predictive models using scikit-learn (Random Forest, Regression)
    • Evaluate models using:
      • R² score (coefficient of determination)
      • RMSE
    • Perform feature importance analysis to identify key drivers of life expectancy
  • Simulation & Insights
    • Build an interactive Life Expectancy Simulator
    • Enable scenario-based analysis (e.g., impact of poverty reduction, healthcare improvements)
    • Provide recommendations for policy and intervention strategies
  • Visualization & Reporting
    • Create dashboards using Power BI and Streamlit
    • Develop geospatial visualizations using Folium
    • Highlight disparities across communities and generate insights reports
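
For reference, the two evaluation metrics named under Machine Learning can be sketched in a few lines. This is a minimal illustration, assuming the posting means the R² score (coefficient of determination) and root mean squared error; the sample values are hypothetical, not real data. In practice `sklearn.metrics.r2_score` and `sklearn.metrics.mean_squared_error` cover the same calculations.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (years)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical life expectancy values in years (illustrative only).
actual = [78.0, 74.0, 81.0, 70.0]
predicted = [77.0, 75.0, 80.0, 72.0]

print(f"R2:   {r2_score(actual, predicted):.3f}")
print(f"RMSE: {rmse(actual, predicted):.3f} years")
```

R² reports the share of variance in life expectancy that the model explains, while RMSE reports the typical error in years, so the two are usually read together.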

Required Skills & Experience

  • Strong experience in:
    • Python, SQL
    • Data Engineering & ETL pipelines
    • Big Data tools (Spark, Databricks)
  • Hands-on experience with:
    • Machine Learning (scikit-learn)
    • Feature engineering & model evaluation
  • Experience working with:
    • Public datasets / APIs
    • Data modeling & transformations
  • Good understanding of:
    • Data pipelines (Bronze/Silver/Gold architecture)
    • Statistical analysis and predictive modeling

Nice to Have

  • Experience in public health / healthcare analytics
  • Knowledge of geospatial data analysis
  • Experience building interactive dashboards or simulators
  • Exposure to cloud platforms (AWS / Azure / GCP)

Responsibilities

  • Build scalable data pipelines using Python, SQL, and Apache Spark
  • Ingest data from APIs and public datasets (Census, healthcare, etc.)
  • Design multi‑layer architecture (Bronze, Silver, Gold) and perform data transformation, joins, and aggregation at regional/community level
  • Develop key health indicators (mortality rates, poverty & unemployment rates, healthcare provider density, food accessibility metrics)
  • Build composite indices such as Economic Hardship Index and Health Access Index
  • Develop predictive models using scikit‑learn (Random Forest, Regression) and evaluate them with R² score and RMSE
  • Perform feature importance analysis to identify key drivers of life expectancy
  • Build an interactive Life Expectancy Simulator for scenario‑based analysis
  • Provide policy and intervention recommendations based on model insights
  • Create dashboards using Power BI and Streamlit and develop geospatial visualizations with Folium
  • Generate insight reports highlighting community disparities
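
The posting does not define how the composite indices are built. One common approach, shown here purely as an assumption, is to standardize each indicator across communities and take an equal-weight average. The function names, indicator names, and values below are all hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant indicator carries no information about differences.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_index(indicators):
    """Equal-weight average of z-scored indicators, one value per community.

    `indicators` maps an indicator name to a list of values, with every list
    ordered by the same communities. Equal weighting is an assumption.
    """
    standardized = [z_scores(vals) for vals in indicators.values()]
    return [mean(column) for column in zip(*standardized)]


# Hypothetical rates (percent) for three communities, illustrative only.
communities = ["A", "B", "C"]
hardship = composite_index({
    "poverty_rate": [10.0, 20.0, 30.0],
    "unemployment_rate": [4.0, 6.0, 8.0],
})

for name, score in zip(communities, hardship):
    print(f"{name}: {score:+.2f}")
```

Higher scores here mean more hardship relative to the other communities. Indicators where a higher value is better, such as provider density, would need their sign flipped before averaging.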

Skills

  • Python
  • SQL
  • Data Engineering
  • ETL pipelines
  • Apache Spark
  • Databricks
  • scikit‑learn
  • Machine Learning
  • Feature engineering
  • Model evaluation
  • Public datasets / APIs
  • Data modeling & transformations
  • Bronze/Silver/Gold data architecture
  • Statistical analysis
  • Predictive modeling