Freelance Data Engineer / ML Engineer (Public Health Analytics)

Remote · India · Full-time · Senior · Today

About the role

As a Freelance Data Engineer / Machine Learning Engineer, you will build an end-to-end data pipeline and predictive analytics system for life expectancy modeling using public health and socio-economic data. You should have strong experience in data engineering, big data processing, and machine learning, and be comfortable working with real-world datasets to derive actionable insights.

Key Responsibilities

Data Engineering
- Build scalable data pipelines using Python, SQL, and Apache Spark
- Ingest data from APIs and public datasets (Census, healthcare, etc.)
- Design multi-layer architecture:
  - Bronze (raw data)
  - Silver (cleaned data)
  - Gold (aggregated/feature-ready data)
- Perform data transformation, joins, and aggregation at regional/community level
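
To illustrate the Bronze/Silver/Gold layering described above, here is a minimal sketch in plain Python. The sample rows, field names, and the helper functions `to_silver` and `to_gold` are hypothetical; in the actual role these steps would more likely run as Spark jobs over real datasets.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample rows).
bronze = [
    {"region": "North", "population": "1000", "deaths": "8"},
    {"region": "North", "population": "3000", "deaths": "20"},
    {"region": "South", "population": "2000", "deaths": None},  # missing value
    {"region": "South", "population": "2000", "deaths": "10"},
]


def to_silver(rows):
    """Silver: drop incomplete rows and cast string fields to integers."""
    cleaned = []
    for row in rows:
        if row["population"] is None or row["deaths"] is None:
            continue
        cleaned.append(
            {
                "region": row["region"],
                "population": int(row["population"]),
                "deaths": int(row["deaths"]),
            }
        )
    return cleaned


def to_gold(rows):
    """Gold: one entry per region with a crude mortality rate per 1,000 people."""
    totals = defaultdict(lambda: {"population": 0, "deaths": 0})
    for row in rows:
        totals[row["region"]]["population"] += row["population"]
        totals[row["region"]]["deaths"] += row["deaths"]
    return {
        region: {**t, "mortality_per_1000": 1000 * t["deaths"] / t["population"]}
        for region, t in totals.items()
    }


silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping the raw layer untouched means the cleaning and aggregation rules can be changed and replayed without re-ingesting the source data.
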
Feature Engineering
- Develop key health indicators such as:
  - Mortality rates
  - Poverty & unemployment rates
  - Healthcare provider density
  - Food accessibility metrics
- Build composite indices like:
  - Economic Hardship Index
  - Health Access Index
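
One common way to build a composite index is to standardize each indicator and average the results. The sketch below assumes equal weights and z-score standardization; the posting does not specify a method, and the community names, values, and the `composite_index` helper are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical indicator values for three communities (percentages).
indicators = {
    "A": {"poverty_rate": 10.0, "unemployment_rate": 4.0},
    "B": {"poverty_rate": 20.0, "unemployment_rate": 6.0},
    "C": {"poverty_rate": 30.0, "unemployment_rate": 8.0},
}


def z_scores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def composite_index(data, columns):
    """Average the z-scores of the chosen columns (equal weights assumed)."""
    names = list(data)
    per_column = [z_scores([data[n][c] for n in names]) for c in columns]
    return {n: mean(col[i] for col in per_column) for i, n in enumerate(names)}


hardship = composite_index(indicators, ["poverty_rate", "unemployment_rate"])
```

A higher value here means more hardship relative to the other communities. Indicators where a higher value is better, such as provider density, would need their sign flipped before averaging.
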
Machine Learning
- Develop predictive models using scikit-learn (Random Forest, Regression)
- Evaluate models using:
  - R² score
  - RMSE
- Perform feature importance analysis to identify key drivers of life expectancy
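
For reference, the two evaluation metrics can be computed by hand as below. This is a dependency-free sketch with hypothetical life expectancy values; in practice `sklearn.metrics` provides `r2_score` and `mean_squared_error` for the same purpose.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error, in years."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual vs. predicted life expectancy (years) for four regions.
actual = [70.0, 75.0, 80.0, 85.0]
predicted = [72.0, 74.0, 79.0, 86.0]

model_rmse = rmse(actual, predicted)
model_r2 = r_squared(actual, predicted)
```

RMSE is expressed in the target's own units, while R² is the share of variance in the target that the model explains, so the two are usually reported together.
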
Simulation & Insights
- Build an interactive Life Expectancy Simulator
- Enable scenario-based analysis (e.g., impact of poverty reduction, healthcare improvements)
- Provide recommendations for policy and intervention strategies
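
Scenario-based analysis usually means changing one or more inputs and comparing the model's prediction against a baseline. The sketch below uses a toy linear model whose coefficients and intercept are hypothetical placeholders, not fitted results; a real simulator would call the trained model instead.

```python
# Hypothetical coefficients for illustration only; not derived from any data.
COEFFICIENTS = {"poverty_rate": -0.2, "providers_per_10k": 0.1}
INTERCEPT = 80.0


def predict(features):
    """Toy linear model: intercept plus weighted sum of the features."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())


def simulate(baseline, changes):
    """Return the change in predicted life expectancy after applying the deltas."""
    scenario = dict(baseline)
    for name, delta in changes.items():
        scenario[name] = baseline[name] + delta
    return predict(scenario) - predict(baseline)


baseline = {"poverty_rate": 20.0, "providers_per_10k": 10.0}

# Scenario: poverty rate falls by 5 percentage points, everything else unchanged.
gain = simulate(baseline, {"poverty_rate": -5.0})
```

In a Streamlit app, the deltas passed to `simulate` could come from sliders, with the resulting difference shown next to the baseline prediction.
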
Visualization & Reporting
- Create dashboards using Power BI and Streamlit
- Develop geospatial visualizations using Folium
- Highlight disparities across communities and generate insights reports

Required Skills & Experience

Strong experience in:
- Python, SQL
- Data Engineering & ETL pipelines
- Big Data tools (Spark, Databricks)
Hands-on experience with:
- Machine Learning (scikit-learn)
- Feature engineering & model evaluation
Experience working with:
- Public datasets / APIs
- Data modeling & transformations
Good understanding of:
- Data pipelines (Bronze/Silver/Gold architecture)
- Statistical analysis and predictive modeling

Nice to Have

Experience in public health / healthcare analytics
Knowledge of geospatial data analysis
Experience building interactive dashboards or simulators
Exposure to cloud platforms (AWS / Azure / GCP)

Responsibilities

Build scalable data pipelines using Python, SQL, and Apache Spark
Ingest data from APIs and public datasets (Census, healthcare, etc.)
Design multi‑layer architecture (Bronze, Silver, Gold) and perform data transformation, joins, and aggregation at regional/community level
Develop key health indicators (mortality rates, poverty & unemployment rates, healthcare provider density, food accessibility metrics)
Build composite indices such as Economic Hardship Index and Health Access Index
Develop predictive models using scikit-learn (Random Forest, Regression) and evaluate them with R² and RMSE
Perform feature importance analysis to identify key drivers of life expectancy
Build an interactive Life Expectancy Simulator for scenario‑based analysis
Provide policy and intervention recommendations based on model insights
Create dashboards using Power BI and Streamlit and develop geospatial visualizations with Folium
Generate insight reports highlighting community disparities

Skills

Python, SQL, Data Engineering, ETL pipelines, Apache Spark, Databricks, scikit-learn, Machine Learning, Feature engineering, Model evaluation, Public datasets / APIs, Data modeling & transformations, Bronze/Silver/Gold data architecture, Statistical analysis, Predictive modeling
