Freelance Data Engineer / ML Engineer (Public Health Analytics)

ThreatXIntel

India · On-site · Full-time · Posted today

About the role

Role Overview

We are looking for a highly skilled Freelance Data Engineer / Machine Learning Engineer to build an end-to-end data pipeline and predictive analytics system for life expectancy modeling using public health and socio-economic data.

The ideal candidate has strong experience in data engineering, big data processing, and machine learning, and can work with real-world datasets to derive actionable insights.

Key Responsibilities

Data Engineering

Build scalable data pipelines using Python, SQL, and Apache Spark
Ingest data from APIs and public datasets (Census, healthcare, etc.)
Design multi-layer architecture:
- Bronze (raw data)
- Silver (cleaned data)
- Gold (aggregated/feature-ready data)
Perform data transformation, joins, and aggregation at regional/community level

Feature Engineering

Develop key health indicators such as:
- Mortality rates
- Poverty & unemployment rates
- Healthcare provider density
- Food accessibility metrics
Build composite indices like:
- Economic Hardship Index
- Health Access Index
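As an illustration of what a composite index can look like, here is a minimal sketch that standardizes each indicator to a z-score and averages the scores per region. The posting does not specify how the Economic Hardship Index is built, so the equal-weight averaging, the function names, and the sample values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of numbers to mean 0 and unit (population) std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant indicator carries no information; score every region 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_index(indicators):
    """Average the z-scores of several indicators, region by region.

    `indicators` maps an indicator name to a list of values, one per region,
    all in the same region order. Every indicator must point the same way
    (here: higher means more hardship) for the average to be meaningful.
    """
    standardized = [z_scores(vals) for vals in indicators.values()]
    return [mean(region_scores) for region_scores in zip(*standardized)]


# Hypothetical values for three regions (illustrative, not real data).
hardship = composite_index({
    "poverty_rate": [10.0, 20.0, 30.0],
    "unemployment_rate": [4.0, 6.0, 8.0],
})
```

In this sketch a region with an index above 0 has more hardship than the average region in the dataset, and a region below 0 has less. A real index would likely need agreed weights and a documented treatment of missing values.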

Machine Learning

Develop predictive models using scikit-learn (Random Forest, Regression)
Evaluate models using:
- R² Score
- RMSE
Perform feature importance analysis to identify key drivers of life expectancy
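For reference, the two evaluation metrics named above can be written out in a few lines. This is a dependency-free sketch of what they compute, with hypothetical function names and made-up life expectancy values; in practice scikit-learn provides `sklearn.metrics.r2_score` and `sklearn.metrics.mean_squared_error` for the same purpose.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted life expectancy in years.
observed = [70.0, 75.0, 80.0]
predicted = [71.0, 75.0, 79.0]

model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

RMSE is expressed in the same unit as the target (years of life expectancy here), while R² is unitless, which is why the two are usually reported together.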

Simulation & Insights

Build an interactive Life Expectancy Simulator
Enable scenario-based analysis (e.g., impact of poverty reduction, healthcare improvements)
Provide recommendations for policy and intervention strategies

Visualization & Reporting

Create dashboards using Power BI and Streamlit
Develop geospatial visualizations using Folium
Highlight disparities across communities and generate insight reports

Required Skills & Experience

Strong experience in:
- Python, SQL
- Data Engineering & ETL pipelines
- Big Data tools (Spark, Databricks)
Hands-on experience with:
- Machine Learning (scikit-learn)
- Feature engineering & model evaluation
Experience working with:
- Public datasets / APIs
- Data modeling & transformations
Good understanding of:
- Data pipelines (Bronze/Silver/Gold architecture)
- Statistical analysis and predictive modeling

Nice to Have

Experience in public health / healthcare analytics
Knowledge of geospatial data analysis
Experience building interactive dashboards or simulators
Exposure to cloud platforms (AWS / Azure / GCP)

Skills

Apache Spark, AWS, Azure, Databricks, Folium, GCP, Python, Power BI, scikit-learn, SQL, Streamlit
