Freelance Data Engineer / ML Engineer (Public Health Analytics)
About the role
As a Freelance Data Engineer / Machine Learning Engineer, you will build an end-to-end data pipeline and predictive analytics system that models life expectancy from public health and socio-economic data. The role requires strong experience in data engineering, big data processing, and machine learning, applied to real-world datasets to derive actionable insights.
Key Responsibilities
- Data Engineering
  - Build scalable data pipelines using Python, SQL, and Apache Spark
  - Ingest data from APIs and public datasets (Census, healthcare, etc.)
  - Design a multi-layer architecture:
    - Bronze (raw data)
    - Silver (cleaned data)
    - Gold (aggregated/feature-ready data)
  - Perform data transformation, joins, and aggregation at regional/community level
- Feature Engineering
  - Develop key health indicators such as:
    - Mortality rates
    - Poverty & unemployment rates
    - Healthcare provider density
    - Food accessibility metrics
  - Build composite indices such as:
    - Economic Hardship Index
    - Health Access Index
- Machine Learning
  - Develop predictive models using scikit-learn (Random Forest, Regression)
  - Evaluate models using:
    - R² score (coefficient of determination)
    - RMSE (root mean squared error)
  - Perform feature importance analysis to identify key drivers of life expectancy
- Simulation & Insights
  - Build an interactive Life Expectancy Simulator
  - Enable scenario-based analysis (e.g., impact of poverty reduction, healthcare improvements)
  - Provide recommendations for policy and intervention strategies
- Visualization & Reporting
  - Create dashboards using Power BI and Streamlit
  - Develop geospatial visualizations using Folium
  - Highlight disparities across communities and generate insights reports
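For candidates unsure what the two evaluation metrics under Machine Learning measure, here is a minimal sketch in plain Python. The function names and the sample life expectancy values are hypothetical and are not taken from the project. In practice the equivalent scikit-learn helpers (`r2_score`, `mean_squared_error`) would likely be used instead.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction error, in years."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(actual, predicted):
    """Coefficient of determination: share of variance the model explains."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical life expectancy values (years) for five communities.
actual = [70.0, 72.0, 74.0, 76.0, 78.0]
predicted = [71.0, 72.0, 73.0, 76.0, 79.0]

print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"R2:   {r2_score(actual, predicted):.3f}")
```

RMSE is expressed in the same unit as the target (years of life expectancy), while R² is unitless, which is one reason the two are commonly reported together.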
Required Skills & Experience
- Strong experience in:
  - Python, SQL
  - Data Engineering & ETL pipelines
  - Big Data tools (Spark, Databricks)
- Hands-on experience with:
  - Machine Learning (scikit-learn)
  - Feature engineering & model evaluation
- Experience working with:
  - Public datasets / APIs
  - Data modeling & transformations
- Good understanding of:
  - Data pipelines (Bronze/Silver/Gold architecture)
  - Statistical analysis and predictive modeling
Nice to Have
- Experience in public health / healthcare analytics
- Knowledge of geospatial data analysis
- Experience building interactive dashboards or simulators
- Exposure to cloud platforms (AWS / Azure / GCP)
Responsibilities
- Build scalable data pipelines using Python, SQL, and Apache Spark
- Ingest data from APIs and public datasets (Census, healthcare, etc.)
- Design multi‑layer architecture (Bronze, Silver, Gold) and perform data transformation, joins, and aggregation at regional/community level
- Develop key health indicators (mortality rates, poverty & unemployment rates, healthcare provider density, food accessibility metrics)
- Build composite indices such as Economic Hardship Index and Health Access Index
- Develop predictive models using scikit-learn (Random Forest, Regression) and evaluate them with R² score and RMSE
- Perform feature importance analysis to identify key drivers of life expectancy
- Build an interactive Life Expectancy Simulator for scenario‑based analysis
- Provide policy and intervention recommendations based on model insights
- Create dashboards using Power BI and Streamlit and develop geospatial visualizations with Folium
- Generate insight reports highlighting community disparities
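The posting does not define how the composite indices are calculated, so the following is only one plausible sketch of an Economic Hardship Index. The choice of indicators (poverty and unemployment rates), the min-max scaling, the equal weighting, and all function names and sample values are assumptions made for illustration.

```python
def min_max_scale(values):
    """Rescale values to the 0-1 range; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hardship_index(poverty_rates, unemployment_rates):
    """Equal-weight average of the scaled indicators on a 0-100 scale."""
    scaled_poverty = min_max_scale(poverty_rates)
    scaled_unemployment = min_max_scale(unemployment_rates)
    return [
        100.0 * (p + u) / 2.0
        for p, u in zip(scaled_poverty, scaled_unemployment)
    ]


# Hypothetical rates (percent) for three communities.
poverty = [10.0, 20.0, 30.0]
unemployment = [4.0, 6.0, 12.0]

index = hardship_index(poverty, unemployment)
print(index)
```

Scaling each indicator before averaging keeps one indicator with a wider numeric range from dominating the index. A real implementation would likely run on the Gold layer in Spark and might weight the indicators differently.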