Data Engineer - Hybrid (Mumbai / Pune / Bangalore)

AgileEngine

Remote (Global) Full-time 3mo ago

About the role

I have 2 exciting Senior Data Engineer openings with a globally strategic data modernisation programme at one of the world's leading investment intelligence firms.

Both are hybrid roles based in Mumbai / Pune / Bangalore and require 6-8 years of experience.

Please read both JDs carefully and let me know which one aligns with your experience. Do not proceed if you don't have the relevant hands-on skills — these are highly specific roles.

---

🔷 Position 1: Microsoft / Azure / Fabric Stack

For engineers with hands-on experience in Microsoft Fabric (OneLake, Fabric Data Factory, Delta Lake) and Azure cloud technologies. Strong Python and SQL required. Financial data experience is a strong plus.

🔷 Position 2: Google Cloud Platform Stack

For engineers with hands-on BigQuery, Cloud Composer (Airflow), and Dataproc (Spark) experience. Strong Python and SQL required. Financial data experience is a strong plus.

---

Both roles offer high ownership, global exposure and the opportunity to work on cutting-edge data platform infrastructure.

If your experience aligns, please share:

1. Which position suits you and why
2. Email ID
3. Relevant Experience
4. CCTC / ECTC
5. Notice Period

⚠️ Please apply only if your hands-on experience directly matches the stack mentioned. Generic data engineering profiles without the specific cloud platform experience will not be considered.

---

# DETAILED JOB DESCRIPTIONS

---

# 🔥 Position 1: Data Engineer (Senior)

## Microsoft / Azure / Fabric Stack

### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years

🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Microsoft Fabric

We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on Microsoft Fabric as part of a platform that powers investment decision tools used across the globe.

This is a high-ownership, high-impact role, not just another pipeline job.

---

✅ Must-Have Skills:

• 6-8 years of hands-on data engineering experience
• Strong Python programming: pipelines, transformation logic and automation
• Proficient in SQL: window functions, partitioning and time-series query patterns
• Hands-on experience with Microsoft Fabric: OneLake, Fabric Data Factory, Lakehouse and Warehouse
• Working knowledge of Delta Lake: incremental merges, Z-ordering and Change Data Feed
• Familiarity with Azure cloud technologies: ADF, Azure SQL, Key Vault and RBAC
• REST API experience: consuming external vendor APIs and building service integrations
• Git-based collaboration: branching strategies, PR workflows and pipeline-as-code
• AI-assisted development tools: GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills: you'll work with global cross-functional teams across engineering, compliance and business

💼 Key Responsibilities:

• Build and maintain scalable distributed data pipelines on Microsoft Fabric, including OneLake lakehouse layers and Delta Lake merge workflows
• Design and implement bitemporal data models to support certified regulatory-grade time-series datasets
• Build and maintain software testing frameworks (unit, non-regression and user acceptance) for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data
• Support AI solution integration, including AI-assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services: this is a platform role, not a vertical-specific one

➕ Good to Have:

• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Microsoft Purview for data lineage, cataloguing and sensitivity classification
• Understanding of bitemporal data modelling for financial and regulatory datasets
• Knowledge of financial reference data: equities, fixed income, corporate actions or index composition
• Exposure to CI/CD pipelines and automated environment provisioning
• Experience with LLMs and Agentic AI (anomaly detection, semantic search or natural language querying over structured data) is a strong plus!

---

📋 Quick Check Before You Apply:

6-8 years in data engineering with strong Python, SQL, and hands-on Microsoft Fabric exposure — specifically OneLake, Fabric Data Factory, and Delta Lake? Comfortable with Azure and financial data at scale? Yes to all — apply. No Fabric experience? This one's not for you.

---

⚠️ IMPORTANT: Please ensure ALL of the following are explicitly mentioned in your resume before applying:

• Microsoft Fabric: OneLake, Fabric Data Factory, Lakehouse, Warehouse
• Delta Lake: incremental merges, Z-ordering, Change Data Feed
• Python: data pipeline development and transformation logic
• SQL: window functions, partitioning, time-series patterns
• Azure technologies: ADF, Azure SQL, Key Vault, RBAC
• Git-based workflows
• AI-assisted development tools

Resumes that do not clearly reflect these skills will not be shortlisted.

---

📩 Interested candidates, please share:

1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period

---

# 🔥 Position 2: Data Engineer (Senior)

## Google Cloud Platform Stack

### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years

🚀 Hybrid Opportunity | 6-8 Years Experience | Financial Data & Google Cloud Platform

We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on GCP as part of a platform that powers investment decision tools used across the globe.

This is a high-ownership, high-impact role, not just another pipeline job.

---

✅ Must-Have Skills:

• 6-8 years of hands-on data engineering experience
• Strong Python programming: pipelines, transformation logic and automation
• Proficient in SQL with strong hands-on BigQuery experience: partitioning, clustering, materialised views and time-series query patterns at scale
• Hands-on experience with Cloud Composer (Apache Airflow): DAG authoring, SLA alerting, retry logic and dependency management
• Working knowledge of Dataproc (Apache Spark): batch ingestion, Delta Lake merge operations and incremental data processing
• Familiarity with GCP technologies: Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM and VPC Service Controls
• REST API experience: consuming external vendor APIs and building service integrations
• Git-based collaboration: branching strategies, PR workflows and pipeline-as-code
• AI-assisted development tools: GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills: you'll work with global cross-functional teams across engineering, compliance and business

💼 Key Responsibilities:

• Build and maintain scalable distributed data pipelines on GCP, including BigQuery-based lakehouse layers and Dataproc-driven Delta Lake workflows
• Design and implement bitemporal data models on BigQuery to support certified regulatory-grade time-series datasets
• Build and maintain software testing frameworks (unit, non-regression and user acceptance) for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data through the OMDP data factory
• Support AI solution integration using Vertex AI, including AI-assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services: this is a platform role, not a vertical-specific one

➕ Good to Have:

• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Dataplex for data discovery, lineage, policy tagging and data quality rule management
• Understanding of Change Data Capture patterns using Datastream for replicating transactional data into BigQuery
• Understanding of bitemporal data modelling concepts within BigQuery's append-optimised design
• Knowledge of financial reference data: equities, fixed income, corporate actions or index composition
• BigQuery cost management: slot reservations, query cost controls and workload isolation
• Exposure to CI/CD pipelines and infrastructure as code using Terraform for GCP deployments
• Prior experience with LLMs and Agentic AI using Vertex AI (anomaly detection, semantic search or natural language querying over structured data) is a strong plus!

---

📋 Quick Check Before You Apply:

6-8 years in data engineering with strong Python, SQL, and hands-on GCP experience — specifically BigQuery, Cloud Composer, and Dataproc? Comfortable working with large volumes of financial data in a global cross-functional environment? Yes to all — apply. No GCP or BigQuery hands-on experience? This one's not for you.

---

⚠️ IMPORTANT: Please ensure ALL of the following are explicitly mentioned in your resume before applying:

• GCP: Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM, VPC Service Controls
• BigQuery: partitioning, clustering, materialised views, time-series query patterns
• Cloud Composer (Apache Airflow): DAG authoring, SLA alerting, retry logic
• Dataproc (Apache Spark): batch ingestion, Delta Lake merge operations
• Python: data pipeline development and transformation logic
• SQL: advanced query patterns at scale
• Git-based workflows
• AI-assisted development tools

Resumes that do not clearly reflect these skills will not be shortlisted.

---

📩 Interested candidates, please share:

1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period

⚠️ Please apply only if your experience aligns with the requirements. Candidates with GCP and financial data experience will be prioritised.
