Data Engineer - Hybrid (Mumbai / Pune / Bangalore)
AgileEngine
About the role
I have 2 exciting Senior Data Engineer openings with a globally strategic data modernisation programme at one of the world's leading investment intelligence firms.
Both are hybrid roles based in Mumbai / Pune / Bangalore and require 6-8 years of experience.
Please read both JDs carefully and let me know which one aligns with your experience. Do not proceed if you don't have the relevant hands-on skills - these are highly specific roles.
---
Position 1 - Microsoft / Azure / Fabric Stack
For engineers with hands-on experience in Microsoft Fabric (OneLake, Fabric Data Factory, Delta Lake) and Azure cloud technologies. Strong Python and SQL required. Financial data experience is a strong plus.
Position 2 - Google Cloud Platform Stack
For engineers with hands-on BigQuery, Cloud Composer (Airflow) and Dataproc (Spark) experience. Strong Python and SQL required. Financial data experience is a strong plus.
---
Both roles offer high ownership, global exposure and the opportunity to work on cutting-edge data platform infrastructure.
If your experience aligns, please share:
1. Which position suits you and why
2. Email ID
3. Relevant Experience
4. CCTC / ECTC
5. Notice Period
Please apply only if your hands-on experience directly matches the stack mentioned. Generic data engineering profiles without the specific cloud platform experience will not be considered.
--- ---
# DETAILED JOB DESCRIPTIONS
---
# Position 1 - Data Engineer (Senior)
## Microsoft / Azure / Fabric Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years
Hybrid Opportunity | 6-8 Years Experience | Financial Data & Microsoft Fabric
We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on Microsoft Fabric as part of a platform that powers investment decision tools used across the globe.
This is a high-ownership, high-impact role - not just another pipeline job.
---
Must-Have Skills:
• 6-8 years of hands-on data engineering experience
• Strong Python programming: pipelines, transformation logic and automation
• Proficient in SQL: window functions, partitioning and time-series query patterns
• Hands-on experience with Microsoft Fabric: OneLake, Fabric Data Factory, Lakehouse and Warehouse
• Working knowledge of Delta Lake: incremental merges, Z-ordering and Change Data Feed
• Familiarity with Azure cloud technologies: ADF, Azure SQL, Key Vault and RBAC
• REST API experience: consuming external vendor APIs and building service integrations
• Git-based collaboration: branching strategies, PR workflows and pipeline-as-code
• AI-assisted development tools: GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills: you'll work with global cross-functional teams across engineering, compliance and business
Key Responsibilities:
• Build and maintain scalable distributed data pipelines on Microsoft Fabric, including OneLake lakehouse layers and Delta Lake merge workflows
• Design and implement bitemporal data models to support certified, regulatory-grade time-series datasets
• Build and maintain software testing frameworks (unit, non-regression and user acceptance) for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data
• Support AI solution integration, including AI-assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services: this is a platform role, not a vertical-specific one
Good to Have:
• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Microsoft Purview for data lineage, cataloguing and sensitivity classification
• Understanding of bitemporal data modelling for financial and regulatory datasets
• Knowledge of financial reference data: equities, fixed income, corporate actions or index composition
• Exposure to CI/CD pipelines and automated environment provisioning
• Experience with LLMs and Agentic AI: anomaly detection, semantic search or natural language querying over structured data is a strong plus!
---
Quick Check Before You Apply:
6-8 years in data engineering with strong Python, SQL and hands-on Microsoft Fabric exposure, specifically OneLake, Fabric Data Factory and Delta Lake? Comfortable with Azure and financial data at scale? Yes to all: apply. No Fabric experience? This one's not for you.
---
IMPORTANT: Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• Microsoft Fabric: OneLake, Fabric Data Factory, Lakehouse, Warehouse
• Delta Lake: incremental merges, Z-ordering, Change Data Feed
• Python: data pipeline development and transformation logic
• SQL: window functions, partitioning, time-series patterns
• Azure technologies: ADF, Azure SQL, Key Vault, RBAC
• Git-based workflows
• AI-assisted development tools
Resumes that do not clearly reflect these skills will not be shortlisted.
---
Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period
--- ---
# Position 2 - Data Engineer (Senior)
## Google Cloud Platform Stack
### Mumbai / Pune / Bangalore | Hybrid | 6-8 Years
Hybrid Opportunity | 6-8 Years Experience | Financial Data & Google Cloud Platform
We're looking for a strong Data Engineer to join a globally strategic data modernisation programme at one of the world's leading investment intelligence firms. You'll design, build and maintain state-of-the-art data pipelines on GCP as part of a platform that powers investment decision tools used across the globe.
This is a high-ownership, high-impact role - not just another pipeline job.
---
Must-Have Skills:
• 6-8 years of hands-on data engineering experience
• Strong Python programming: pipelines, transformation logic and automation
• Proficient in SQL with strong hands-on BigQuery experience: partitioning, clustering, materialised views and time-series query patterns at scale
• Hands-on experience with Cloud Composer (Apache Airflow): DAG authoring, SLA alerting, retry logic and dependency management
• Working knowledge of Dataproc (Apache Spark): batch ingestion, Delta Lake merge operations and incremental data processing
• Familiarity with GCP technologies: Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM and VPC Service Controls
• REST API experience: consuming external vendor APIs and building service integrations
• Git-based collaboration: branching strategies, PR workflows and pipeline-as-code
• AI-assisted development tools: GitHub Copilot, Cursor or equivalent
• Strong sense of ownership across ingestion, QA, correction management and audit trails
• Excellent communication skills: you'll work with global cross-functional teams across engineering, compliance and business
Key Responsibilities:
• Build and maintain scalable distributed data pipelines on GCP, including BigQuery-based lakehouse layers and Dataproc-driven Delta Lake workflows
• Design and implement bitemporal data models on BigQuery to support certified, regulatory-grade time-series datasets
• Build and maintain software testing frameworks (unit, non-regression and user acceptance) for pipelines and transformation logic
• Acquire, normalise, transform and release large volumes of financial market data through the OMDP data factory
• Support AI solution integration using Vertex AI, including AI-assisted ingestion, anomaly detection and semantic search over the lakehouse
• Collaborate actively with stakeholders across data engineering, compliance and business teams globally
• Contribute to shared platform services: this is a platform role, not a vertical-specific one
Good to Have:
• Experience with pandas, PySpark or equivalent data manipulation libraries
• Familiarity with Dataplex for data discovery, lineage, policy tagging and data quality rule management
• Understanding of Change Data Capture patterns using Datastream for replicating transactional data into BigQuery
• Understanding of bitemporal data modelling concepts within BigQuery's append-optimised design
• Knowledge of financial reference data: equities, fixed income, corporate actions or index composition
• BigQuery cost management: slot reservations, query cost controls and workload isolation
• Exposure to CI/CD pipelines and infrastructure as code using Terraform for GCP deployments
• Prior experience with LLMs and Agentic AI using Vertex AI: anomaly detection, semantic search or natural language querying over structured data is a strong plus!
---
Quick Check Before You Apply:
6-8 years in data engineering with strong Python, SQL and hands-on GCP experience, specifically BigQuery, Cloud Composer and Dataproc? Comfortable working with large volumes of financial data in a global cross-functional environment? Yes to all: apply. No GCP or BigQuery hands-on experience? This one's not for you.
---
IMPORTANT: Please ensure ALL of the following are explicitly mentioned in your resume before applying:
• GCP: Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM, VPC Service Controls
• BigQuery: partitioning, clustering, materialised views, time-series query patterns
• Cloud Composer (Apache Airflow): DAG authoring, SLA alerting, retry logic
• Dataproc (Apache Spark): batch ingestion, Delta Lake merge operations
• Python: data pipeline development and transformation logic
• SQL: advanced query patterns at scale
• Git-based workflows
• AI-assisted development tools
Resumes that do not clearly reflect these skills will not be shortlisted.
---
Interested candidates, please share:
1. Email ID
2. Relevant Experience
3. CCTC / ECTC
4. Notice Period
Please apply only if your experience aligns with the requirements. Candidates with GCP and financial data experience will be prioritised.