Data Engineer
Greater New York Insurance Companies
Position Summary
The Data Engineer, with previous experience in the insurance domain, will design, build, and maintain scalable data pipelines and models that power analytics and Generative AI initiatives. The Data Engineer will collaborate closely with analysts and engineers to understand data needs, develop efficient solutions, and ensure the integrity, performance, and security of our data systems.
Essential Duties and Responsibilities
- Analyze integration and system requirements by understanding business needs and designing effective data solutions, particularly for the Guidewire PolicyCenter, BillingCenter, and Commercial P&C data domains.
- Design, develop, and optimize ELT pipelines to ingest, transform, and load data into a Delta Lakehouse platform.
- Design, develop, and maintain data models and schemas, ensuring data quality and integrity.
- Build and maintain dashboards and reports delivering actionable business insights.
- Monitor pipeline and storage performance; troubleshoot and resolve data issues promptly.
- Collaborate with cross-functional teams, including analysts and business users, to deliver end-to-end insurance data solutions.
- Implement data governance, security, and compliance standards across platforms.
- Conduct root cause analysis for system failures and performance events; drive continuous improvements in enterprise data integration pipelines.
- Create and manage testing procedures (unit, scenario, end-to-end) to ensure pipeline reliability.
- Stay current with emerging technologies and recommend improvements to workflows.
- Mentor team members on data engineering tools, Guidewire data model concepts, and best practices.
- Build and maintain data pipelines to process structured and unstructured data (like documents and text) for Generative AI tasks, including creating embeddings and working with vector databases to support AI search features.
- Prepare and clean large datasets to ensure high-quality inputs for training and fine-tuning Generative AI models.
- Collaborate with AI and data science teams to understand data requirements and deliver scalable solutions that support model training and inference.
- Implement processes to enrich data with metadata and context to improve AI model accuracy and relevance.
- Optimize data storage and retrieval methods to support fast, low-latency responses for AI-powered applications.
- Monitor data workflows for Generative AI projects, troubleshoot issues, and ensure continuous pipeline performance.
Qualifications
Education and Experience:
- Bachelor’s degree in a relevant field of study required (e.g., computer science, data science, data analytics, applied mathematics).
- 5+ years of progressive work experience in IT or a related field required.
- 3+ years of experience in data engineering and analytics, with a solid foundation in data architecture and integration, including hands-on work with complex enterprise P&C Insurance data models required.
- 2+ years of experience with data-centric projects within the Guidewire ecosystem, including working with Guidewire PolicyCenter and related data structures required.
- In-depth understanding of relational database systems (e.g., Oracle, SQL Server, MySQL), including their features and performance optimization strategies.
- Solid grasp of ETL processes, data pipeline architectures, and data integration techniques, particularly for operational source systems such as Guidewire.
- 2+ years of hands-on experience with Azure Databricks and Azure Data Factory, including developing and optimizing data pipelines using Apache Spark required.
- 2+ years of experience working with Power BI or other leading data visualization and reporting tools required.
- Proven expertise with Apache Spark, Delta Lakehouse, and data warehousing technologies required.
- Proficient in Microsoft Azure services, including:
- Azure SQL Database
- Azure Data Lake Storage Gen2
- Azure Event Grid
- Azure Key Vault
- Strong understanding of CI/CD pipelines and experience in Agile development environments.
- Demonstrated ability to troubleshoot system issues, identify root causes, and implement effective solutions quickly.
- Capable of managing multiple priorities with strong attention to detail and follow-through.
- Working knowledge of Generative AI frameworks and use cases in data engineering is a plus.
- Knowledge of data governance, metadata management, and data quality frameworks.
- Understanding of data security and privacy principles, including encryption, anonymization, and access control mechanisms.
- Proficient in the Microsoft Office Suite
Skills
- Strong understanding of the insurance domain and experience using the Guidewire CDA data model to build use-case-specific datasets.
- Data Modeling
- Expertise in Azure Databricks, Azure Data Factory, Apache Spark, and data pipeline development for scalable data engineering solutions
- Collaboration with cross-functional groups
- Strong analytical and problem-solving skills, with the ability to translate business requirements into practical data solutions.
Compensation
The salary range for this role is $61,700 - $109,500. The annual salary range posted for this position is subject to change and may vary depending on performance, education, experience, skills, geographic location, travel requirements, demonstrated proficiency in the competencies required for the role, and business needs. Base pay is just one component of GNY’s total compensation package for employees. Other rewards include eligibility for an annual discretionary bonus based on performance.