
Data & Software Engineer

VTG

McLean · On-site · Full-time · Today

About the role

Overview

We are seeking a Data & Software Engineer to work with a small team building complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale. The candidate must have experience with the following:

What will you do?

  • Building end-to-end data pipelines in Python
  • Using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs
  • Containerizing and deploying applications in cloud environments such as AWS
  • Working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex analytical workloads
  • Using industry-standard tools for source control and infrastructure as code (Git, IaC tooling, etc.)
  • Working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial
  • Using Bash scripting for automation and data-processing tasks
  • Integrating AI/ML services and models
  • Working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
  • Applying strong problem-solving and debugging skills to data-quality issues, pipeline failures, and performance bottlenecks
  • Drawing on a background in large-scale data migration or platform modernization efforts
  • Contributing to data engineering documentation, best practices, and design patterns
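For illustration only, the extract-transform-load work these bullets describe can be sketched in miniature with the Python standard library (SQLite and an in-memory CSV stand in for a production warehouse and an S3 feed; all names here are invented for the sketch, not part of the role):

```python
import csv
import io
import sqlite3

# Extract: read raw records from an in-memory CSV (standing in for an
# S3 object or upstream feed). Row 3 has a missing amount on purpose.
RAW = "order_id,amount\n1,19.99\n2,5.00\n3,\n"

def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Drop malformed rows and normalize types -- the kind of
    data-quality handling the role calls out."""
    clean = []
    for row in rows:
        if row["amount"]:
            clean.append((int(row["order_id"]), float(row["amount"])))
    return clean

def load(records, conn):
    """Write cleaned records to a relational store
    (SQLite standing in for PostgreSQL/MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW)), conn)
    n, total = conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
    print(n, round(total, 2))  # 2 24.99
```

In production this shape would be scaled out with PySpark transformations, scheduled by an orchestrator such as Airflow, and deployed in containers.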

Do you have what it takes?

  • Active TS/SCI with polygraph required.
  • Bachelor's degree in Computer Science, Engineering, Finance, or a related technical field, or equivalent practical experience.
  • Minimum of 5 years' experience with:
    • Apache Spark & PySpark
    • Advanced Python skills (including Pandas & NumPy)
    • Docker, Podman
    • AWS S3, Lambda & Step Functions
    • Apache Iceberg, Airflow, etc.
    • SQL (with Trino)
    • NoSQL, DynamoDB
    • Unity Catalog OSS, Apache Polaris
    • Apache Superset
    • Terraform or CloudFormation
    • OpenLineage
    • H3, PostGIS




Skills

AWS Lambda, AWS S3, AWS Step Functions, Airflow, AI/ML, Apache Iceberg, Apache Polaris, Apache Spark, Apache Superset, Bash, CloudFormation, Docker, DynamoDB, Git, H3, IaC, Java, MySQL, NoSQL, OpenLineage, Pandas, PostGIS, PostgreSQL, Podman, Python, PySpark, SQL, Terraform, Trino, Unity Catalog OSS
