Data Architect / Data Engineering Lead
ATTAINX INC
About the Role
The Data Architect / Data Engineering Lead provides technical leadership for data architecture, data engineering, database modernization, and AI/ML enablement across the NRCS IT ecosystem. This role is responsible for guiding the transformation of legacy data platforms – including monolithic SQL Server environments, SSIS-based ETL pipelines, and tightly coupled cross-database dependencies – into scalable, cloud-native architectures on AWS. The position works in close coordination with the Enterprise Lead Architect, Government Program Managers, and cross-functional delivery teams to execute data management, modernization, and operational sustainment activities under the OMNI contract.
Responsibilities
Data Architecture and Strategy
- Define and maintain data architecture standards, patterns, and governance practices across all NRCS systems, ensuring alignment with FPAC’s Technical Guidance Framework (TGF), Cloud Memo directives, and Zero Trust principles.
- Lead conceptual and logical decomposition of monolithic database structures (e.g., NPAD) into domain-aligned, modular schemas that support incremental modernization and cloud migration.
- Architect service-layer data access patterns to replace direct cross-database queries and business logic embedded in stored procedures, reducing architectural fragility and enabling decoupled deployments.
- Design and maintain data models for enterprise soil data systems including NASIS, Soil Data Warehouse (SDW), Soil Data Marts (SDM), and related spatial/tabular datasets.
- Align supported systems with USDA’s cloud-native Lakehouse Data Strategy, including adoption of Databricks as the departmental standard data integration tool and elimination of duplicated data copies.
- Register and maintain schemas, interfaces, and metadata in AWS DataZone (or Government-directed metadata tooling), ensuring synchronization across environments.
Data Engineering and Pipeline Development
- Design, build, and maintain end-to-end data engineering pipelines using AWS-native services (Glue, EMR/Spark, Lambda, Step Functions, EventBridge, DMS, S3, RDS/Aurora PostgreSQL) for batch, streaming, geospatial, and near-real-time workloads.
- Modernize legacy SSIS-based ETL/ELT pipelines to cloud-native equivalents (AWS Glue, Databricks, PySpark), improving scalability, maintainability, and operational efficiency.
- Build and operate AWS DMS full-load and CDC pipelines to support migration of SQL Server databases to PostgreSQL/PostGIS and other target platforms.
- Implement Delta Lake standards, partitioning strategies, and performance tuning across ingestion frameworks for structured, unstructured, and geospatial data.
- Develop serverless orchestration workflows using Lambda, EventBridge, and Step Functions for event-driven processing and automated data operations.
- Implement data quality controls (validation, reconciliation, monitoring) and maintain audit-ready evidence of data management activities.
Database Operations and Modernization
- Provide senior-level DBA support for SQL Server clusters (including high-availability configurations, failover groups, and large-scale datasets exceeding 50 TB), as well as PostgreSQL/PostGIS, Aurora, and DynamoDB environments.
- Lead database schema versioning, change tracking, and deployment automation using Liquibase and Government-approved CI/CD processes.
- Execute database modernization activities including re-platforming from on-premises SQL Server to AWS RDS/Aurora, decoupling monolithic database dependencies, and eliminating cross-database stored procedure calls.
- Develop and maintain application-specific database recovery runbooks, including validated restore procedures, dependency mapping, and configuration baselines aligned with DR/COOP requirements.
AI/ML and Generative AI Enablement
- Design and implement AI/ML and Generative AI solutions using AWS services (Bedrock, SageMaker, OpenSearch) to support natural-language-to-SQL, automated metadata generation, conversational technical assistance, and AI-powered data pipeline optimization.
- Apply GenAI tooling (e.g., Bedrock, LangChain, embeddings, RAG patterns) to accelerate documentation, schema analysis, and DevOps workflows.
- Support AI-assisted analysis to detect redundant data flows, schema drift, and opportunities to simplify data integrations.
- Leverage AI-enabled platforms (e.g., Rhino.ai or equivalent) for legacy system discovery, business logic extraction, and modernization acceleration where authorized by the Government.
AWS Migration Support
- Provide data engineering and DBA expertise in support of the urgent AWS migration from DISC data centers, including troubleshooting, testing, and implementing operational adjustments to maintain continuity of mission-critical business functions (e.g., payment processing).
- Support full on-premises to AWS migration for databases and data infrastructure, including provisioning, lift-and-shift, re-architecture, data migration validation, and issue resolution.
- Design and execute data migration and transformation activities, including test data management and privacy-preserving techniques for non-production environments.
Governance, Compliance, and Knowledge Transfer
- Maintain audit-ready documentation for all data architecture decisions, schema changes, pipeline configurations, and modernization artifacts in Government-designated systems of record.
- Enforce FPAC architectural principles, secure coding standards, and NIST SP 800-53 controls across all data engineering and database activities.
- Conduct architecture reviews, design assurance gates, and code reviews for data-related deliverables, ensuring adherence to quality standards and FPAC SonarQube thresholds.
- Deliver knowledge transfer sessions to Government personnel and incoming vendors during transition periods, including complete documentation handoff of data systems, pipelines, and architectural decisions.
- Maintain and update troubleshooting playbooks, runbooks, and knowledge articles for data systems in Government-designated repositories.
Qualifications
Required Qualifications
- 7+ years of progressive experience in data architecture, data engineering, and database administration across enterprise environments.
- 5+ years of hands-on experience designing and deploying data solutions on AWS, including direct experience with S3, Glue, EMR/Spark, Lambda, Step Functions, DMS, RDS (PostgreSQL, Aurora), DynamoDB, OpenSearch, and Lake Formation.
- Deep expertise in Microsoft SQL Server and PostgreSQL/PostGIS.
- Experience with database decoupling and monolithic database decomposition.
- Proven experience building production data pipelines for batch, streaming, and geospatial workloads.
- Experience modernizing legacy ETL (SSIS) to cloud-native frameworks.
- Strong proficiency in SQL/T-SQL, Python, and PySpark.
- Working knowledge of Bash/PowerShell for automation.
- Demonstrated ability to design and implement enterprise data architectures including data warehouses, data lakes, lakehouses (Delta Lake), and service-layer integration patterns.
- 3+ years of experience supporting federal IT programs, including familiarity with FISMA, NIST RMF, ATO processes, and federal change management requirements.
- Experience with CI/CD pipelines, Git-based version control, Terraform or CloudFormation, Liquibase, and automated quality/security gates.
- Experience working within SAFe Agile or equivalent iterative delivery frameworks, including backlog management in Jira.
Preferred Qualifications
- Direct experience with USDA systems.
- Experience with FPAC IT governance, the Technical Guidance Framework (TGF), and FPAC CI/CD pipeline standards.
- Hands-on experience with AWS Bedrock, SageMaker, and Generative AI patterns (RAG, embeddings, natural-language-to-SQL, LangChain).
- Experience with geospatial data engineering, including PostGIS, GeoPackage, ArcGIS WFS/WMS services, and spatial data pipelines.
- Experience with AI-enabled legacy modernization platforms (e.g., Rhino.ai or equivalent).
- Azure experience (Synapse, ADF, ADLS, Azure ML Studio, Databricks on Azure) as a complement to primary AWS focus.
- Relevant certifications: AWS Solutions Architect, AWS Data Analytics Specialty, Azure Data Engineer Associate (DP-203), or equivalent.
- Master’s degree in Computer Science, Data Science, or related field.
About Us
AttainX Inc. is a Women-Owned Small Business (WOSB), Economically Disadvantaged WOSB (EDWOSB), CMMC Level 2, CMMI Level 3, ISO 9001:2015 certified QMS, and Silver Level SAFe Partner. For more than 15 years, AttainX Inc. has delivered emergent technologies, software products, and high-quality services that meet the needs of our Federal Government customers.
The last 4 years have shown significant company growth as we have increased our contracts portfolio and now hold the “Best in Class” contract vehicles GSA MAS and OASIS Small Business and 8(a) Pools 1, 2, and 3. In addition, we are prime on several agency-specific IDIQs and BPAs with the National Oceanic and Atmospheric Administration, Department of Energy, Navy, Health and Human Services, USCIS, and the Defense Intelligence Agency.
AttainX is dedicated to quality and best practices for the services we provide. We understand our people are the key ingredient to ensuring our customers’ missions and goals are met with excellence.
Benefits
- Competitive compensation and benefits packages including paid vacation, medical, dental, vision, matching 401(k) plan, tuition/training reimbursement, and Long- and Short-Term Disability.
EEO Commitment
AttainX is an equal employment opportunity employer, committed to providing a workplace free from discrimination based on Title VII of the Civil Rights Act, VEVRAA and Section 503, or other status protected by applicable federal, state, local, or international law. These protections also extend to applicants.
Accommodations
Individuals with a disability who would like to request a reasonable workplace accommodation may send an email to Human Resources indicating the specifics of the assistance needed.
Physical Demands
Sitting and working on a computer for long, continuous periods each day; effective communications by telephone, email, and face-to-face; standing, walking, and sitting; handling and feeling objects or controls; reaching; talking and hearing; lifting and/or moving up to 10 pounds; and specific vision abilities including close vision, distance vision, color vision, peripheral vision, depth perception, and the ability to adjust and focus.
Work Environment
The noise level in the work environment is usually moderate.
Equal employment opportunity, including veterans and individuals with disabilities.