Senior AI/ML Solutions Architect
Yochana
Remote · Canada · Full-time · Senior · 1mo ago
About the role
As a Data Architect (Custody Domain), you will design and lead the implementation of a high-performance, event-driven data ecosystem. You will serve as the technical authority on the Cloudera Data Platform (CDP), with a heavy focus on Kafka-based streaming and cloud-native architectures. Your role is to bridge real-time data flows from custody operations, such as trade settlements and cash movements, into resilient microservices, data pipelines, and data marts across hybrid and multi-cloud environments.
Key Responsibilities
Event-Driven Architecture
- Architect enterprise-grade streaming solutions using Apache Kafka as the central event bus to decouple producers and consumers across the custody lifecycle.
Cloud Strategy & Migration
- Design and oversee the deployment of data workloads across public, private, and hybrid cloud environments, ensuring high availability, disaster recovery, and cost optimization.
Real-Time Processing
- Build and tune Apache Flink and Spark Streaming jobs to process Kafka streams for real-time fraud detection, automated regulatory reporting, and continuous transaction monitoring.
Data Ingestion & Orchestration
- Design scalable, automated ingestion frameworks to move data from legacy custody systems into the CDP ecosystem, ensuring data integrity and low-latency delivery.
Microservices Strategy
- Lead the design of data-centric microservices that interact with Kafka for event sourcing and asynchronous communication in a containerized cloud environment.
Data Mart Design
- Develop performant data marts and reporting layers that provide actionable insights to business stakeholders and regulatory bodies using CDP’s modern warehouse engines.
Security & Governance
- Implement centralized security and governance through Cloudera SDX, ensuring strict compliance with financial regulations across all cloud storage and compute layers.
Technical Qualifications
Cloudera Mastery
- Expert-level knowledge of the Cloudera Data Platform (CDP) stack and its integration within cloud-native infrastructures.
Kafka Expertise
- Advanced skills in Kafka cluster planning, topic management, partitioning strategies, and performance tuning (e.g., exactly-once delivery, back-pressure handling).
Cloud Proficiency
- Deep experience in architecting data solutions on major Cloud Service Providers, focusing on managed compute, object storage, and networking security.
Stream Processing Engines
- Strong proficiency in Apache Spark (Streaming/Batch) and working knowledge of Apache Flink.
Infrastructure as Code
- Familiarity with containerization (Docker/Kubernetes) and automated deployment tools to manage data services at scale.
Skill Requirements
- Big data
- Databricks engineer
- Kafka
- Cloudera
- Docker
- Kubernetes
Must Have Skills
- Apache Kafka
- Python
- Apache Spark
- MySQL
- Machine Learning and Statistical modeling
Good to have Skills
- Big data
Other Requirements
- Recommended certifications: TensorFlow Developer Certificate; AWS Certified Machine Learning – Specialty; Databricks Certified Data Engineer Professional (optional but valuable)
Skills
Apache Flink, Apache Kafka, Apache Spark, Cloudera, Databricks, Docker, Kubernetes, Machine Learning, MySQL, Python