Machine Learning / AI Engineer
Thri5 Inc.
About the role
Thri5 is the AI-powered System of Actions for the modern retailer. Despite massive investments in planning, forecasting, and analytics, retailers still face the same operational issues: out-of-stocks, bad master data, margin leakage, and inconsistent execution across stores and channels. The gap isn't in intelligence; it is in execution. Thri5 continually scans data across the business, detects and prioritizes opportunities, evaluates impact, and orchestrates execution through both humans and AI agents. From store managers and DC leaders to category and supply chain teams, Thri5 routes the right actions to the right owners, with clear context, recommendations, and workflows, closing the gap between plan and real-world performance.
Our vision is to become the trusted AI operating layer for retail execution, making every operator 10x more effective and freeing them to focus on what matters most: serving customers and growing the business. Founded by a team with deep retail and retail-technology experience, Thri5 is venture-backed by some of Canada's most prominent VC and angel investors.
As an AI / Machine Learning Engineer at Thri5, you'll help build the agent layer that powers our System of Actions. You'll also develop deterministic, data-driven detection models to reliably identify operational issues and opportunities, and then layer LLM-based capabilities on top to generate high-quality alerts, recommended actions, and explanations grounded in real retail data.
• Design and build the core frameworks that power Thri5's AI agents: task decomposition, routing, tool calling, multi-step workflows, and human-in-the-loop escalation.
• Implement agents that coordinate across operators (store, DC, category, supply chain) and systems to drive real actions, not just insights.
LLM-Driven Intelligence
• Develop and fine-tune LLM-based components to detect anomalies and opportunities that impact commercial and operational performance.
• Build prompt, retrieval, and grounding patterns that produce reliable behaviour in noisy, real-world data.
• Develop deterministic detection models (statistical anomaly detection, rules + ML hybrids, scoring systems) to identify out-of-stocks, bad master data, and execution gaps.
• Build evaluation frameworks (precision/recall, false positive control, business impact, backtests) to ensure detections are trustworthy and stable in production.
Data & Recommendation Pipelines
• Build and optimize pipelines that leverage real-time and batch customer data (transactions, inventory, operations) to power agent decisions and recommendations.
• Own end-to-end ML workflows: data preprocessing, feature engineering, training, evaluation, and production inference.
• Implement robust MLOps practices for CI/CD, experimentation, and monitoring of models and agents.
• Instrument and monitor agent behaviour (latency, cost, quality, safety) and continuously iterate to improve performance, accuracy, and scalability.
• Partner with product and engineering to translate customer problems into concrete agent capabilities and use cases.
• Contribute to technical decision-making and architecture as we scale the Thri5 platform.
• AI Fluency: 5+ years of software development experience with deep exposure to modern AI/ML, including both classical ML / data science and LLMs, GPT-style models, and agent/tool-calling ecosystems.
• ML / Data Science Proficiency: Strong background in supervised/unsupervised learning and anomaly detection, with hands-on experience designing deterministic or semi-deterministic detection systems (statistical models, rules + ML, scoring). Comfortable with model evaluation, experimentation, and translating business heuristics into data-driven logic.
• Programming & Frameworks: Proficient in Python and familiar with ML frameworks such as PyTorch or TensorFlow. Experience with agent tooling (LangChain, LlamaIndex, custom agent frameworks) and vector databases is an asset.
• Data Handling: Comfortable working with large-scale datasets, complex schemas, and event-driven data. Strong SQL skills and experience building data pipelines into production systems.
• Collaborative, low-ego, and comfortable working across a small, high-performing team (founders, engineers, product, and customers).
• Bachelor's, Master's or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field (or equivalent practical experience).
Requirements
- 5+ years of software development experience with deep exposure to modern AI/ML
- Strong background in supervised/unsupervised learning and anomaly detection
- Proficient in Python and familiar with ML frameworks such as PyTorch or TensorFlow
- Comfortable working with large-scale datasets, complex schemas, and event-driven data
- Strong SQL skills and experience building data pipelines into production systems
- Bachelor's, Master's or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field (or equivalent practical experience)
Responsibilities
- Design and build the core frameworks that power Thri5's AI agents
- Implement agents that coordinate across operators and systems to drive real actions
- Develop and fine-tune LLM-based components to detect anomalies and opportunities
- Build prompt, retrieval, and grounding patterns that produce reliable behaviour in noisy, real-world data
- Build evaluation frameworks to ensure detections are trustworthy and stable in production
- Build and optimize pipelines that leverage real-time and batch customer data to power agent decisions and recommendations
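- Own end-to-end ML workflows, from data preprocessing and feature engineering through training, evaluation, and production inference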
- Implement robust MLOps practices for CI/CD, experimentation, and monitoring of models and agents
- Instrument and monitor agent behaviour and continuously iterate to improve performance, accuracy, and scalability
- Partner with product and engineering to translate customer problems into concrete agent capabilities and use cases
- Contribute to technical decision-making and architecture as we scale the Thri5 platform