Remote AI Evaluation Engineer Creating Challenging Coding Tests

Mindrift

Canada · On-site · Full-time · Senior · Posted yesterday

About the role

As an AI Evaluation Engineer, you will develop challenging coding tasks and use your advanced Python and test automation skills to evaluate AI systems.
This project-based role suits experienced developers and software engineers with a strong background in test automation. You will create realistic coding tasks, validate the end-to-end functionality of AI systems, and analyze AI behavior. Full-stack development expertise, particularly with React and back-end systems, is essential.

Responsibilities

  • Design and refine coding test cases from production codebases
  • Write comprehensive functional tests validating real-world scenarios
  • Create challenging tasks requiring complex reasoning
  • Analyze AI failures and how they evolve to identify improvements
  • Iterate based on feedback from QA reviewers

Requirements

  • 5+ years of software development experience
  • Strong skills in Python, including pytest and async/await
  • Experience in full-stack development with React
  • Familiarity with Docker and CI/CD processes
  • English proficiency at B2 level

Additional Information

Leverage your software development expertise to challenge and enhance AI systems while enjoying the flexibility of part-time project work.

Skills

Python, pytest, async/await, React, Docker, CI/CD, test automation
