Remote AI Evaluation Engineer Creating Challenging Coding Tests
Mindrift
About the role
As an AI Evaluation Engineer, you will develop challenging coding tasks and use your advanced Python and test automation skills to evaluate AI systems.
This project-based role is ideal for seasoned developers or software engineers with a strong background in test automation. You'll have the opportunity to create realistic coding tasks, validate AI's end-to-end functionality, and analyze AI behaviors. Your expertise in full-stack development, particularly with React and back-end systems, will be crucial.
Responsibilities
- Design and refine coding test cases from production codebases
- Write comprehensive functional tests validating real-world scenarios
- Create challenging tasks requiring complex reasoning
- Analyze AI failures and how model behavior evolves to identify improvements
- Iterate based on feedback from QA reviewers
Requirements
- 5+ years of software development experience
- Strong skills in Python, including pytest and async/await
- Experience in full-stack development with React
- Familiarity with Docker and CI/CD processes
- English proficiency at B2 level
Additional Information
Leverage your software development expertise to challenge and enhance AI systems while enjoying the flexibility of part-time project work.