AI Output Tester

Blue Oak Consulting

Remote (Global) · Full-time · Entry Level · 3w ago

About the role

About Blue Oak Consulting

Blue Oak Consulting provides economic advice where the solution is rarely obvious. We work with leadership teams to peel back the layers of strategy talk and focus on the actual numbers that drive a business. Our team remains skeptical by nature. We look closely at pricing models, capital allocation, and internal costs to find where value is leaking or where capital is trapped. We believe that most commercial questions arrive wrapped in language that obscures the real choices. Our job is to surface what the numbers are hiding and test whether assumptions actually hold under pressure.

The role

Many organizations are rushing to integrate automation and large language models into their workflows without a clear system to verify the results. The problem is that AI can produce text that feels authoritative but lacks logical consistency or factual truth. This creates a significant risk for firms that rely on these tools for high-stakes decision-making. We created the AI Output Tester position to address this problem. You will act as the final check on model responses, ensuring that the work we provide to clients is grounded in reality rather than just plausible-sounding sentences.

What you will do

Review large volumes of text and financial data generated by AI models to identify errors in calculation or reasoning.
Compare automated outputs against internal spreadsheets and external data sources to confirm their accuracy.
Log patterns where models fail or invent information to help our technical team refine their approach.
Edit model responses to strip out unnecessary adjectives and align with our firm's direct, professional voice.
Experiment with various prompting techniques to see which instructions produce the most consistent results.
Work with our consulting staff to verify that the automated portions of our client reports are logically sound.

What we need from you

A strong ability to read dense documents and notice when a minor detail contradicts a previous statement.
Basic proficiency with numbers and the capacity to perform quick calculations to check whether the text matches the data.
A naturally skeptical outlook on technology and its current limitations.
The self-discipline required to work in a fully remote environment without constant oversight.
Clear and direct writing skills with a focus on simple sentence structures and precise vocabulary.
University-level analytical training in a field like history, economics, math, or philosophy is helpful, though no specific degree is required for this entry-level role.

Helpful background

While this is an entry-level role, we appreciate candidates who have experience working with data or text in a structured way. This could include previous internships in research, accounting, or technical editing. If you have spent time tinkering with large language models on your own and have noticed their tendency to fail at basic logic, that perspective will be useful. We do not require specialized coding skills, but an understanding of how to structure a logical query is a significant advantage in this role.

What working here looks like

Blue Oak Consulting is a fully remote firm that operates without the standard noise of a corporate office. We do not have long, aimless meetings or performance for the sake of appearances. Instead, we focus on producing work that is actually true and useful for our clients. Communication is mostly written and direct. We expect everyone to be able to explain their reasoning clearly and to accept a thorough critique of their work. It is a quiet, disciplined environment that prioritizes evidence over status.

What the role offers

This is a full-time, permanent position that allows you to see how high-level commercial advice is constructed from the ground up. You will learn how to dissect business models and how to separate marketing fluff from economic reality. Because we are a small firm, you will see the immediate impact of your work on our final deliverables. You will also develop a realistic understanding of AI capabilities that goes far beyond the current industry chatter. We offer a stable, professional setting where you can build your analytical skills without the distractions found in many modern companies.

Who tends to do well here

Successful testers at our firm are individuals who enjoy finding the flaw in an argument. They are people who are not easily impressed by sophisticated language and who always ask for the evidence behind a claim. This role requires patience and a high level of concentration, as the work is often repetitive and requires checking the same types of variables across many documents. Those who thrive here are people who find a quiet kind of satisfaction in spotting a mistake that everyone else missed. We value clear thinking and the courage to report a problem exactly as it is.
