Evaluation Scenario Writer
38 minutes ago
Remote EUR 30,000 - 50,000 2 days ago

Overview
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What This Opportunity Involves
While each project involves unique tasks, contributors may:
- Create structured test cases that simulate complex human workflows
- Define gold-standard behavior and scoring logic to evaluate agent actions
- Analyze agent logs, failure modes, and decision paths
- Work with code repositories and test frameworks to validate their scenarios
- Iterate on prompts, instructions, and test cases to improve clarity and difficulty
- Ensure that scenarios are production-ready, easy to run, and reusable

What We Look For
This opportunity is a good fit for software engineers open to part-time, non-permanent projects. Ideally, contributors will have:
- 3+ years of software development experience with a strong Python focus
- Experience with Git and code repositories
- Comfort with structured formats like JSON/YAML for scenario description
- An understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
- Familiarity with Docker

How It Works
Project time expectations
Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.

Payment
- Paid contributions, with rates up to $30/hour*
- Fixed project rate or individual rates, depending on the project
- Some projects include incentive payments
*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.
-
Evaluation Scenario Writer
42 minutes ago
WorkFromHome, Italy Mindrift Full-time
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.
What We Do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe....
-
Evaluation Scenario Writer
3 weeks ago
WorkFromHome, Italy Mindrift Full-time
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English. At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.
What We Do...
-
Remote AI Evaluation
37 minutes ago
WorkFromHome, Italy Logista Full-time
A leading tech company is looking for software engineers for remote, project-based opportunities focused on AI evaluation and development. Ideal candidates have over 3 years of software development experience with a strong focus on Python, and are familiar with tools like Git and Docker. This is part-time, non-permanent work where tasks can be completed...
-
Remote AI Evaluation Scenario Architect
42 minutes ago
WorkFromHome, Italy Mindrift Full-time
A cutting-edge AI consultancy is seeking an Entry-Level Tester to design evaluation scenarios for LLM-based agents. Responsibilities include creating structured test cases to simulate human workflows, ensuring clarity and effectiveness of scenarios, and analyzing agent behaviors. This flexible part-time freelance role allows you to work around your academic...
-
Remote AI Evaluation Scenarios Writer — Entry Level QA
3 weeks ago
WorkFromHome, Italy Mindrift Full-time
A leading AI innovation company is seeking an entry-level candidate to create structured evaluation scenarios for LLM-based agents. This part-time role requires a Bachelor's degree in a relevant field and a passion for AI. Candidates will design test cases to simulate human tasks, analyze agent behavior, and work with code repositories. Flexible schedule and...
-
Freelance Agent Evaluation Engineer
4 minutes ago
WorkFromHome, Italy Mindrift Full-time
Overview
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.
What This Opportunity Involves
While each project involves unique tasks, contributors may: Create structured test cases that simulate complex...
-
Freelance Agent Evaluation Analyst
4 weeks ago
WorkFromHome, Italy Mindrift Full-time
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human...
-
AI Agent Evaluation Analyst
9 minutes ago
WorkFromHome, Italy Mindrift Full-time
Overview
This opportunity is for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human...
-
AI Agent Evaluation Analyst
9 minutes ago
WorkFromHome, Italy Mindrift Full-time
Overview
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation...
-
Freelance Agent Evaluation Engineer
7 minutes ago
WorkFromHome, Italy Mindrift Full-time
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English. At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.
What We Do...