Freelance Agent Evaluation Engineer

12 minutes ago


WorkFromHome, Italy · Mindrift · Full-time

Overview

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.

What This Opportunity Involves

While each project involves unique tasks, contributors may:

  • Create structured test cases that simulate complex human workflows
  • Define gold-standard behavior and scoring logic to evaluate agent actions
  • Analyze agent logs, failure modes, and decision paths
  • Work with code repositories and test frameworks to validate your scenarios
  • Iterate on prompts, instructions, and test cases to improve clarity and difficulty
  • Ensure that scenarios are production-ready, easy to run, and reusable

What We Look For

This opportunity is a good fit for software engineers open to part-time, non-permanent projects. Ideally, contributors will have:

  • 3+ years of software development experience with a strong Python focus
  • Experience with Git and code repositories
  • Comfort with structured formats like JSON/YAML for scenario description
  • An understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
  • Familiarity with Docker
  • English proficiency at B2 level

How It Works

Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid

Project time expectations

Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.



  • WorkFromHome, Italy · Mindrift · Full-time

    This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English. At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. What We Do...


  • WorkFromHome, Italy · Mindrift · Full-time

    1 week ago. This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human...


  • WorkFromHome, Italy · Mindrift · Full-time

    1 day ago. Overview: This opportunity is for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human...


  • WorkFromHome, Italy · Mindrift · Full-time

    1 day ago. Overview: This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation...


  • WorkFromHome, Italy · Mindrift · Full-time

    A dynamic tech company is seeking mid-senior level Python engineers for a part-time, remote project focused on developing Model Context Protocol servers. Ideal candidates will have over 4 years of experience in Python development, particularly in backend or tool development. Successful applicants will help build and integrate evaluation servers, implement...


  • WorkFromHome, Italy · Mindrift · Full-time

    A dynamic AI innovation firm is seeking a candidate to create and design evaluation scenarios for LLM-based agents. This role involves defining test cases to simulate human tasks and evaluating agent actions against predefined behaviors. Ideal candidates should hold relevant degrees, possess strong analytical skills, and have experience in QA or data...


  • WorkFromHome, Italy · HRSpecialist Italia · Full-time

    An energy consulting company in Italy is seeking freelance sales agents to promote renewable solutions such as photovoltaics and heat pumps. Agents will be responsible for managing clients and supporting them through to the activation of the solutions. We offer competitive commissions, ongoing training, and opportunities for growth...

  • Autonomous AI Agent QA

    17 minutes ago


    WorkFromHome, Italy · Mindrift · Full-time

    A leading AI consultancy is seeking QA professionals for a part-time, remote opportunity focused on validating AI agent evaluations. Candidates should possess strong analytical skills and attention to detail, with the ability to assess complex systems. The role emphasizes flexible project-based work, making it ideal for students or those seeking freelance...


  • WorkFromHome, Italy · HRSpecialist Italia · Full-time

    A company specializing in energy is seeking a Freelance Sales Agent to promote renewable energy solutions in Sardinia. The ideal candidate will have sales experience, excellent communication skills, and the ability to manage the entire client process, from the initial consultation through to the activation of the solutions. We offer...


  • WorkFromHome, Italy · HRSpecialist Italia · Full-time

    A sales agency is seeking freelance sales agents to promote renewable energy solutions in Italy. The role involves managing clients and guiding them in choosing the best available options, working in a growing sector with high earning potential. We offer competitive commissions, support, and ongoing training for...