Ara Zeta Project - AI Safety Evaluator - Malay (Singapore)
Welo Data (Welocalize) · Singapore · Remote
About this role
About Welo Data
Welo Data, a Welo Global brand, is the multilingual data and evaluation partner for foundation labs and enterprises deploying GenAI systems globally. It delivers the human judgment, data infrastructure, and evaluation systems that ensure AI models perform reliably across languages, cultures, and real-world contexts, at every stage from training through deployment.
Its global network of 500,000+ vetted experts spans 300+ languages and locales, enabling high-quality multilingual data creation and structured model evaluation across the full spectrum of modern AI applications, from large language models and voice and speech systems to agentic workflows, robotics, and embodied AI. This breadth of linguistic, cultural, and domain expertise enables Welo Data to address critical AI development challenges, including safety, bias, inclusivity, and cross-lingual reliability.
A unified global operating model, led by specialized program and quality experts and grounded in assessment-driven talent selection, localized rubrics, and continuous calibration, ensures consistent performance across languages, domains, and modalities.
Underpinning all of this is NIMO™ (Network Identity Management and Operations), Welo Data's proprietary identity and fraud-prevention framework. Built to maintain data integrity and workforce trust across a global contributor base, NIMO combines advanced verification, continuous monitoring, and structured QA to ensure every dataset is accurate, traceable, and culturally grounded.
welodata.ai

Project Overview
We are seeking experienced bilingual evaluators to support a multilingual AI safety project focused on evaluating model responses across culturally specific prompt-image datasets. This project involves applying a structured safety rubric to assess AI-generated responses for appropriateness, safety, and reliability within the target locale’s cultural context.
Each language stream will process approximately 1,000 prompt-image pairs. Every item will receive two independent evaluations, with arbitration applied in cases of disagreement. Evaluations will primarily be documented in English, with a defined in-language sample.

Project Details
Location: Remote – Singapore
Team: Welo Data – AI Services
Engagement Type: Freelance – Remote
Start Date: As soon as possible
Duration: 2–3 weeks
Weekly Commitment: 20–40 hours per week
Schedule Options:
- 4 hours per day, Monday–Friday, OR
- 2 hours per day, Monday–Friday + 10 weekend hours
Hourly rate: 28 USD

Responsibilities
- Evaluate AI-generated responses using a structured safety rubric
- Complete two independent evaluations per item
- Provide concise, well-structured rationales in English
- Participate in calibration sessions
- Support arbitration when evaluation discrepancies occur
- Maintain quality and throughput targets during the evaluation window

Qualifications
- Fluency in the target language and English
- Deep cultural understanding of the target locale
- Strong written English skills for documentation and rationales
- Prior experience in safety evaluation, policy review, content moderation, or rubric-based assessment preferred
- Ability to apply detailed guidelines consistently
- Strong analytical skills and attention to nuance
- Reliable availability during the production window
- Priority may be given to contributors who previously worked on prompt-image or similar evaluation projects to reduce onboarding time and maintain continuity.

Disclaimer: This role involves working with explicit and sensitive content. Applicants should be comfortable working with adult material in a professional capacity. Please apply only if you fully understand and are prepared for the nature of this role.
Skills & domains
- ai-training
- linguistics
- expert
