Job Description
Overview
We are seeking Arabic (Egyptian) AI Evaluation Specialists to help assess and improve the performance of advanced AI systems. In this role, you’ll contribute directly to the evaluation and enhancement of large language models (LLMs) by testing how they understand, generate, and respond to Arabic content.
You will craft realistic scenarios, analyze model outputs for quality and safety, and help ensure the technology delivers accurate, culturally appropriate, and reliable results. Your insights will play a key role in shaping smarter AI experiences.
Project Details
Language: Native fluency in Egyptian Arabic
Location: Remote (Egypt)
Project Duration: 3 months
Pay Rate: $10 USD/Hour
Schedule: 40 hours per week (8 hours per day, Monday to Friday)
Start Date: February 2nd
What You Will Do
- Conduct side-by-side comparisons of AI responses and rate their quality on a 1–5 scale based on established guidelines.
- Design scenario-based and edge-case prompts to evaluate model behavior, including situations with tricky, ambiguous, or incomplete information.
- Assess outputs for instruction adherence, factual accuracy, tone, safety, and overall usefulness.
- Develop clear evaluation rubrics and criteria to ensure consistent scoring across tasks.
- Create reliable reference materials (articles, transcripts, reports, etc.) to serve as the source of truth for testing.
- Write well-structured “gold standard” responses that demonstrate the most accurate and helpful answer.
- Identify potential issues such as hallucinations, inconsistencies, or cultural/contextual mismatches.
Qualifications
- Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
- English proficiency at B2 level or higher.
- Native fluency in Arabic, covering both Modern Standard Arabic and the Egyptian dialect.
- Strong understanding of the distinction between Fusha and ‘Ammiyya.
- Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
- Ability to work independently and manage workflows effectively in a remote environment.
Nice to Have
- Proficiency in one or more additional Arabic dialects.
- Strong attention to detail and critical thinking to identify hallucinations and bias.
- Familiarity with data annotation platforms and model evaluation tools.
- Experience in prompt engineering, AI evaluation, linguistic QA, or translation.
- Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.
Note: Please do not use VPNs or IP-masking tools during the recruitment process — our security system requires accurate regional verification.
Why Join Welo Data?
✨ Limitless Flexibility
Project-based opportunities that fit your availability. Choose when and how much you want to contribute—fully remote, with complete autonomy.
🌱 Limitless Growth
Optional access to AI and Large Language Model workshops designed specifically for professionals like you. No coding required—just your expertise.
🌍 Limitless Support
Be part of a global contributor community with responsive guidance and support.
💡 Real Impact
Apply your Arabic language expertise to influence the AI systems shaping the future of your industry, while collaborating with data professionals and expanding your skills.
How to Apply?
Apply now by answering a few quick questions to join our database and become part of our growing community.
About Welo Data
Welo Data, part of Welocalize, is a global AI data company with 500,000+ contributors delivering high-quality, ethical data to train the world’s most advanced AI systems. We’re building smarter, more human AI with a diverse community in 100+ countries.
At Welo Data, “Limitless AI. Limitless You.” isn’t just a slogan; it’s our promise. We build smarter AI through the power of human contribution, offering limitless opportunities for our global community to grow, contribute, and work on their terms.
About the job
Posted on: Jan 23, 2026
Apply before: Feb 22, 2026
Job type: Full-time
Salary Range: $10/hr
Category: Other AI jobs
Location: Cairo, EG