Program Manager, AI Model Evaluation, International Seller Growth
Posted 6 days ago
Job Description
Join the Seller AI team, where you'll lead benchmarking and evaluation of AI models that enhance the seller experience across Amazon's global marketplace. You'll manage a team dedicated to validating, testing, and improving Artificial Intelligence (AI) and Large Language Models (LLMs) that power innovative seller tools. This role combines strategic leadership with hands-on technical oversight and requires exceptional communication, team management, and stakeholder engagement skills.
In this position, you'll drive the development and implementation of comprehensive benchmarking methodologies to evaluate AI model performance across accuracy, robustness, bias, and reliability metrics. Your expertise will be crucial in translating technical findings into actionable insights that improve Seller Assistant's performance and contribute to the growth of Amazon's seller community worldwide.
Key job responsibilities
1. Plan and execute benchmarking exercises for AI models, defining test plans, metrics, and acceptance criteria across accuracy, robustness, bias, and reliability dimensions.
2. Lead a team responsible for validating data based on specific annotation guidelines, ensuring accuracy and quality while escalating potential regulatory risks.
3. Prepare comprehensive audit and benchmarking reports, including error ratings, root-cause analysis, and recommendations for senior stakeholders.
4. Drive process efficiencies and explore automation opportunities to enhance the productivity of data generation initiatives.
5. Mentor team members and help develop their skills while managing overall schedules, proactively mitigating risks, and keeping project scope under control.
A day in the life
Your day begins with prioritizing benchmarking tasks for your team based on current business requirements. You'll review progress on ongoing audits, provide guidance where needed, and ensure deliverables meet quality standards. Throughout the day, you'll analyze performance data from Seller Assistant tools, identifying patterns and insights that could improve AI model effectiveness. You'll wrap up by preparing detailed presentations that showcase your team's findings and recommendations to leadership, highlighting opportunities to enhance the seller experience.
About the team
The Seller AI team within International Seller Services is dedicated to creating Gen-AI/LLM-powered tools and agentic solutions that accelerate business growth for Amazon sellers worldwide. We focus on handling annotations for training, measuring, and improving Artificial Intelligence and Large Language Models to deliver superior seller experiences. Our team combines technical expertise with a deep understanding of seller needs to develop innovative solutions that simplify complex tasks and drive business growth.
By joining us, you'll play a pivotal role in shaping the future of selling on Amazon, working with advanced AI technologies that directly impact millions of sellers across global marketplaces.
Amazon
About the job
Posted on: Mar 19, 2026
Apply before: Apr 18, 2026
Job type: Full-time
Category: AI Internships
Location: China