Technical AI Policy Researcher, Model Behaviour - Trust and Safety
Posted 1 day ago
Job Description
The Trust & Safety (T&S) Responsible AI Policy team's mission is to ensure that GenAI models and applications are developed to be safe, fair and trustworthy. We do this by defining, measuring and mitigating safety and fairness risks in AI models through policy frameworks, model risk assessments, and upstream policy solutions.
The T&S Responsible AI Policy team sits within the T&S GenAI and Emerging Products pillar. We work closely with Trust & Safety teams (product policy, product, engineering, data science, operations, red teaming), business and model teams, and cross-functional stakeholders (comms, legal, public policy) across global markets. Success in this team requires strong policy acumen, judgment, creativity, analytical rigour, and the ability to translate Generative AI risk to different stakeholders effectively.
As an AI Policy Researcher on the T&S Responsible AI Policy team, you will champion the responsible development and deployment of our frontier AI models across multiple businesses, with a specialty in model bias, political risk, and model behaviour. You will accelerate technical policy research, incubate new research efforts, and drive end-to-end policy-to-evaluation workflows for your domain areas.
Responsibilities:
- Design and maintain multimodal GenAI policies across safety-relevant domains, including political and ideological bias, deceptive misuse, manipulation and persuasion, and fairness.
- Translate risk and harm models into clear behavioral specifications, evaluation criteria, grading guidance, and system-level safeguards.
- Define practical boundaries between beneficial uses of AI and assistance that could materially enable harm, exploitation, misuse, or unsafe outcomes.
- Build policy artifacts that support model training, evaluation, and deployment. Partner with safety researchers, engineers, product teams, and other stakeholders to operationalize policy into scalable model behavior and measurable safeguards.
- Design end-to-end workflows across safety-relevant domains, spanning policy development, pre-launch evaluation, and post-launch monitoring, and including golden set construction, labeling guidance, calibration, adjudication, and eval coverage analysis, to ensure policies can be reliably measured and improved.
- Use red-teaming results, deployment data, model failures, over-refusals, under-refusals, and ambiguous edge cases to improve policy and evaluation quality over time.
- Identify emerging capability areas where frontier AI systems could create new safety, fairness or bias challenges or lower barriers to harm.
- Monitor post-launch model activity to identify gaps in how our policy framework captures unsafe model behaviour.
- Champion research to strengthen the defensibility and operability of policy positions, including working with Outreach and Partnerships to incorporate external expert input into relevant policy positions.
- Combine longer-horizon safety research with hands-on launch and deployment work.
- Contribute to safety reports, policy documentation, launch reviews, and AI governance reviews on the company's approach to building AI responsibly.
- Support regulatory teams as a subject matter expert on AI compliance-related initiatives.
Minimum Qualifications:
- 5 years in Trust & Safety, AI Safety Research, AI Ethics, technical AI Governance, or equivalent experience.
- Degree in Computer Science, Human-Computer Interaction, Engineering, Data Science or quantitative Social Sciences.
- Direct experience in policy development, AI evaluations, red-teaming, or AI governance work.
- Strong technical understanding of LLM, multimodal, or generative media model behavior, model failure modes, and safety risks.
- Demonstrated experience working with external experts and stakeholders, including civil society, government, and academia.
- Demonstrated success working in a fast-paced technology company or research organization conducting AI impact assessments, risk assessments, or algorithmic audits, and/or related experience in data science or product development.
- Ability to advocate for safety amongst a wide variety of business stakeholders including Product Policy, Engineering, Public Policy, Legal, Communications, and Data Science.
Preferred Qualifications:
- Ability to explain complex technical concepts to non-technical stakeholders.
- Experience working with governments, frontier AI companies, or AI Safety organizations.
- Familiarity with Python and experience building ML systems.
- Comfort working across the research-to-deployment pipeline, from exploratory experiments to production systems.
TikTok