Research Scientist, Intelligent Editing and AI Agent (Multimodality) Graduate (Intelligent Creation) - Global Frontier Tech Recruitment Program - 2027 Start (PhD)
Posted 5 hours ago
Job Description
We are looking for talented individuals to join our team in 2027. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at our Company.
Successful candidates must be able to commit to an onboarding date by end of year 2027. Please state your availability and graduation date clearly in your resume.
About the Team
The Intelligent Creation - Global GenAI team focuses on applied research in Generative AI and delivers intelligent solutions to TikTok, enabling users to make and share creative content more easily. The team has research groups dedicated to multimodal foundation models for content creation, image/video generation and editing, efficient modeling, and world models.
Topic Content:
As AGI large-model technology advances, the way AI creates multimodal content (images, text, and video) is changing profoundly. New creative solutions based on generative AI and agent technologies keep emerging. Large multimodal creation models use cutting-edge methods such as full-modal content understanding, AIGC image and video generation, and agentic foundation models to build flexible, efficient, and industry-leading ways to create multimedia content. Through continual training and post-training, these models steadily improve their content understanding and image/video generation abilities, optimizing the foundation models end to end for creation agent scenarios.
Challenges:
- Deeply involved in continuous training and post-training (SFT/RL) of Seed multimodal models and LLMs.
- Participating in unified modeling for image and video generation, driving model performance improvements, and gaining hands-on experience in model iteration and large-scale training.
- Applying agent technology and architecture, optimizing tool-calling and long-horizon task capabilities of agentic foundation models, and conducting in-depth research on agentic RL.
Research value:
This topic focuses on the multimodal creation transformation in the AGI era, relying on full-modal understanding, AIGC generation, and agentic foundation models to build an efficient and intelligent multimedia creation system. Through ongoing training and model optimization, it constantly pushes forward content generation and understanding abilities, moving AI creation from reactive generation toward autonomous intelligence. This topic combines cutting-edge technology with practical industry value, providing core support for the next generation of intelligent creation.
Responsibilities
- Conduct cutting-edge research and development in computer vision and machine learning, especially in multimodal understanding, vision and language, and large-scale training.
- Explore new products with artificial intelligence technology at its core.
Minimum Qualifications:
- Individuals who are completing or have recently completed a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
- Excellent collaboration and interpersonal skills.
- Highly competent in algorithms and programming.
- Strong coding skills in Python/C++.
Preferred Qualifications:
- Experience in multimodal understanding, such as video highlight detection and slicing, audio/music understanding, etc.
- Experience in vision and language, such as image/video captioning, retrieval, VQA, and other related fields.
- Experience with language models and applying them to various downstream tasks, especially intelligent editing.
- Experience in large-scale training and RLHF.
- Publications in venues such as CVPR, ECCV, ICCV, NeurIPS, ICLR, SIGGRAPH, or SIGGRAPH Asia are preferred.
TikTok
About the job
Posted on
Apr 28, 2026
Apply before
May 28, 2026
Job type
Full-time
Location
San Jose, CA