TikTok
Company
Data Engineer, AI Platform
San Jose
Job Description
Team Introduction
The Intelligent Creation - AI Platform team builds advanced end-to-end AI production pipelines, covering deep learning model training, optimization, deployment, and applications. We provide AI capabilities that empower content creation and consumption on TikTok and serve billions of users.
Responsibilities
- Own the management of massive, petabyte-scale (PB) GenAI datasets, including storage for video/image data, data processing, and data validation.
- Build and evolve GenAI data-pipeline infrastructure with a focus on extreme processing speed and end-to-end throughput.
- Partner closely with ML researchers/engineers to accelerate training-data acquisition, improve data quality, support evaluation of model outputs, and enable a closed-loop data lifecycle.
- Lower the barrier to data acquisition and maximize the utility and value of data across use cases.
Minimum Qualifications
- 3+ years of experience building data services; strong proficiency in Python.
- Experience with high-concurrency and asynchronous programming is a plus.
- Hands-on experience with Hive, MySQL, MongoDB, and Elasticsearch; solid understanding of their internals; capable of data abstraction and data modeling.
- Practical experience with large-scale data processing frameworks such as Hadoop, Spark, Flink, and Ray.
- Excellent communication and collaboration skills; detail-oriented; strong problem-solving and analytical abilities.
Preferred Qualifications
- 1–2 years of experience with the Ray framework, including proficient orchestration of GPU/CPU resources; deep understanding of Ray’s architecture and usage patterns; strong background in high-concurrency and async processing to boost overall throughput.
- Experience sourcing and curating training datasets for GenAI.