Research Scientist, Web Data
Posted 49 days ago
Job Description
This job posting has expired and is no longer accepting applications.
At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.
Snapshot
Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.
The Role
Your job will be to own and lead improvements of meaningful chunks of the web data pipeline. Examples of such chunks are scraping (i.e., transforming raw HTML into clean text data to be used during training, potentially including relevant image data), data filtering (removing/down-weighting low-quality content) or adding new data sources (such as historical web crawl data).
Key responsibilities:
- Investigating current results to identify areas for improvement (e.g., based on user feedback or weak eval performance).
- Developing measurements of weakness, either as model eval or data pipeline statistics, to help drive progress.
- Setting out a medium-term agenda to improve the data pipeline, with feedback from peers and key stakeholders, and convincing others to join your efforts.
- Working with partner teams in GDM (and wider Google) to leverage existing solutions effectively and communicate necessary infrastructure improvements.
- Day-to-day execution by coding, running experiments and reviewing contributions.
About You
In order to set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:
- 3 years of experience working as a self-directed engineer or researcher, e.g., as a senior software developer or graduate student.
- Developing large-scale data (>=100M examples) processing pipelines in Python and/or C++.
- Evaluating and investigating (pretrained) LLM performance.
In addition, the following would be an advantage:
- Filtering data based on heuristic and/or learned signals.
- Working with web data for LLM training, such as cleaning data, removing duplicates, or identifying the most valuable examples.
- Developing advanced LLM metrics (e.g., execution-based or using auto-raters).
