Principal Machine Learning Engineer, AI Platform (Foundation Model Post-Training)
Posted 90 days ago
Job Description
This job posting has expired and is no longer accepting applications.
Company Description
About Grab and Our Workplace
Grab is Southeast Asia's leading superapp. From getting your favourite meals delivered to helping you manage your finances and getting around town hassle-free, we've got your back with everything. At Grab, purpose gives us joy and habits build excellence, and we harness the power of Technology and AI to deliver our mission of driving Southeast Asia forward by economically empowering everyone, with heart, hunger, honour, and humility.
Job Description
Get to Know the Team
The AI Platform team empowers Grab teams to leverage advanced AI seamlessly and effectively. We're building cutting-edge tools and infrastructure to democratize AI capabilities, accelerate innovation, and enhance Grab's products and services at scale.
Get to Know the Role
As a Principal Machine Learning Engineer focused on Foundation Model Post-Training, you'll report to the Head of Engineering, Machine Learning and Experimentation Platforms, and work onsite at Grab's One North office in Singapore.
You'll be the technical anchor for aligning our large-scale foundation models with human intent and domain requirements. You'll architect pipelines using Supervised Fine-Tuning (SFT) and RLHF to transform raw base models into safe, high-performance products for Grab. You'll also bridge deep learning research, systems engineering, and data strategy, so the role calls for a leader who can drive technical direction and execute large-scale experiments.
The Critical Tasks You Will Perform
- Strategic Technical Leadership: Define and drive the roadmap for post-training strategies, including SFT, RLHF (PPO/DPO/GRPO), and instruction tuning, to improve model alignment, safety, and reasoning capabilities.
- Pipeline Architecture: Design and implement robust, scalable, and distributed training pipelines using frameworks like PyTorch, DeepSpeed, Ray or Megatron-LM to handle models with billions of parameters.
- Data Strategy & Curation: Oversee the data engine for post-training; collaborate with data teams to design high-quality instruction sets, manage human annotation workflows, and implement automated data filtering/deduplication techniques.
- Evaluation & Benchmarking: Develop comprehensive evaluation suites (both automated benchmarks and human-in-the-loop protocols) to rigorously measure model performance, hallucination rates, and alignment drift.
- Optimization & Efficiency: Optimize training jobs for GPU utilization and cost-efficiency, including quantization, distillation, LoRA/QLoRA implementation, and memory optimization techniques.
- Cross-Functional Collaboration: Partner with multi-functional teams to translate user requirements into specific reward functions and fine-tuning objectives.
- Bridge Research and Engineering: Translate the latest AI research into robust, scalable, production-grade systems that drive tangible business outcomes.
- Mentorship: Provide technical mentorship, foster innovation, and inspire excellence across engineering, research, and product teams.
Qualifications
The Must-Haves
- Proven Experience: At least 8 years of professional experience in Machine Learning, with at least 3 years directly focused on NLP, LLMs, or Generative AI, and at least 2 years in technical leadership, mentorship, or people management.
- Post-Training Expertise: Experience training Large Language Models (LLMs) specifically in post-training stages. Experience with RLHF (Reinforcement Learning from Human Feedback), DPO (Direct Preference Optimization), GRPO (Group Relative Policy Optimization), and SFT (Supervised Fine-Tuning).
- Distributed Systems Mastery: Hands-on experience with distributed training of massive models across multi-node GPU clusters (e.g., A100/H100 pods) using Kubernetes or Ray.
- Framework Proficiency: Expert-level fluency in Python and deep learning frameworks (PyTorch, JAX). Familiarity with the Hugging Face ecosystem and training libraries like DeepSpeed, Megatron-LM, or FSDP.
- Data Intuition: Experience in dataset engineering including cleaning, balancing, and synthesizing high-quality instruction data. You have experience in large-scale data processing frameworks like Spark, Ray or Dask.
- Mathematical Depth: Solid grasp of the underlying mathematics of Transformers, optimization algorithms (AdamW, Lion), and probability theory as it applies to language modelling.
Additional Information
Life at Grab
We care about your well-being at Grab. Here are some of the global benefits we offer:
- We have your back with Term Life Insurance and comprehensive Medical Insurance.
- With GrabFlex, create a benefits package that suits your needs and aspirations.
- Celebrate moments that matter in life with loved ones through Parental and Birthday leave, and give back to your communities through Love-all-Serve-all (LASA) volunteering leave.
- We have a confidential Grabber Assistance Programme to guide and uplift you and your loved ones through life's challenges.
- Balancing personal commitments and life's demands is made easier with our FlexWork arrangements, such as differentiated hours.
What We Stand For at Grab
We are committed to building an inclusive and equitable workplace that enables diverse Grabbers to grow and perform at their best. As an equal opportunity employer, we consider all candidates fairly and equally regardless of nationality, ethnicity, religion, age, gender identity, sexual orientation, family commitments, physical and mental impairments or disabilities, and other attributes that make them unique.