Research Engineer, Data Infrastructure
Posted 15 days ago
Job Description
You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research. You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.
- Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems.
- Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.
- Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.
- Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning across Kubernetes- and SLURM-based environments.
- Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.
- Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.
You might thrive in this role if you:
- Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.
- Have experience or a strong interest in supporting foundational compute and storage platforms.
- Are proficient in Python and enjoy solving the "brittle data lake" problem with modern, columnar storage standards.
- Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.
- Take pride in building and operating scalable, reliable, and secure systems from the ground up.
- Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.
Mistral