Chan Zuckerberg Initiative
Staff Research Engineer, AI Engineering, Science
Job Description
The Team
CZI supports the science and technology that will make it possible to help scientists cure, prevent, or manage all diseases by the end of this century. While this may seem like an audacious goal, in the last 100 years, biomedical science has made tremendous strides in understanding biological systems, advancing human health, and treating disease.
Achieving our mission will only be possible if scientists are able to better understand human biology. To that end, we have identified four grand challenges that will unlock the mysteries of the cell and how cells interact within systems, paving the way for new discoveries that will change medicine in the decades that follow:
- Building an AI-based virtual cell model to predict and understand cellular behavior
- Developing state-of-the-art imaging systems to observe living cells in action
- Instrumenting tissues to better understand inflammation, a key driver of many diseases
- Engineering and harnessing the immune system for early detection, prevention, and treatment of disease
CZI’s work in science includes grantmaking programs, open-source software development, and close collaboration with the Chan Zuckerberg Biohub Network. The CZ Biohub Network includes the San Francisco, Chicago, and New York Biohubs as well as the Chan Zuckerberg Imaging Institute. CZI also collaborates with institutional partners like the Kempner Institute for the Study of Natural & Artificial Intelligence at Harvard University. Join us in accelerating science.
The AI/ML team is funding and building one of the largest computing systems dedicated to nonprofit life sciences research in the world. This new effort will provide the scientific community with access to predictive models of healthy and diseased cells, which will lead to groundbreaking new discoveries that could help researchers cure, prevent, or manage all diseases by the end of this century.
The Opportunity
We are seeking a Senior Director of Engineering to lead our infrastructure organization spanning AI infrastructure engineering, AI/ML operations, data infrastructure, cloud infrastructure, and security engineering. This leader will drive strategy, execution, and innovation to support AI research, web products, and production workloads across hybrid environments (cloud and on-prem HPC). This team manages the largest cluster for scientific research in the world, with more than 1,300 GPUs (NVIDIA H100 and H200), and enables scientific research and the development of biological models (such as GREmLN and TranscriptFormer) from vast biological datasets acquired through our Biohub labs, partnerships, and open science repositories. The role is highly cross-functional, partnering closely with product, research, and operations teams to deliver scalable, secure, and high-performing systems.
What You'll Do
- Define and execute the long-term vision and roadmap for AI, data, cloud, and security infrastructure, with clear metrics to measure progress and outcomes.
- Oversee the design and operation of hybrid GPU compute clusters and ML platforms to support training, fine-tuning, and inference workloads.
- Ensure robust, scalable, and efficient data infrastructure and cloud operations to power analytics, ML pipelines, and product needs.
- Drive reliability, observability, and cost optimization across GPU-based workloads for development, training, and inference.
- Implement modern AI/ML Ops practices (orchestration of model training workloads, reproducibility, automated monitoring) to accelerate research and production workflows, with a focus on continuous delivery and improvement.
- Build, mentor, and scale high-performing, multi-disciplinary engineering teams.
- Partner with product, research, and executive leadership to align infrastructure with organizational priorities, ensuring delivery is measured against agreed objectives and key results.
- Establish policies for infrastructure usage, prioritization, and compliance with regulatory requirements.
- Stay ahead of emerging technologies in AI infrastructure, cloud, and security; drive their strategic adoption.
What You'll Bring
- 15+ years in engineering, including at least 7 years in senior leadership roles managing multi-disciplinary teams and organizations of 30+ employees, with experience leading and developing managers.
- Strong knowledge of AI/ML frameworks (e.g., PyTorch) and MLOps tools (e.g., Kubeflow, MLflow, Ray).
- Experience managing both traditional cloud platforms (AWS, GCP, Azure) and AI cloud (HPC/GPU clusters).
- Deep experience with large-scale data systems, pipelines, and storage technologies.
- Track record of improving reliability, observability, and cost efficiency in large-scale systems.
- Proven ability to define multi-year infrastructure strategies while delivering on immediate priorities.
- Exceptional written and verbal communication skills, capable of engaging technical and non-technical audiences.
- Ability to provide clear leadership and momentum in an ambiguous environment: setting direction, aligning teams, and turning uncertainty into forward progress.
Compensation
The Redwood City, CA base pay range for a new hire in this role is $435,000 - $621,500. New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process.
Work Mode
As we grow, we’re excited to strengthen in-person connections and cultivate a collaborative, team-oriented environment. This role is a hybrid position requiring you to be onsite for at least 60% of the working month, approximately 3 days a week, with specific in-office days determined by the team’s manager. The exact schedule will be at the hiring manager's discretion and communicated during the interview process.
Benefits for the Whole You
We’re thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.
- CZI provides a generous employer match on employee 401(k) contributions to support planning for the future.
- An annual benefit that employees can use in whatever way is most meaningful for them and their families, such as housing, student loan repayment, childcare, commuter costs, or other life needs.
- CZI Life of Service Gifts are awarded to employees to “live the mission” and support the causes closest to them.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving to the Bay Area.
- And more!
If you’re interested in a role but your previous experience doesn’t perfectly align with each qualification in the job description, we still encourage you to apply as you may be the perfect fit for this or another role.
Explore our work modes, benefits, and interview process at www.chanzuckerberg.com/careers.