Machine Learning Engineer
Posted 683 days ago
Job Description
This job posting has expired and is no longer accepting applications.
About RapidFire AI
RapidFire AI is a cutting-edge deep tech startup specializing in scaling Machine Learning solutions. We are dedicated to empowering customers to effortlessly scale their AI workloads, ensuring they stay at the forefront of innovation in their industries.
About the Role
We are seeking a highly motivated and skilled Machine Learning Engineer to join our growing team. In this role, you will be responsible for developing distributed infrastructure for Deep Learning (DL) applications on the cloud, as well as contributing to the design of new features. You will collaborate closely with other developers and customer-facing personnel to ensure a seamless product experience.
Responsibilities:
- Design, develop, deploy, and maintain large-scale DL infrastructure software, following best practices and software engineering (SE) guidelines
- Contribute to designing efficient distributed systems that scale DL computations while remaining modular and fault tolerant
- Automate the setup, launch, and orchestration of end-to-end training and experimentation pipelines written with PyTorch, TensorFlow, or Keras
- Use and extend libraries like FSDP, DDP, DeepSpeed, and GPipe to train DL models across multiple GPUs
- Use tools like Pandas and Dask to handle large multimodal datasets
- Troubleshoot code and fix bugs to ensure smooth functioning of the application
- Monitor and troubleshoot cluster resource usage to ensure optimal performance
- Conform to continuous integration and continuous delivery (CI/CD) pipeline standards for code deployment
- Communicate effectively with the wider team to ensure successful application development and deployment
- Collaborate with other developers to define and implement cloud infrastructure strategies for DL applications
- Stay up-to-date with the latest advancements in DL and AI technologies and best practices
Required Skills:
- 4+ years programming experience with Python
- Proven experience as an ML Engineer working on deploying production model training and/or inference
- Excellent knowledge of the DL tools PyTorch and TensorFlow
- Excellent knowledge and experience of using DL systems libraries such as FSDP, DDP, DeepSpeed, and GPipe
- Familiarity with LLMs, finetuning, and associated concepts
- Familiarity with operating systems concepts, memory management, networking, and cloud computing
- Familiarity with AWS infrastructure components like EC2, S3, EBS, EFS, EKS, and Lambda
- Basic experience with version control systems (e.g., Git) and collaborative development workflows
- Understanding of CI/CD methodologies and tools
- Excellent communication and collaboration skills
- Ability to work independently and as part of a team
- Strong problem-solving and analytical skills
- A passion for learning and staying updated with the latest ML technologies
Nice to Have:
- Familiarity with Docker and Kubernetes to integrate code with underlying layers of deployment
- Experience working on AWS or other public cloud providers
- Experience with ML usability tools such as MLflow, W&B, or AWS SageMaker

AI Fund
About the job
Posted on: Mar 28, 2024
Apply before: Apr 27, 2024
Job type: Full-time
Category: ML Engineer
Skills: machine learning, LLM, Deep Learning, management, aws