Machine Learning Research Engineer - Generative AI

Bangalore, India

Job Description

This job posting has expired and is no longer accepting applications.

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

THE ROLE:

The AI Models and Applications team is looking for an exceptional machine learning scientist / engineer to explore and innovate on architectures and training techniques for large language models (LLMs), large multimodal models (LMMs), image/video generation and other foundation models. You will be part of a world-class research and development team focussing on efficient and scalable pre-training, instruction tuning, alignment and optimization. As an early member of the team, you can help us shape the direction and strategy to fulfill this important charter.

THE PERSON:

This role is for you if you are passionate about reading through the latest literature, coming up with novel ideas, and implementing those through high quality code to push the boundaries on scale and performance. The ideal candidate will have both theoretical expertise and hands-on experience with developing LLMs, LMMs, and/or diffusion models. We are looking for someone who is familiar with hyper-parameter tuning methods, data preprocessing & encoding techniques and distributed training approaches for large models.

KEY RESPONSIBILITIES:

Pre-train and fine-tune models on large GPU clusters while optimizing for various trade-offs.
Improve upon the state-of-the-art in Generative AI model architectures and training techniques.
Accelerate training and inference across AMD accelerators.
Publish your work at top-tier conferences & workshops and/or through technical blogs.
Engage with academia and open-source ML communities.
Drive continuous improvement of infrastructure and development ecosystem.

PREFERRED EXPERIENCE:

Strong development and debugging skills in Python.
Experience in deep learning frameworks (like PyTorch or TensorFlow) and distributed training tools (like DeepSpeed or PyTorch Distributed).
Experience with fine-tuning methods (like RLHF & DPO) as well as parameter efficient techniques (like LoRA & DoRA).
Solid understanding of various types of transformers and state space models.
Strong publication record in top-tier conferences, workshops or journals.
Solid communication and problem-solving skills.

ACADEMIC CREDENTIALS:

Advanced degree (Master’s or PhD) in machine learning, computer science, artificial intelligence, or a related field is expected. Exceptional Bachelor’s degree candidates will also be considered.

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
