Senior Cloud DevOps/Site Reliability Engineer (SRE) - Canada

Vancouver, British Columbia, Canada

Job Description

This job posting has expired and is no longer accepting applications.

Why Join Inworld

Inworld is the best-funded startup in AI and gaming, with a $500 million valuation and backing from top-tier investors like Intel, Microsoft, Lightspeed, Bitkraft, Founders Fund, Kleiner Perkins, and more. Inworld was named to the CB Insights list of the 100 most promising AI companies in the world. We’ve also been nominated alongside Anthropic, DeepMind, OpenAI, and Nvidia for Generative AI Innovator of the Year at the VentureBeat Awards 2023, and we are a Gartner Cool Vendor in 2023.

Inworld is the leading character engine for creating AI NPCs in games and immersive entertainment. Inworld powers NPCs in experiences built by Niantic, NetEase Games, LG, Alpine Electronics, the Disney Accelerator, and more. We go beyond large language models (LLMs) to add multimodal orchestration of personality and contextual awareness that renders NPCs within the lore and logic of their worlds.

We are looking for a Senior Cloud DevOps/Site Reliability Engineer to keep the Inworld systems up and running with the highest level of availability, security, and performance.

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field
5+ years of experience as a site reliability, DevOps, and/or MLOps engineer
Experience administering and troubleshooting Linux systems
Understanding of CI/CD pipelines and Infrastructure as Code (Terraform or similar)
Experience working with cloud environments like AWS, Azure, or Google Cloud
Experience with Python or Golang
Experience with developer tools, productivity, or operations automation
Experience orchestrating large containerized deployments (Kubernetes: GKE, EKS, or similar)
Experience with setting up logging and monitoring pipelines (Prometheus, Grafana, Datadog, etc.)
Knowledge of different deployment strategies
Experience with high-performance, highly available distributed NoSQL and SQL databases, analytics engines, message brokers, and queueing systems
Experience designing, building, and maintaining machine learning inference and/or training infrastructure is considered a plus

Responsibilities

Work alongside the engineering team to ensure the delivery, scalability, and reliability of the Inworld services
Measure and monitor availability, latency, and overall service health
Create and support CI/CD pipelines
Organize multistage deployment environments
Drive incident management and post-mortem analysis

In-office location: Vancouver, Canada.

Remote location: Canada.
