1 week ago

Tech Lead - Infrastructure Engg

Bangalore

About Us

Observe.AI is the fastest way to boost contact center performance with live conversation intelligence. Built on the most accurate AI engine in the industry, Observe.AI uncovers insights from 100% of customer interactions and maximizes frontline team performance through coaching and end-to-end workflow automation. With Observe.AI, companies can act faster with real-time insights and guidance to improve performance, from more sales to higher retention.

Observe.AI is trusted by hundreds of customers and partners, including Pearson, Accolade, Group 1 Automotive, Southeast Trans, and Public Storage. Our $125 million Series C, led by SoftBank Vision Fund 2 with participation from Zoom Video Communications, Inc., brings our total funding to date to $213M, with investments from Menlo Ventures, Next47, NGP Capital, Emergent Ventures, Scale Ventures, Nexus Ventures, and Y Combinator. For more information, visit www.observe.ai.

The Opportunity

We are an AI-driven company pushing the boundaries of innovation with cutting-edge machine learning and AI models. Our infrastructure is at the core of our AI workflows, and we are looking for a Tech Lead - Infrastructure Engineering to drive our AI infrastructure's reliability, scalability, and performance.

About the Team

Our Infrastructure/DevOps team is a dynamic group of skilled engineers operating in a fast-paced Agile environment. We manage a robust multi-region infrastructure across the globe, leveraging AWS, Kubernetes, and Harness for efficient deployments and seamless application runtime management.

Collaboration is at our core, with daily stand-ups and bi-weekly sprints ensuring alignment and continuous progress. Innovation thrives here; team members are encouraged to experiment with new technologies and share ideas that drive impactful solutions.

We foster growth through mentorship programs, regular skill development workshops, and ample career advancement opportunities.

What you’ll be doing

  • Manage Self-Hosted Tools: Lead the transition from managed services to self-hosted Elasticsearch, Prometheus, and other critical infrastructure components to optimize performance and cost.
  • Optimize AI Infrastructure: Work closely with ML engineers and data scientists to efficiently deploy and scale AI/ML models, ensuring high availability and low-latency inference.
  • Infrastructure Scalability & Reliability: Design and implement scalable, fault-tolerant systems capable of handling large-scale AI workloads, distributed training, and high-throughput data pipelines.
  • Technology Evaluation & Implementation: Continuously assess and introduce new technologies to enhance automation, reliability, and security in AI model deployment and training pipelines.
  • CI/CD for AI Workflows: Enhance and automate ML model deployment pipelines using MLOps best practices and tools like Kubeflow, MLflow, and Argo Workflows.
  • Observability & Monitoring: Implement and enhance monitoring, logging, and alerting strategies using Prometheus, Grafana, ELK, OpenTelemetry, etc., tailored for AI workloads.
  • Security Best Practices: Implement security measures for AI data pipelines, model storage, and cloud infrastructure.

What you bring to the role

  • 8+ years of experience in DevOps, SRE, or Cloud Infrastructure roles, preferably in AI or data-intensive environments.
  • Strong expertise in Kubernetes (EKS, AKS preferred) for deploying AI workloads and managing GPU and non-GPU clusters.
  • Experience with self-hosting services like Elasticsearch, Prometheus, Grafana, Kafka, etc.
  • Hands-on expertise in Infrastructure as Code (Terraform, CloudFormation).
  • Deep understanding of cloud platforms (AWS, Azure, GCP) and AI-focused services like AWS SageMaker, Vertex AI, or Azure ML.
  • Strong automation and scripting skills in Python, Bash, or Go.
  • Experience in CI/CD tools (Jenkins, GitHub Actions, ArgoCD, etc.) with a focus on AI model deployment.
  • Strong leadership and mentorship skills to guide DevOps and ML teams.

Nice to have:

  • FinOps expertise for optimizing GPU and AI cloud compute costs.
  • Familiarity with service meshes (Istio, Linkerd) and API gateways.
  • Knowledge of compliance frameworks (SOC2, ISO 27001, etc.) for AI data pipelines.

Why join us?

  • Work at the intersection of AI and cloud infrastructure, solving real-world AI scaling challenges.
  • Collaborate with AI researchers and ML engineers to optimize AI model deployment.
  • Competitive salary, benefits, and opportunities for career growth.
  • Flexible work environment with a cutting-edge tech stack.

Compensation, Benefits, and Perks

  • Excellent medical insurance options and free online doctor consultations
  • Yearly privilege and sick leaves as per Karnataka S&E Act
  • Generous holidays (national and festive), recognition, and parental leave policies
  • Learning & Development fund to support your continuous learning journey and professional development
  • Fun events to build culture across the organization
  • Flexible benefit plans for tax exemptions (e.g., meal card, PF)

Our Commitment to Inclusion and Belonging

Observe.AI is an Equal Employment Opportunity employer that proudly pursues and hires a diverse workforce. Observe.AI does not make hiring or employment decisions on the basis of race, color, religion or religious belief, ethnic or national origin, nationality, sex, gender, gender identity, sexual orientation, disability, age, military or veteran status, or any other basis protected by applicable local, state, or federal laws or prohibited by Company policy. Observe.AI also strives for a healthy and safe workplace and strictly prohibits harassment of any kind.

We welcome all people. We celebrate diversity of all kinds and are committed to creating an inclusive culture built on a foundation of respect for all individuals. We seek to hire, develop, and retain talented people from all backgrounds. Individuals from non-traditional backgrounds and from historically marginalized or underrepresented groups are strongly encouraged to apply.

If you are ambitious, make an impact wherever you go, and you're ready to shape the future of Observe.AI, we encourage you to apply. For more information, visit www.observe.ai
