Mirage
Company
Software Engineer, ML Data Platform
Job Description
Mirage is the leading AI short-form video company. We’re building full-stack foundation models and products that redefine video creation, production and editing. Over 20 million creators and businesses use Mirage’s products to reach their full creative and commercial potential.
We are a rapidly growing team of ambitious, experienced, and devoted engineers, researchers, designers, marketers, and operators based in NYC. As an early member of our team, you’ll have an opportunity to have an outsized impact on our products and our company's culture.
Our Investors
We’re very fortunate to have some of the best investors and entrepreneurs backing us, including Index Ventures, Kleiner Perkins, Sequoia Capital, Andreessen Horowitz, Uncommon Projects, Kevin Systrom, Mike Krieger, Lenny Rachitsky, Antoine Martin, Julie Zhuo, Ben Rubin, Jaren Glover, SVAngel, 20VC, Ludlow Ventures, Chapter One, and more.
** Please note that all of our roles require you to work in person at our NYC HQ (located in Union Square).
We do not work with third-party recruiting agencies; please do not contact us. **
About the Role
We’re looking for a Software Engineer to help build and scale the data systems that power our machine learning products. This role sits at the intersection of data engineering and ML infrastructure: you’ll design large-scale streaming pipelines, build tools that abstract infrastructure complexity for feature developers, and ensure that our feature data is reliable, discoverable, and performant across online and offline environments. If you’re passionate about building foundational systems that enable machine learning at scale — and love solving complex distributed data problems — this is the role for you.
What You’ll Do
Design and scale feature pipelines: Build distributed data processing systems for feature extraction, orchestration, and serving — including real-time streaming, batch ingestion, and CDC workflows.
Feature Extraction: Design and implement reliable, reusable feature pipelines for ML models, ensuring features are accurate, scalable, and production-ready through well-designed SDKs and orchestration tools.
Build and evolve storage infrastructure: Manage multi-tier data systems (e.g. Bigtable for online features/state, BigQuery for analytics and offline training), including schema evolution, versioning, and compatibility.
Own orchestration and reliability: Lead workflow orchestration design (e.g. Pub/Sub, Busboy, Airflow/Temporal), monitoring, and alerting to ensure reliability at 100M+ video scale.
Collaborate with ML teams: Partner with ML engineers on feature availability, dataset curation, and streaming pipelines for training and inference.
Optimize for performance and cost: Tune GPU utilization, resource allocation, and data processing efficiency to maximize system throughput and minimize cost.
Enable analytics and insights: Support downstream analytics and data science workflows by ensuring data accessibility, discoverability, and performance at scale.
Preferred Qualifications
4+ years building distributed data systems, feature platforms, or ML infrastructure at scale.
Strong experience with streaming and batch pipelines (e.g. Pub/Sub, Kafka, Dataflow, Beam, Flink, Spark).
Deep knowledge of cloud-native data stores (e.g. Bigtable, BigQuery, DynamoDB, Snowflake) and schema/versioning best practices.
Proficiency in Python and experience building developer-facing libraries or SDKs.
Experience with Kubernetes, containerized data infrastructure, and workflow orchestration tools (e.g. Airflow, Temporal).
Familiarity with ML workflows and feature store design — enough to partner closely with ML teams.
Bonus: Experience working with video, audio, or other unstructured media data in a production environment.
Benefits:
Comprehensive medical, dental, and vision plans
401(k) with employer match
Commuter Benefits
Catered lunch multiple days per week
Dinner stipend every night if you're working late and want a bite!
Grubhub subscription
Health & Wellness Perks (Talkspace, Kindbody, One Medical subscription, HealthAdvocate, Teladoc)
Multiple team offsites per year with team events every month
Generous PTO policy
Captions provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Please note that benefits apply to full-time employees only.