Role: Senior Machine Learning Engineer (ML Infrastructure & Data Systems)
Location: New York, NY (Onsite)
Our client is an early-stage robotics and AI company building autonomous systems that operate in real-world industrial environments. Their platform focuses on automating complex, mission-critical workflows using advanced machine learning and robotic systems.
The company is already deploying production-grade systems in live environments and is now entering a rapid scaling phase. Their approach emphasizes fast iteration, learning from real-world data, and continuously improving system performance through tight feedback loops between deployment and model training.
They are building toward large-scale deployments across industrial settings, with a long-term vision of making advanced automation broadly accessible through intelligent, adaptive robotic systems.
Position Overview:
This role will own the machine learning infrastructure and data platform that powers large-scale model training and deployment. As data volume rapidly grows, this person will design and scale systems that ingest, process, and serve massive multimodal datasets (including video) for real-time and offline training.
The ideal candidate combines deep experience in large-scale data systems with strong intuition for machine learning infrastructure, particularly in environments where reliability, performance, and iteration speed are critical.
Key Responsibilities:
- Own and scale the data platform to ingest high-volume streaming data (e.g., video) and make it available for real-time training workflows
- Build and manage end-to-end ML infrastructure, including distributed training, experiment tracking, and compute orchestration
- Design high-performance data access layers and storage systems for petabyte-scale multimodal datasets
- Partner closely with research teams to design, run, and iterate on experiments improving model performance
- Ensure reliability, scalability, and high availability of critical ML and data infrastructure systems
Qualifications:
✅ You likely fit if you:
- Have deep experience building and operating large-scale (PB+) data systems
- Are comfortable with real-time processing, streaming pipelines, and event-driven architectures
- Have built and scaled ML training and inference systems in production environments
- Thrive in high-autonomy environments with minimal oversight
- Are motivated by fast-paced environments with high ownership and impact
- Have a strong interest in robotics, autonomy, or physical AI systems
➕ Nice to Have:
- Experience with large-scale datasets in robotics or autonomous systems
- Hands-on experience with distributed training at scale (e.g., large GPU workloads)
- Familiarity with video processing, compression, and efficient storage systems
- Experience with reinforcement learning, imitation learning, or multimodal model pipelines
- Exposure to hardware-integrated systems in production environments
❌ This role is not for you if you:
- Prefer narrowly scoped roles without ownership of systems end-to-end
- Are uncomfortable working with large-scale, high-throughput data systems
- Prefer highly structured environments with clearly predefined requirements
What they offer:
- Base salary: $175k – $250k
- Equity participation
- Comprehensive benefits package
- Opportunity to build foundational infrastructure for real-world AI systems at scale
- High-impact role within a fast-growing, early-stage company