Engineering
ML Infrastructure Engineer
San Francisco / Remote | Full-time | $200,000 - $280,000
About the Role
Join us to build the infrastructure that powers AI inference at scale. You'll design systems that efficiently route requests to the best models, handle failover, and optimize for cost and latency.
Responsibilities
- Design and build model serving infrastructure
- Implement intelligent request routing and load balancing
- Optimize inference latency and throughput
- Build monitoring and observability systems
- Work with AI providers to integrate new models
- Research and implement caching strategies
Requirements
- 5+ years of software engineering experience
- Experience with ML systems and model serving
- Strong knowledge of Python and systems programming
- Experience with Kubernetes, Docker, and cloud infrastructure
- Understanding of ML model formats and optimization techniques
- Experience with high-throughput distributed systems
Nice to Have
- Experience with CUDA and GPU optimization
- Background in building inference APIs
- Knowledge of transformer architectures
- Experience with Ray, vLLM, or similar tools
Benefits
- Competitive salary + equity
- 100% health, dental, and vision coverage
- Unlimited PTO
- Remote-first with quarterly team retreats
- $2,000 annual learning budget
- $1,500 home office stipend