Engineering

ML Infrastructure Engineer

San Francisco / Remote · Full-time · $200,000 - $280,000
About the Role

Join us to build the infrastructure that powers AI inference at scale. You'll design systems that route each request to the best-suited model, handle failover, and optimize for cost and latency.

Responsibilities

  • Design and build model serving infrastructure
  • Implement intelligent request routing and load balancing
  • Optimize inference latency and throughput
  • Build monitoring and observability systems
  • Work with AI providers to integrate new models
  • Research and implement caching strategies

Requirements

  • 5+ years of software engineering experience
  • Experience with ML systems and model serving
  • Strong knowledge of Python and systems programming
  • Experience with Kubernetes, Docker, and cloud infrastructure
  • Understanding of ML model formats and optimization techniques
  • Experience with high-throughput distributed systems

Nice to Have

  • Experience with CUDA and GPU optimization
  • Background in building inference APIs
  • Knowledge of transformer architectures
  • Experience with Ray, vLLM, or similar tools

Benefits

  • Competitive salary + equity
  • 100% health, dental, and vision coverage
  • Unlimited PTO
  • Remote-first with quarterly team retreats
  • $2,000 annual learning budget
  • $1,500 home office stipend

How to Apply

Send your resume and a brief note about why you're interested to:

careers@abstrakt.one
