MLE / MLOps

White Circle
🏠 Remote
✈️ Relocation
Apr 10, 2026
Job description

Role details

  • Location: Paris
  • Employment Type: Full time
  • Location Type: Hybrid
  • Department: Research team
  • Compensation: $100K – $150K • Offers Equity • Relocation package

What you'll do

  • Own inference infrastructure end-to-end: optimize latency, throughput, and cost across our model fleet.
  • Build and scale model serving with TensorZero, vLLM/SGLang/TRT, and Kubernetes.
  • Design and maintain vector search pipelines on top of vector stores.
  • Define service health KPIs, drawing on support metrics (SLAs, FCR, deflection).
  • Turn research into product: take experimental models from the research team, figure out what's production-ready, and ship it, including formatting, sampling parameters, and deployment.

Who you are

  • 3+ years shipping high-performance ML systems in production, not just training notebooks
  • Deep hands-on experience with inference optimization: you've debugged latency spikes and know the difference between theoretical and real-world throughput
  • Comfortable across the stack: from CUDA kernels to Kubernetes manifests to Grafana dashboards
  • A big plus: experience with Rust, custom Triton kernels, and benchmarking

Tech stack

  • TensorZero
  • vLLM/SGLang/TRT
  • Kubernetes
  • CUDA
  • Rust (bonus)
  • Custom Triton kernels (bonus)
  • Grafana dashboards

Team description

Department: Research team

Benefits & perks

  • Salary of $100,000 to $150,000 + equity
  • 20 days of paid vacation
  • Work from Paris (hybrid) + relocation package
  • Best medical insurance in France
  • All the hardware, tools, and services you need
  • Covered subscriptions for AI agents and IDEs
  • Team off-sites twice a year: Alps and Saint-Tropez
Please mention "I found this job at Remocate!"