
MLE / MLOps
White Circle
Apr 10, 2026
Job description
Role details
- Location: Paris
- Employment Type: Full time
- Location Type: Hybrid
- Department: Research team
- Compensation: $100K – $150K • Offers Equity • Relocation package
What you'll do
- Own inference infrastructure end-to-end: optimize latency, throughput, and cost across our model fleet.
- Build and scale model serving with TensorZero, vLLM/SGLang/TensorRT, and Kubernetes.
- Design and maintain vector search pipelines backed by vector storage systems.
- Define service health KPIs, drawing on support metrics such as SLAs, FCR, and deflection rate.
- Turn research into product: take experimental models from the research team, assess what's production-ready, and ship it - formatting, sampling parameters, deployment, the whole thing.
Who you are
- 3+ years shipping high-performance ML systems in production, not just training notebooks.
- Deep hands-on experience with inference optimization - you've debugged latency spikes and know the difference between theoretical and real-world throughput.
- Comfortable across the stack: from CUDA kernels to Kubernetes manifests to Grafana dashboards.
- A big plus: experience with Rust, custom Triton kernels, and performance benchmarking.
Tech stack
- TensorZero
- vLLM/SGLang/TensorRT
- Kubernetes
- CUDA
- Rust (bonus)
- Custom Triton kernels (bonus)
- Grafana dashboards
Team description
Department: Research team
Benefits & perks
- Salary of $100,000 to $150,000 + equity
- 20 days of paid vacation
- Work from Paris (hybrid) + relocation package
- Best medical insurance in France
- All the hardware, tools, and services you need
- Covered subscriptions for AI agents and IDEs
- Team off-sites twice a year: Alps and Saint-Tropez
Please mention "I found this job at Remocate!"