Senior ML Ops Engineer
Software Engineering, Operations, Data Science
Prague, Czechia
Responsibilities
Set up and maintain LLM observability frameworks/tools
Help improve data annotation tooling
Ensure stability of LLM calls (rate limits, provisioned throughput, backups, …)
Help drive security review processes for AI vendors and providers
Recommend LLM cost optimizations (caching, batching, identifying the workflow parts that cause high costs, etc.)
Host fine-tuned/open-weight machine learning models
Help with LLM evaluations (tooling/frameworks), currently focused mainly on agentic evals
Provide platform tooling that enables non-technical people (e.g. PMs) to iterate on prompts
Qualifications
5+ years building and operating software systems end-to-end
Hands-on experience with ML infrastructure: model serving, training pipelines, or LLM integrations in production
Strong understanding of cloud infrastructure and distributed systems (primarily AWS)
Familiarity with observability tooling and cost management for LLM workloads
Experience with or openness to: Python, Kubernetes, Terraform
Thrives in a remote-first, async environment: clear communicator, high ownership, low ego
Bonus: experience with eval frameworks, annotation tooling, or prompt management platforms