Site Reliability Engineer (Salt Lake City)

Filevine

Software Engineering
Salt Lake City, UT, USA
Posted on Dec 23, 2025

Responsibilities

  • Design, implement, and maintain highly available, scalable infrastructure on AWS.
  • Automate infrastructure provisioning, deployment, and monitoring using IaC tools (e.g., Terraform, CloudFormation).
  • Monitor system health, performance, and capacity; proactively identify and resolve issues.
  • Participate in on-call rotation to respond to and resolve production incidents.
  • Collaborate with development teams to improve observability, logging, and alerting.
  • Drive continuous improvement in reliability through chaos engineering, load testing, and post-incident reviews.
  • Ensure security best practices and compliance requirements are embedded in our infrastructure.
  • Optimize costs while maintaining performance and reliability standards.

Qualifications

  • 3-5 years of experience in Site Reliability Engineering, DevOps, or similar roles.
  • Deep expertise with AWS services (e.g., EC2, ECS/EKS, RDS, Lambda, S3, VPC, CloudWatch).
  • Proficiency in infrastructure as code (Terraform preferred) and CI/CD pipelines.
  • Strong scripting/programming skills (e.g., Python, Bash, Go).
  • Experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, ELK stack).
  • Solid understanding of networking, Linux systems, and container orchestration (Docker).
  • Proven ability to troubleshoot complex, distributed systems issues.
  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.

Nice-to-have

  • AWS certifications (e.g., Solutions Architect, DevOps Engineer).
  • Experience in SaaS environments or regulated industries (legal tech a plus).
  • Familiarity with microservices, serverless architectures, and database reliability.
  • Passion for building resilient systems that support high-stakes workflows.