Fliff is pioneering play-for-fun sports gaming, and we're looking for a talented Staff DevOps Engineer to help us power real-time, high-traffic user experiences. In this role, you'll solve complex distributed systems problems in a fast-paced environment and partner closely with Product, Engineering, and Security teams.
Who We're Looking For
We need someone who is passionate about building scalable, reliable, and secure platforms that support large volumes of real-time traffic. You should bring a platform mindset and an automation-first philosophy, and be ready to drive architectural decisions that improve engineering velocity and system reliability.
What You'll Do
As our Staff DevOps Engineer, you'll:
- Architect, build, and operate highly available, high-performance infrastructure to support large volumes of real-time traffic during peak sports windows.
- Lead the design and development of internal platforms, tooling, and automated workflows that accelerate engineering productivity.
- Own and improve observability (metrics, logs, traces), monitoring, and alerting to ensure fast detection and resolution of issues.
- Drive incident response, root-cause analysis, and reliability improvements across the engineering organization.
- Enhance and maintain our Kubernetes-based platform, including Helm charts, multi-environment pipelines, cluster upgrades, and operational hardening.
- Implement and evolve security best practices around IAM, network architecture, secrets management, and infrastructure governance.
- Build and maintain CI/CD pipelines that support safe, rapid, and stable deployments.
- Partner with engineering leadership on capacity planning, cost optimization, and infrastructure roadmap planning.
- Mentor and guide engineers across teams, influencing architectural direction and DevOps best practices.
Required Skills & Experience
We're looking for someone with:
- 7+ years of SRE/DevOps/Cloud Infrastructure experience supporting production systems at scale.
- Proven experience running distributed systems and microservices in high-traffic or high-availability environments.
- Deep knowledge of Kubernetes, Helm, and cloud-native architecture (preferably AWS).
- Strong proficiency in Go or another backend programming language.
- Exposure to PostgreSQL, Redis, DynamoDB, or Cassandra performance tuning in a production environment.
- Deep familiarity with event-driven systems, streaming pipelines, or messaging platforms such as Kinesis, Kafka, Pub/Sub, or RabbitMQ.
- Advanced hands-on experience with Terraform or a comparable IaC solution.
- Strong practical understanding of observability stacks (Datadog, Prometheus, Grafana, OpenTelemetry, etc.).
- Experience implementing SLOs, SLIs, error budgets, and reliability-driven engineering practices.
Benefits
As part of our team, you'll enjoy:
- An annual salary ranging from $180,000 to $220,000, depending on experience and background.
- Unlimited, flexible time off.
- Health benefits with 100% paid premiums* for medical, dental, and vision plans for employees and dependents, plus an on-demand healthcare concierge.
- Pre-tax savings plans for healthcare, with up to a $500 annual employer contribution to the HSA (if enrolled in the HSA medical plan).
- Employer-sponsored 401(k) to help you reach your financial goals.
- Fully remote work environment.
- Generous parental leave.
- Professional development opportunities in a dynamic, global setting.
Perks
You'll also enjoy:
- A $500 work-from-home stipend, plus equipment and accessories.
- A supportive, collaborative, and knowledge-driven workplace.
- An engaging and challenging role with the freedom to innovate and develop effective solutions.