Principal Site Reliability Engineer

Job Description

Location & Notice

This role is only available to current residents of Ontario who are eligible to work in Canada (citizens, permanent residents, or holders of valid existing work permits).

About

Making fans: that’s what we’re all about! Since our humble beginnings in 2011, these two words have driven Big Viking Games to become the successful company it is today. We are focused on making our mark as a next-generation, live operations games company.

We believe that each Viking is unique and brings exponential value to the company. In return, we provide Vikings with the opportunity to solve problems through continuous learning and fostering feedback, ultimately mastering their craft. By investing in our Vikings and in new technologies, we can continue our success for years to come.

Description

Big Viking Games is looking for a Site Reliability Engineering expert with experience in security to join our team! You will design, develop, and maintain infrastructure that supports hundreds of thousands of active players across our flagship games, YoWorld and FishWorld. You will play a critical role in building our next-generation game platform, ensuring scalability and security.

What You'll Do:

  • Own and operate our cloud infrastructure (AWS), deployment pipelines (GitHub / ArgoCD), and production clusters (Kubernetes)
  • Design and implement the next iteration of our infrastructure, enabling us to serve 10x our current active users
  • Develop and oversee our incident management strategy, including observability, failover, recovery, postmortems, and long-term improvement of DORA metrics
  • Audit, develop, and drive our security strategy, including improving our current practices and enabling future compliance with established requirements (e.g., GDPR, CCPA)
  • Lead vendors / external contractors in large-scale transformation projects

Requirements

  • 10+ years of hands-on experience managing cloud infrastructure, ideally in AWS
  • 3+ years of experience operating Kubernetes/Docker in production, including deployment, scaling, and troubleshooting
  • Extensive experience with CI/CD tools such as Jenkins, GitLab CI, or similar
  • Expert-level knowledge of SRE best practices (incident management, DORA metrics, SLOs/SLIs, etc.)
  • Extensive experience with scripting, and with infrastructure-as-code tools such as Terraform
  • Experience building highly available and observable systems at scale

Benefits

  • Competitive salary, bonus program, Group Retirement Savings Plan matching & participation in the Employee Stock Option Plan
  • Comprehensive benefits package (health, dental, and vision), including generous time-off policies

Accommodations Notice:

Big Viking Games is committed to providing accommodations for people with disabilities. If you require an accommodation, we will work with you to meet your needs. Accommodation may be provided in all parts of the hiring process.