Job Description

Who We Are

Our goal is to save ten million hard-working employees ten billion dollars. We are a values-driven, well-funded, and fast-growing financial technology and HR company. We want to empower small and midsize businesses with financial tools that make them the place where people want to work.

We’ve created a financial empowerment platform that helps small but mighty HR teams make a big impact on employee financial wellness. ZayZoon is quickly becoming the employee financial wellness super-app that employees can’t live without, and employers are clamoring to offer to help attract and retain talent.

We are growing fast and have been recognized for rapid growth in the 2023 Deloitte Technology Fast 500 and Canadian Technology Fast 50 programs!

About the Role

We are looking for a Senior Site Reliability Engineer to take ZayZoon’s cloud infrastructure to the next level with complex AWS builds, infrastructure-as-code, and observability/logging/APM solutions. You’ll work in an embedded reliability team, alongside app and data engineers, to monitor, benchmark, and scale ZayZoon’s products. You will work with first-class technologies and staff to leverage all the goodies AWS has to offer, and you will create a bridge between our bare-metal infrastructure and our Ruby on Rails production app. Predictability, reliability, and scalability are your three favourite words.

Develop and maintain infrastructure-as-code CloudFormation templates, emphasizing serverless resources (ECS, Fargate, Lambda)
Instrument and analyze daily metrics for both infrastructure performance and our Ruby on Rails applications, using AWS tooling (Athena, CloudTrail, etc.) and third-party observability platforms (DataDog, OTel)
Manage deployment pipelines, including blue/green and intelligent auto-scaling
Maintain and stay ahead of resource dependencies, particularly databases (RDS, Redshift), including updates, playbooks, and downtime planning
Project costs and implement AWS cost-savings programs and reserved instances
Work alongside our risk and security teams to ensure ongoing SOC 2 and cybersecurity compliance
Collaborate extensively with app developers on shared metrics, database performance, and load testing
Collaborate extensively with data engineers to facilitate data warehouse development, ELT, and ETL
Participate in our agile development process: sprint planning, story grooming, and stand-ups
Adhere to our SDLC and secure coding practices and environment

Requirements

5+ years of infrastructure experience
2+ years of AWS experience, including certification and deployment of production applications
Proficiency with IaC, specifically CloudFormation
Experience with containerization (Docker, ECS, ECR)
Experience analyzing and acting on performance issues using observability platforms (DataDog, NewRelic, OTel)
Ability to build quickly when we need to experiment and build cleanly when an MVP becomes core functionality
Strong SQL and data analysis skills and an eagerness to dig into data as part of problem solving

Benefits

Candidates must be located in Canada to be considered.

We are organized as a remote team; as such, we are looking for candidates who can work effectively remotely. You must have access to a secure high-speed internet connection and a secure workspace to ensure the security of private information. This role is available on a permanently remote basis.

Please be aware that as part of our final hiring process, we will conduct reference calls with previous managers and possibly other individuals. Additionally, due to the nature of our business, a criminal record check and a basic security clearance will also be required.

We wish to thank all qualified applicants for their interest in joining our team!

