Site Reliability Engineer, Embedded

Job Description

About the Role: 

As a Site Reliability Engineer on the Embedded team, you will play a crucial role in helping us design, scale, and manage our growing AWS-backed services for millions of connected IoT devices. Your expertise in cloud-native, highly elastic service design and scaling practices will ensure that our growing services, as well as new products and features, operate smoothly and without manual intervention to meet Motive’s 99.99% availability SLOs. By leveraging and advancing our robust, fully codified infrastructure and Kubernetes environment, implementing AWS components thoughtfully, and troubleshooting advanced issues alongside other teams, you can be a large part of Motive’s growth to the next million devices and beyond.

What You’ll Do: 

  • Collaborate with other engineering and product teams to design and build the infrastructure and services required to deliver new features to customers in a cloud-native and event-driven fashion.
  • Leverage and advance our infrastructure-as-code (Terraform) and configuration management (Helm) code and strategies to support advanced scaling and self-service usage by engineering teams.
  • Identify and remove bottlenecks in production systems across AWS services and our Kubernetes platform.
  • Ensure 99.99% customer-facing uptime.
  • Continuously improve the monitoring and alerting capabilities of our platform, enabling us to be proactive instead of reactive.
  • Be a beacon of information for engineering on scaling, architecture, and observability through guides, codification, brown bags, and Tech Talks.

What We’re Looking For: 

  • 4+ years of professional SRE/DevOps experience and a demonstrated ability to work on high-volume production systems.
  • Experience with the Horizontal Pod Autoscaler (HPA) and other Kubernetes scaling mechanisms.
  • Advanced knowledge of AWS services and technologies (ALB/ELB, IAM permissions, DynamoDB, SNS, EKS/Fargate, etc.).
  • Experience with infrastructure as code and configuration management (Terraform and Helm charts especially) to design and provision new services.
  • Knowledge of Python, Bash, or other scripting languages. Knowledge of Ruby or Golang is a plus.
  • A high level of ownership and the drive to work with others and see improvements through to production.