Loadsmart

Site Reliability Engineer, Tech Lead


Job Description

In the SRE, Tech Lead role, you will build and maintain the company's internal platform, driving operational excellence and empowering the entire engineering team. You should have experience in analyzing, proposing, and implementing safer systems and processes. As a hands-on leader, you will stay actively involved in technical work while collaborating closely with internal stakeholders and engineering squads across the organization to ensure our applications are both safe and reliable.

DEPARTMENT: Engineering

LOCATION: Anywhere in Brazil - Remote



WHAT YOU GET TO DO
  • Collaborate with and support our creative, tight-knit development team.
  • Design, deploy, and operate Loadsmart's critical systems while balancing reliability, cost, and agility.
  • Play a key role in driving reliability projects with engineering teams.
  • Utilize your intuitive problem-solving skills and contagious positive attitude to tackle challenging and exciting issues, inspiring those around you.
  • Collect metrics and understand their business impact, encouraging the team to do the same.
  • Perform troubleshooting and root-cause analysis of system operation issues.
  • Be accountable for the platform's Service Level Agreements and Objectives.
  • Provide infrastructure support during off-hours as needed.
  • Take ownership of software infrastructure projects.
  • Seek, give, and receive constructive feedback through code and specification reviews.

REQUIRED QUALIFICATIONS
  • 2+ years of experience in a Tech Lead or similar leadership role.
  • 5+ years of experience in cloud computing and SRE/DevOps.
  • Proven experience collaborating with internal stakeholders across multiple engineering squads.
  • Strong project management skills with a demonstrated ability to delegate and mentor team members.
  • Proficient in English communication (both written and spoken) to collaborate in an international team with native and non-native English speakers.
  • Detail-oriented with high initiative and self-motivation.
  • Strong understanding of software engineering principles and how systems work under the hood.
  • In-depth knowledge of modern networking and operating systems.
  • Proficiency in AWS, cloud environments, containers, Kubernetes, Docker, and DevOps engineering, including managing tests and CI/CD pipelines.
  • Familiarity with automation tools and provisioners like Terraform, Ansible, or Chef.
  • Solid troubleshooting and system engineering experience in UNIX/Linux production environments.
  • Experience with monitoring, alerting, and incident management.
  • Proficiency in automating tasks with scripting languages such as Python or Bash.
  • Experience with or exposure to PostgreSQL and DBA responsibilities is a plus.