Staff Software Engineer, Machine Learning Infrastructure

Job Description

About the Machine Learning Infrastructure Team

At Thumbtack, we're solving complex technical challenges across search, ranking, recommendations, pricing optimization, and spam detection. Our ML Infrastructure team leads the architectural vision and implementation of enterprise-wide machine learning capabilities, enabling teams to effectively experiment with and deploy ML models at scale. We're building next-generation infrastructure that powers Thumbtack's AI-first future. For insights into our engineering challenges, visit our engineering blog.

Challenge

As a Staff Software Engineer on the ML Infrastructure team, you'll drive the technical vision and strategic direction of Thumbtack's machine learning platform. You'll architect solutions that democratize ML capabilities across the organization while establishing best practices and technical standards. Working closely with senior leadership, you'll shape our technical roadmap for generative AI adoption, feature platform evolution, and ML operational excellence.

Responsibilities

  •  Define and drive the technical vision and architecture for Thumbtack's next-generation ML infrastructure
  •  Lead cross-functional initiatives spanning engineering, data science, and product teams to build scalable, enterprise-grade ML systems
  •  Architect and oversee implementation of critical ML infrastructure components, including scalable model serving and retrieval-augmented generation (RAG) systems
  •  Establish technical standards and best practices for ML engineering across the organization
  •  Mentor and provide technical leadership to engineering teams on ML infrastructure best practices
  •  Partner with senior leadership to align ML infrastructure capabilities with business objectives

What you’ll need

If you don't think you meet all of the criteria below but are still interested in the job, please apply. Nobody checks every box, and we're looking for someone excited to join the team.

  •  8+ years of engineering experience with significant focus on distributed systems
  •  4+ years of hands-on experience building ML infrastructure or ML platforms at scale
  •  Deep expertise in at least one major programming language; proficiency in our core stack (Go, Python) preferred
  •  Proven track record of technical leadership on complex, cross-functional projects
  •  Strong architectural skills with experience designing scalable, reliable distributed systems
  •  Deep understanding of ML workflows, common frameworks, and operational challenges
  •  Experience mentoring teams and driving engineering excellence
  •  Track record of making strategic technical decisions with organization-wide impact

Bonus points if you have

  •  Experience building AI platforms that support hundreds of models in production
  •  Deep expertise with modern ML frameworks (PyTorch, TensorFlow) and MLOps tools
  •  Experience implementing generative AI capabilities at enterprise scale
  •  Track record of building high-performing technical teams
  •  Expertise with cloud-native architectures and major cloud providers (AWS, GCP)
  •  Experience driving technical strategy at fast-growing technology companies

Thumbtack is a virtual-first company, meaning you can live and work from any one of our approved locations across the United States, Canada or the Philippines.* Learn more about our virtual-first working model here.

For candidates living in San Francisco / Bay Area, New York City, or Seattle metros, the expected salary range for the role is currently $238,000 - $308,000. Actual offered salaries will vary and will be based on various factors, such as calibrated job level, qualifications, skills, competencies, and proficiency for the role.

For candidates living in all other US locations, the expected salary range for this role is currently $202,300 - $261,800. Actual offered salaries will vary and will be based on various factors, such as calibrated job level, qualifications, skills, competencies, and proficiency for the role.
