About HighLevel: HighLevel is an AI-powered business operating system that gives agencies, entrepreneurs and SMBs the infrastructure to build, automate and scale. Today, HighLevel supports SMBs across 150+ countries, fueling community-driven growth rooted in real customer outcomes. To date, businesses operating on HighLevel have generated over $7 billion in ecosystem value, demonstrating the impact of shared infrastructure at scale. By centralizing conversations, automation and intelligence into one system, we help businesses move faster, reduce complexity and execute efficiently. Behind the platform, HighLevel powers more than 4 billion API hits and 2.5 billion message events daily. With 250 terabytes of distributed data, 250+ microservices and over 1 million domain names supported, our architecture is built for performance, resilience and long-term scalability.
Our People: With over 2,000 team members across 10+ countries, HighLevel operates as a global, remote-first organization built for speed and ownership. We value initiative, clarity and execution, creating space for ambitious people to build systems that support millions of businesses worldwide. Here, innovation thrives, ideas are celebrated and people come first, no matter where they call home.
Our Impact: Every month, HighLevel enables more than 1.5 billion messages, 200 million leads and 20 million conversations for the more than 1 million businesses we support. Behind those numbers are real people building independence, expanding opportunity and creating measurable impact. We’re proud to be a part of that. Learn more about us on our YouTube Channel or Blog Posts.
About the Role:
We are looking for a Lead DevOps Engineer to drive visibility, efficiency, and savings across our multi-cloud infrastructure. You will lead efforts to measure and reduce Cost per Operation (CPO), help map infrastructure cost to product usage and revenue, and work closely with engineering teams to optimize their cloud footprints across GCP, AWS, and other providers.
Responsibilities:
Cloud Cost Optimization: Design and implement strategies to reduce operational costs across GCP, AWS, Firebase, and managed services like MongoDB Atlas and Elastic.co
Cost Observability: Build and maintain cost dashboards (using DoiT, GCP BigQuery, AWS Cost Explorer, etc.) to track spend across cloud providers and services at the product, team, and project level
CPO Analysis: Define and maintain Cost per Operation (CPO) metrics per product/service; collaborate with product owners to align infrastructure cost with revenue generation
Multi-Cloud Governance: Set up policies, guardrails, budgets, and alerts to prevent cost overruns and enforce efficient resource usage across Kubernetes clusters and cloud services
Automation & Tooling: Develop automation scripts (using Python, Bash, or Terraform) to detect idle resources, right-size workloads, and enforce tagging strategies
Billing Analysis: Break down cloud bills by team, sub_team, project, and service using label-based and usage-based filtering; provide actionable insights and recommendations
Collaboration: Partner with finance, DevOps, platform, and product teams to tie infrastructure cost back to product growth and engineering impact
Cloud Architecture: Provide guidance on designing cost-efficient cloud-native systems and help teams adopt best practices for storage, compute, and networking usage
Requirements:
Experience: 7+ years in Cloud Engineering roles with a focus on cost optimization
Cloud Expertise: Deep hands-on experience with GCP (BigQuery, GKE, Firebase, Pub/Sub), AWS (EC2, S3, Lambda, EKS), and managed services like MongoDB Atlas, Elastic.co, ClickHouse
Cost Tools: Strong working knowledge of DoiT Cloud Analytics, GCP Billing Export, AWS Cost and Usage Reports (CUR), and BigQuery
Scripting & Automation: Proficient in Python, Bash, and automation frameworks for cost cleanup and reporting
Data-Driven: Comfortable querying and visualizing cloud billing data to derive unit economics (e.g., cost per user, per API call, per deployment)
Kubernetes Knowledge: Familiar with cost management in Kubernetes (e.g., node cost allocation, workload optimization, spot/preemptible usage)
Communication: Ability to translate complex cost insights into actionable plans for engineering and business stakeholders