Job Description

Intuition Machines uses AI/ML to build enterprise security products. We apply our research to systems that serve hundreds of millions of people, with a team distributed around the world. You are probably familiar with our best-known product, the hCaptcha security suite. Our approach is simple: low overhead, small teams, and rapid iteration.

As a Site Reliability Engineer, you will focus on engineering solutions related to performance, availability, security, and cost-effectiveness. We consider these non-functional features to be core requirements for us and our customers. You will work at multiple layers of our internet-scale system (infrastructure, data, application logic) and build solutions across them.

We have a flat and highly customer-focused process, so you should be comfortable interacting directly with engineers at our large enterprise customers and startups alike when necessary, in conjunction with product, customer success, and sales teams.

What you’ll do:

Work with large-scale systems (handling millions of requests per second, serving millions of users, across multiple cloud providers).
Develop solutions to enhance performance, availability, security, and cost-effectiveness.
Keep us up, keep us fast, and keep our dev teams productive, ensuring that every release improves performance across the spectrum, including quality, security, uptime, speed-to-deliver, threat detection, and customer engagement.
Source improvement ideas, priorities, and capabilities from customers, the internal community, and new and existing system metrics. Make decisions rapidly.
Be creative, and seek out an environment where you can directly create value and improve the experience for our customers.

What we’re looking for:

Strong programming skills in one or more of the following languages: Python, JavaScript, Go.
Background in software engineering with expertise in backend development within Kubernetes-based systems.
Hands-on experience in development and orchestration within high-scale, high-uptime, and high-reliability environments.
Minimum of six years of hands-on experience in related roles (engineering, DevOps, SRE).
Familiarity with distributed systems, including queue-first architectures and sharding.
Demonstrated engineering expertise, including gathering requirements, problem-solving, and making recommendations.
Preferred: Familiarity with security frameworks, attack vectors, botnets, and impact analysis.

What we offer:

Fully remote position with flexible working hours.
An inspiring team of colleagues spread all over the world.
Pleasant, modern development and deployment workflows: ship early, ship often.
High impact: lots of users, happy customers, high growth, and cutting-edge R&D.
Flat organization, direct interaction with customer teams.

We celebrate diversity and are committed to creating an inclusive environment for all members of our team.

Join us as we transform cybersecurity, user privacy, and machine learning online!