Intetics

1058 | SRE / DevOps / Infrastructure Engineer


Job Description

Intetics Inc., a global technology company providing custom software application development, distributed professional teams, software product quality assessment, and “all-things-digital” solutions, is seeking a highly skilled and experienced Senior SRE / DevOps / Infrastructure Engineer to join our dynamic team on a full-time basis.

About the Project
A fast-growing tech company is building an infrastructure layer for modern AI workloads — a globally distributed platform that provides scalable, cost-efficient, and reliable access to GPU computing resources.

The platform enables customers to run production-level inference workloads across a diverse network of providers, offering the flexibility, performance, and resilience required for real-world AI applications.

Since its launch, the company has demonstrated strong traction, securing a significant Series A investment and achieving multi-million ARR within its first year of operation. As both customer demand and platform scale continue to expand, the team is actively growing its infrastructure capabilities to support the next stage of development.

About the Role
We are looking for a strong SRE / DevOps / Infrastructure Engineer to help scale and operate a distributed AI-focused infrastructure platform.

The system combines a cloud-based control layer (running on AWS, including EKS and managed MySQL) with a large fleet of GPU-powered nodes distributed across multiple external providers. These components are connected via a custom networking layer to ensure high availability and performance for production workloads.

Workloads are orchestrated with Kubernetes, while observability is built around Prometheus, Grafana, Loki, Jaeger, and OpenTelemetry, covering metrics, logging, and tracing across the platform.

While the control layer is relatively lightweight and cloud-native, the GPU infrastructure introduces additional complexity. It spans different providers and environments, often resembling distributed on-premises setups rather than standard cloud infrastructure, and requires a deeper understanding of networking, reliability, and systems behavior at scale.

This is a hands-on role focused on solving real infrastructure challenges across Kubernetes, networking, observability, and production operations.

You will join a small, high-impact infrastructure team (currently a couple of engineers) that is actively growing as the platform and customer base continue to expand. The goal is to strengthen the core infrastructure early and support further scaling.

What you’ll do

  • Build, operate, and improve the infrastructure powering the company’s distributed inference platform
  • Own reliability, scalability, and operational excellence across AWS-based control planes and our multi-provider GPU fleet
  • Design and maintain the networking layer connecting control planes, Kubernetes clusters, and geographically distributed GPU hosts
  • Operate and improve Kubernetes-based inference orchestration, primarily on EKS
  • Manage deployments and infrastructure changes using Helm, FluxCD, and Terraform
  • Improve observability across the platform using metrics, logs, traces, dashboards, and alerting built on Prometheus, Grafana, Loki, Jaeger, and OpenTelemetry
  • Tune alerts, improve runbooks, and strengthen operational readiness as the system scales
  • Respond to production issues, perform root cause analysis, and implement durable fixes
  • Work closely with engineers across time zones using clear asynchronous communication and handoff practices, especially through Slack
  • Help expand Europe-based infrastructure coverage to support sustainable operations outside US business hours

Requirements

  • 5+ years of experience in SRE, DevOps, platform engineering, or infrastructure engineering
  • Strong production experience with networking and Kubernetes
  • Experience operating AWS infrastructure in production, especially EKS
  • Strong hands-on experience managing Linux hosts, clusters, and distributed systems in environments that are not fully abstracted by a major cloud provider
  • Experience with Prometheus, Grafana, Loki, Jaeger, and OpenTelemetry
  • Experience with deployment and GitOps workflows using tools such as Helm and FluxCD
  • Experience with infrastructure as code, ideally Terraform
  • Familiarity with alert tuning, runbook development, and practical incident management in production systems
  • Strong operational judgment: able to troubleshoot independently, respond calmly to incidents, and improve systems without constant direction
  • Comfortable working in a fast-moving startup where infrastructure, product, and customer demands are changing quickly
  • Clear communicator who can work effectively in an async environment and handle shift handoffs cleanly

Nice to have

  • Experience with AI inference, ML infrastructure, or adjacent high-performance distributed systems
  • Experience operating heterogeneous GPU fleets, bare-metal infrastructure, or multi-provider compute environments
  • Experience using AI tools productively in engineering workflows