Alex Staff Agency

Senior DevOps Engineer - Highload, Cloud & Data-Intensive Systems (EU / Remote)

Salary (taken directly from the job description or estimated from typical salaries for similar roles in this industry):
$64800 - $103200

Job Description

About the project
The team develops and maintains distributed services for analytics, APIs, and transaction monitoring. The systems process very large volumes of data: terabytes of storage, trillions of records, and a continuously growing load.

Infrastructure:

~100 servers (bare metal + VPS)
active use of IaC
Kubernetes clusters in production
focus on stability, observability, and automation

The project is long-term — not a hype startup, but a mature product with real users.

What the work looks like
This is a hands-on role with a clear time allocation:

60% — operations and incidents (including helping teams)
20% — infrastructure automation
20% — prototyping, improvements, technical initiatives

The role includes on-call duty, but after-hours incidents normally happen 2–3 times a year, not every week.

Responsibilities
Operation of production services and infrastructure (server provisioning/decommissioning, updates, replacements, performance troubleshooting)
Support and development of Infrastructure as Code (Terraform / Ansible: modules, roles, standards, reviews)
Monitoring, alerting, backups, and regular recovery checks
Development of service and infrastructure automation
Development of CI/CD and release procedures
Incident diagnosis and resolution, support for product teams
Traffic analytics, bot and attack protection tools
Responsibility for 24/7 platform stability

Requirements

What’s important
4+ years of experience operating Linux/Ubuntu infrastructure and production services
Strong understanding of networking and troubleshooting
Kubernetes (cluster operations), Rancher, Docker / containerd
Hands-on experience with Ansible and Terraform
Monitoring: Prometheus / Thanos / Telegraf / Grafana / Sentry
CI/CD: Jenkins
Automation: Bash, Python
Experience working with LVM

Nice to have
Experience working with blockchain nodes
Diagnosis and tuning of ClickHouse and MongoDB in high-load clusters
Providers: Hetzner / OVHcloud
Cloudflare (edge, DDoS), experience with AWS
Handling abuse tickets with hosting providers

Technology stack
VPN: WireGuard, OpenVPN
Databases: ClickHouse, MongoDB, Redis, PostgreSQL
Applications: Node.js (pm2), php-fpm, Lua, Tarantool
Supporting services: Go (Operator SDK), Ruby, Node.js, PHP

Benefits

5,000 – 8,000 € net

Format: office / hybrid / remote

Location: Spain (Barcelona and suburbs) or remote (CET ±2 hours)

Full-time

Opportunity to genuinely influence architecture and processes

Mature engineering team and reasonable expectations