As a Senior DevOps Engineer at Sword Health, you'll own and evolve the infrastructure that powers the world's leading AI Care platform. Working across a multi-cloud environment, you'll build and maintain the systems behind our frontends, backends, microservices, and data pipelines, collaborating closely with multiple engineering teams to keep everything reliable, scalable, and fast. You'll also interface with our AI teams as their models move into production, ensuring the infrastructure is ready to support them. If you thrive in cross-team environments and want your infrastructure work to directly impact healthcare at scale, we'd love to hear from you.
What you'll be doing:
Design, implement, and maintain scalable, resilient infrastructure to support Sword Health’s high-demand applications and services.
Automate and streamline deployment processes, CI/CD pipelines, and routine maintenance tasks to enhance efficiency and reduce downtime.
Monitor and optimize system performance, proactively identifying and resolving issues to ensure high availability and reliability.
Collaborate closely with development, data, and security teams to ensure seamless integration of infrastructure and code changes.
Drive security best practices by implementing and managing access control, network security, and compliance-related policies across the infrastructure.
Lead incident response and troubleshooting for infrastructure-related issues, ensuring rapid and effective resolution to maintain service continuity.
Mentor and guide junior team members, sharing DevOps best practices and fostering a culture of continuous learning and improvement within the team.
Stay up to date with industry trends and emerging technologies, bringing innovative solutions to Sword Health’s DevOps processes and toolchains.
What you need to have:
Experience with Linux environments.
Experience with DevOps and GitOps methodologies.
Experience with Kubernetes and containerized applications (Docker).
Experience with Infrastructure as Code (Terraform).
Experience with monitoring tools (Google Cloud Monitoring/Stackdriver, Grafana, Prometheus/Alertmanager, New Relic).
Experience with Jenkins.
Experience with CI/CD.
Team player with a solution-oriented, proactive attitude and a “get things done” mindset.
Enthusiastic about technology and innovation.