As a Staff DevOps Engineer, you will help drive the overall architecture for the Agiloft CloudOps systems. This includes designing scalable, resilient systems using DevOps and security best practices. You will also collaborate cross-functionally and lead DevOps initiatives, providing expertise to advance departmental goals. As an expert on the CloudOps team, you may assume a team lead role for various projects and initiatives and advise on complex technical issues.
Job Responsibilities
Architect and implement highly available, scalable, performant, maintainable and resilient systems using industry standard tools and frameworks.
Define and enforce best practices for CI/CD, IaC, and automation.
Leverage AI-assisted tools to enhance productivity, improve processes and workflows, and support everyday work.
Design and implement advanced security measures, including identity and access management.
Collaborate with cross-functional teams and guide DevOps initiatives.
Conduct performance analysis and optimization for complex systems.
Lead major infrastructure migrations and upgrades.
Develop and maintain documentation and standards.
Participate in on-call rotation
Other duties as assigned
Required Qualifications
Bachelor's degree in Computer Science, Information Technology, or related field (or equivalent experience).
7+ years of related experience
Expertise in DevOps principles, practices, and technologies, including Amazon Web Services (AWS) and Terraform or other Infrastructure as Code (IaC) tools
Advanced knowledge of Linux operating systems and troubleshooting OS issues
Highly skilled in setting up and managing monitoring tools (such as Prometheus, Grafana, Datadog, Nagios, OpenTelemetry, ELK, or similar tools)
Advanced knowledge of scripting languages and automation using Python, Bash, or Ruby
Proficiency in using relevant AI tools in the SDLC (for example, GitHub Copilot, JetBrains AI Assistant)
Deep understanding of:
Networking concepts and principles
Version Control Systems (such as Git)
CI/CD tools such as Jenkins, GitLab CI/CD, GitHub, or similar tools
Containerization and orchestration (Docker, Kubernetes)
Expertise with cloud platforms (AWS, Azure, or Google Cloud).
Superior problem-solving, troubleshooting/debugging, and communication skills.
Proven experience in security best practices, including identity and access management, encryption, and vulnerability assessments
Eagerness to learn and adapt to new technologies and tools.
Willingness to participate in the on-call rotation schedule with the rest of the team
Experience with adopting AI to supplement code development (Cursor, Copilot)
Experience with managing and optimizing vector databases for AI/ML (e.g., PostgreSQL vectordb)
Preferred Qualifications
AWS certifications
Knowledge of AWS Security Architecture
Knowledge of Agile methodology
To Move to the Next Level
Champion, design, and implement highly significant architectures, tools, and processes that have a material impact on the team and business operations (replacing manual execution, significantly reducing operational costs, improving customer satisfaction)
Demonstrate mastery and act as a technical mentor in areas of specialization, helping other team members learn and grow in those areas
Lead and influence positive team morale and a continuous improvement mindset.