Ciandt

[Job - 28525] AI Quality Engineer Senior, QA

Job Description

Role Overview 

The AI Agent Evaluation Engineer is responsible for the quality, accuracy, explainability, and reliability of AI agent systems across the Proof-of-Concept, Pilot, and Production stages. The role focuses on establishing enterprise-grade evaluation frameworks for agentic AI, LLMs, and AI-driven workflows so that outputs are trustworthy, measurable, and continuously improving.

Key Responsibilities 

• Design and implement evaluation frameworks for AI agents, LLMs, and RAG-based systems. 

• Measure accuracy, relevance, consistency, hallucination rates, and task success across AI outputs.

• Establish baseline and comparative evaluations across models, prompts, and agent strategies. 

• Validate agent decision logic, reasoning paths, and tool usage for explainability and traceability. 

• Support human-in-the-loop (HITL) evaluation for high-impact or high-risk use cases. 

• Partner with engineering teams to improve prompts, retrieval strategies, and agent orchestration. 

• Validate AI observability, monitoring, drift detection, and regression controls. 

• Support vendor PoCs, pilots, and RFP evaluations with fact-based assessments. 


Required Qualifications

• Experience evaluating Generative AI, LLMs, and agentic AI systems.

• Strong understanding of AI/ML evaluation metrics and error analysis. 

• Hands-on experience with Python and AI evaluation workflows. 

• Familiarity with RAG architectures, prompt evaluation, and agent orchestration. 

• Experience with cloud AI platforms (Azure or GCP preferred).

Preferred Qualifications

• Experience in Education, Healthcare, or other regulated domains.

• Exposure to synthetic data generation and test scenario design. 

• Familiarity with AI governance, risk, and compliance practices. 


Success Measures

• Measurable improvement in AI accuracy, reliability, and trustworthiness. 

• Clear visibility into why AI agents made specific decisions. 

• Standardized evaluation frameworks adopted across AI initiatives. 

• Increased leadership confidence in AI-driven outcomes.