We are seeking an AI Validation Engineer with 8–12 years of experience to lead validation, benchmarking, and quality assurance activities for AI/ML software stacks running on embedded and heterogeneous computing platforms. The ideal candidate will have strong expertise in AI frameworks, the ROCm ecosystem, Linux-based environments, performance analysis, and automation. This role will drive end-to-end AI pipeline validation while collaborating closely with architecture, compiler, runtime, driver, and hardware teams to ensure production-quality AI solutions.
Key Responsibilities
AI/ML Validation & Quality Ownership
· Lead validation efforts for complex AI/ML compute stacks across multiple hardware and software platforms.
· Define validation strategies, test plans, methodologies, and quality metrics for AI software and system pipelines.
· Own the complete defect lifecycle, including issue reporting, triage, root-cause analysis, tracking, and closure.
· Ensure comprehensive coverage across functional, performance, regression, stress, scalability, and reliability testing.
End-to-End AI Pipeline Validation
· Validate complete AI workflows across training, optimization, and inference pipelines.
· Validate ROCm libraries and AI software stack functionality.
· Verify:
o Model training, conversion, and optimization workflows (e.g., PyTorch to ONNX)
o Inference runtimes such as ONNX Runtime, TensorRT, ROCm/HIP, and OpenVINO
o AI compilers and toolchains including TVM, Vitis AI, XDNA, and XLA
o Kernel execution, memory movement, inference correctness, and accuracy
· Validate AI workload stability, performance, and correctness on Ubuntu and Yocto-based Linux platforms.
AI Framework & Compute Stack Validation
· Validate functionality, integration, and performance of AI frameworks including:
o PyTorch
o TensorFlow
o ONNX Runtime
· Execute and validate workloads across heterogeneous compute environments utilizing:
o ROCm/HIP
o CUDA
o OpenCL
o AI accelerators
· Analyze the impact of framework, compiler, and runtime changes on real-world AI workloads.
Required Skills & Qualifications
Technical Expertise
· 8–12 years of experience in AI/ML validation, performance analysis, or software quality engineering.
· Strong understanding of:
o Deep Learning
o Large Language Models (LLMs)
o Recommender Systems
· Strong hands-on experience with ROCm technologies and ROCm stack validation.
· Experience validating AI/ML compute stacks including:
o HIP
o CUDA
o OpenCL
o OpenVINO
o PyTorch and TensorFlow ecosystems
· Expertise in end-to-end AI pipeline validation including:
o Model conversion
o Inference runtimes
o AI compilers
o Kernel execution
o Accuracy validation
· Advanced Python programming and scripting skills.
· Strong experience in AI benchmarking, profiling, and performance optimization.
· Deep understanding of Linux environments, particularly Ubuntu and Yocto-based distributions.
Validation & Quality Engineering
· Strong experience with software validation methodologies, SDLC processes, and defect management.
· Experience with production-quality software validation and release qualification.
· Strong focus on reproducibility, test coverage, performance validation, and release readiness.
· Ability to independently drive validation initiatives with strong ownership and accountability.
Preferred Qualifications
· Experience benchmarking and optimizing AI workloads on heterogeneous platforms including CPUs, GPUs, and AI accelerators.
· Experience tuning large-scale AI models, including:
o Memory optimization
o Mixed-precision execution
o Inference acceleration
· Familiarity with open-source AI ecosystems and performance-focused projects.
· Exposure to embedded AI platforms and edge AI deployments.
Desired Attributes
· Strong analytical and performance-focused problem-solving mindset.
· Excellent communication and stakeholder management skills.
· Proven ability to lead technically complex validation programs.
· Ability to work effectively in fast-paced, globally distributed engineering environments.
Educational Qualifications
· Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, Electronics Engineering, Robotics, or a related field.
Supplier Notes
· Strong experience in AI validation, ROCm validation, and AI performance benchmarking is mandatory.
· Candidates must have hands-on expertise with PyTorch, TensorFlow, ONNX Runtime, ROCm/HIP, and Linux (Ubuntu/Yocto) environments.
· Preference will be given to candidates with experience in LLMs, AI accelerators, heterogeneous compute platforms, and performance optimization.
· Strong Python automation and validation framework development experience is required.
Pay: ₹2,500,000.00 - ₹4,000,000.00 per year
Work Location: In person