Talentica Software, started by industry veterans and ex-IITB grads, is a product engineering company that helps tech-native enterprises and startups turn their ideas into market-leading products. We deliver innovative, high-quality products at an accelerated pace by combining the product mindset of our human experts with the power of AI.
Over the last 22 years, the company has worked with more than 200 startups, most of them based in the US, and has helped many of them reach successful exits.
What are we looking for?
A QA Automation Engineer with strong experience in LLMs and GenAI who can ensure the accuracy, stability, and performance of AI-driven applications.
If you have a strong understanding of how LLMs interact with data pipelines (covering indexing, chunking, embeddings, cosine similarity, and keyword search), along with hands-on experience in LLM observability, prompt evaluation, and QA automation, you’ll fit right in.
What will you do?
Design and execute QA strategies for LLM-based and search-driven products.
Validate data pipelines involving indexing, chunking, embeddings, cosine similarity, and keyword search.
Evaluate retrieval-augmented generation (RAG) and recommendation system quality using precision, recall, and relevance metrics.
Develop prompt test suites to measure accuracy, consistency, and bias.
Monitor LLM observability metrics such as latency, token usage, hallucination rate, and cost.
Automate end-to-end test scenarios using Playwright and integrate with CI/CD pipelines.
Collaborate with ML engineers and developers to improve model responses and user experience.
Contribute to test frameworks and datasets for LLM regression and benchmark testing.
To be successful in this role, you should have:
Qualification:
BE/BTech in Computer Science, Data Engineering, or a related field from a top institute (such as IIT, NIT, or BITS).
Experience:
3.5 to 5.5 years of experience in QA engineering, including at least 1 year of experience in GenAI or LLM-based systems.
Skills:
Strong understanding of indexing, chunking, embeddings, similarity search, and retrieval workflows.
Experience with prompt engineering, LLM evaluation, and output validation techniques.
Proficiency with Playwright, API automation, and modern QA frameworks.
Knowledge of observability tools for LLMs.
Solid scripting experience in Python.
Knowledge of different LLM providers (OpenAI, Gemini, Anthropic, Mistral, etc.).
Exposure to RAG pipelines, recommendation systems, or model performance benchmarking.
Strong analytical and debugging skills, with a detail-oriented mindset.
What will you find here?
A culture of innovation: We only take up projects that challenge us to innovate. Our customers come to us for our technology expertise.
Endless learning opportunities: Continuous learning is baked into our DNA. You’ll always have the chance to learn new things and stay on top of the latest trends.
Talented peers: Lead and work alongside engineers from IITs, NITs, BITS, and other premier institutions.
Work-life balance: We value work-life balance and offer flexible schedules, including remote work options, so you can thrive both professionally and personally.
A great culture: Our employees love working here! 82% recommend Talentica to their friends, according to Glassdoor. Join us, and you’ll see why!
Recognition & rewards: We don’t just work hard; we celebrate success. Your contributions won’t go unnoticed, and we’ll make sure you’re recognized for the amazing work you do.
Ready to make an impact?
Fill out the form below, and we’ll get in touch!