Ciklum is looking for a Senior Automation QA Engineer to join our team full-time in India.
We are a custom product engineering company that supports both multinational organizations and scaling startups to solve their most complex business challenges. With a global team of over 4,000 highly skilled developers, consultants, analysts and product owners, we engineer technology that redefines industries and shapes the way people live.
About the role:
As a Senior Automation QA Engineer, you will become part of a cross-functional development team engineering the experiences of tomorrow.
You will own the test automation that spans the whole platform: the Frontend and Backend that agent developers depend on, the long-running Temporal workflows that must survive worker crashes and provider failures, the multi-tenant isolation boundaries that protect one customer's data from another, and the performance and concurrency targets defined in the platform's NFRs. Much of your craft is adversarial and failure-first: injecting worker kills, retry storms, and malformed documents to verify the system recovers exactly as specified, and negative-testing disallowed tool calls and egress paths to confirm the sandbox holds. Where agent behavior must be judged rather than asserted, you will partner closely with the evaluation function, contributing the test infrastructure, synthetic-data pipelines, and CI integration that make quality measurable and repeatable.
This is a high-visibility role where we establish the production-ready foundation the rest of the program is built on. Your automated suites become release gates, and the evidence they produce feeds directly into client milestone sign-offs, so your work must be reproducible, trustworthy, and defensible to a sophisticated client. As the platform scales toward an "agent-a-month" delivery cadence, you will build the reusable test scaffolds and harnesses that let quality keep pace with delivery, ensuring that month twelve ships as safely as month one. You will work shoulder-to-shoulder with AI engineers, platform engineers, and the evaluation team, embedding quality from the design stage rather than inspecting for it at the end.
Responsibilities:
- Design, build, and maintain automated test suites for the platform (Frontend and Backend), including contract tests that protect the agent-developer experience as the SDK evolves
- Build automated tests for long-running Temporal workflows, including failure injection (worker kill, provider outage, retry storms), to verify durable-execution guarantees such as zero lost runs and correct compensation/saga behavior
- Automate multi-tenant isolation verification (cross-tenant data access, configuration bleed, and cost-attribution correctness), executed per release as a platform acceptance criterion
- Verify agent execution sandboxing and egress controls through negative testing of disallowed tool calls and network destinations
- Design test strategies for LLM-driven behavior: statistical assertions over repeated runs (pass^k consistency), semantic-similarity scoring, confidence-threshold validation, and flakiness quarantine that distinguishes model variance from genuine regression
- Build and continuously extend adversarial test corpora (document-borne prompt injection in lease PDFs, tool-call hijack, and data-exfiltration probes) aligned to OWASP LLM Top 10 and recognized red-team taxonomies
- Verify Human-in-the-Loop (HITL) escalation behavior: confidence thresholds fire correctly and no gated/irreversible action completes without authorization
- Partner with the evaluation function on golden-dataset-driven checks, contributing test infrastructure, synthetic-data (synthetic lease) pipelines, and CI integration that make agent quality measurable and repeatable
- Automate verification of platform NFRs: invocation overhead (p95), concurrency targets, queue backpressure behavior, and LLM provider failover time; surface trends per release
- Use tracing data (Langfuse / OpenTelemetry spans) as a first-class test asset: assert on trace completeness, token/cost-accounting accuracy, and alerting behavior for anomalous runs
- Wire test suites into CI/CD as hard release gates with automated, per-metric regression flagging on pull requests
- Produce auditable, reproducible test-evidence packs that feed directly into client milestone sign-off
- Embed with AI and platform engineers from the design stage, making acceptance criteria, testability, and observability inputs to architecture, not afterthoughts
- Maintain test environments, mocked LLM/provider layers, and synthetic data generators that keep test runs fast, deterministic where possible, and affordable
- Mentor mid-level QA engineers, codify AI testing patterns into reusable internal frameworks, and contribute to Ciklum's quality engineering practice and AI Academy
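For candidates less familiar with the term, the "pass^k consistency" assertion mentioned above can be sketched roughly as follows. This is an illustration only, not the platform's actual harness: it assumes the common reading of pass^k as the probability that k independent runs of the same case all succeed, estimated from n recorded runs as C(passes, k) / C(n, k). The function names and the threshold value are hypothetical.

```python
from math import comb


def pass_hat_k(n_runs: int, n_passed: int, k: int) -> float:
    """Estimate the chance that k runs drawn from the n recorded runs all passed.

    Assumed estimator (not taken from the posting): C(n_passed, k) / C(n_runs, k).
    """
    if not 0 < k <= n_runs:
        raise ValueError("k must be between 1 and n_runs")
    if not 0 <= n_passed <= n_runs:
        raise ValueError("n_passed must be between 0 and n_runs")
    return comb(n_passed, k) / comb(n_runs, k)


def is_consistent(outcomes: list[bool], k: int, threshold: float) -> bool:
    """Gate a test case on pass^k rather than on a single run."""
    return pass_hat_k(len(outcomes), sum(outcomes), k) >= threshold


# Hypothetical outcomes of repeating one agent scenario 10 times: 9 passes, 1 failure.
outcomes = [True] * 9 + [False]
score = pass_hat_k(len(outcomes), sum(outcomes), k=3)  # C(9,3) / C(10,3) = 84 / 120
gate_ok = is_consistent(outcomes, k=3, threshold=0.9)  # hypothetical threshold
```

A single failure in ten runs leaves a 90% raw pass rate but a noticeably lower pass^3, which is why a gate of this kind is stricter than asserting on an average.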
Requirements:
- 8+ years of test automation with strong Python: pytest at depth, API testing, and building test frameworks other engineers adopt, not just test cases
- Hands-on experience testing LLM-powered or other non-deterministic systems: statistical assertions, semantic scoring, and managing model variance (this is the differentiator; cite real examples)
- Experience testing asynchronous, event-driven systems: failure injection, idempotency verification, eventual-consistency assertions; Temporal or a similar workflow engine is a strong plus
- Strong CI/CD integration skills (GitHub Actions or equivalent) and the discipline to design release gates
- Solid SQL and data-validation skills; comfort building synthetic test-data pipelines and measuring dataset quality
- A security-testing and adversarial mindset; familiarity with prompt-injection techniques or a strong AppSec testing background with willingness to specialize fast
- Demonstrated judgment on what "tested enough" means under milestone pressure, and the ability to defend quality evidence to non-engineering stakeholders
- QA Automation Core: Extensive experience in building automated test frameworks (Python-based) and integrating them into high-security CI/CD pipelines
- AI/LLM Literacy: Deep understanding of Large Language Model behaviors, prompt sensitivity, and the common failure modes of RAG (Retrieval-Augmented Generation) systems
- Analytical Mindset: Strong capability in statistical analysis to interpret evaluation scores and provide actionable feedback to AI Engineers for prompt tuning
Desirable:
- Performance/load tooling (k6, Locust) for concurrency and latency NFR verification
- Prior testing of multi-tenant SaaS isolation
- Exposure to eval/observability stacks (Langfuse, LangSmith, Promptfoo, DeepEval, Arize Phoenix)
- Compliance evidence experience (SOC 2, GDPR) or accessibility testing (Section 508 / WCAG)
- Familiarity with Commercial Real Estate workflows (lease accounting, CAM reconciliation) or document-extraction systems (OCR, layout parsing)
- Experience using LLMs to generate test cases, adversarial payloads, or synthetic documents
- Performance Testing: Experience testing high-concurrency background processing within agentic systems
What's in it for you?
- Strong community: Work alongside top professionals in a friendly, open-door environment
- Growth focus: Take on large-scale projects with a global impact and expand your expertise
- Tailored learning: Boost your skills with internal events (meetups, conferences, workshops), Udemy access, language courses, and company-paid certifications
- Endless opportunities: Explore diverse domains through internal mobility, finding the best fit to gain hands-on experience with cutting-edge technologies
- Care: We’ve got you covered with company-paid medical insurance, mental health support, and financial & legal consultations
About us:
At Ciklum, we are always exploring innovations, empowering each other to achieve more, and engineering solutions that matter. With us, you’ll work with cutting-edge technologies, contribute to impactful projects, and be part of a One Team culture that values collaboration and progress.
India is a strategic innovation hub for Ciklum, with growing teams in Chennai and Pune leading advancements in EdgeTech, AR/VR, IoT, and beyond. Join us to collaborate on game-changing solutions and take your career to the next level.
Explore, empower, engineer with Ciklum!
Interested already? We would love to get to know you! Submit your application. We can’t wait to see you at Ciklum.