XenonStack is the fastest-growing Data and AI Foundry for Agentic Systems, enabling people and organizations to gain real-time and intelligent business insights.
We deliver innovation through:
Agentic Systems for AI Agents: akira.ai
Vision AI Platform: xenonstack.ai
Inference AI Infrastructure for Agentic Systems: nexastack.ai
Our mission is to accelerate the world’s transition to AI + Human Intelligence by making AI agents enterprise-ready, reliable, and production-grade.
We are seeking an AgentOps Engineer to deploy, monitor, and optimize agentic AI systems in production environments.
This role is at the core of AI observability and operational reliability, ensuring that multi-agent workflows perform consistently, safely, and efficiently. You’ll work at the intersection of MLOps, DevOps, and Agentic AI, enabling enterprises to confidently adopt AI agents at scale.
Agent Deployment & Operations
Deploy and maintain LLM-powered and multi-agent systems across cloud and on-prem environments.
Integrate agents with enterprise APIs, knowledge bases, and third-party systems.
Monitoring & Observability
Implement agentic observability frameworks to track performance, latency, cost, and accuracy.
Monitor execution traces, context windows, and agent interactions for anomalies.
Optimization & Reliability
Fine-tune agent configurations for cost efficiency, scalability, and response quality.
Implement fallbacks, guardrails, and redundancy mechanisms to ensure reliability.
Evaluation & Feedback Loops
Build automated pipelines to evaluate agent performance, safety, and compliance.
Feed test results into continuous improvement loops with ML/AI engineers.
Security & Compliance
Ensure agents follow enterprise security, governance, and compliance standards.
Collaborate with Responsible AI teams to implement trust, safety, and audit mechanisms.
Cross-Functional Collaboration
Work with AI engineers, DevOps, and product managers to align operational reliability with business outcomes.
Support customer deployments with customized monitoring and tuning.
Must-Have
2–5 years of experience in DevOps, MLOps, or AI systems engineering.
Hands-on experience with LLM orchestration frameworks (LangChain, LangGraph, LlamaIndex).
Familiarity with AgentOps tools (LangSmith, PromptLayer, Weights & Biases, Arize AI).
Proficiency in Python and scripting for automation.
Experience with cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
Knowledge of monitoring & observability tools (Prometheus, Grafana, ELK, OpenTelemetry).
Understanding of RAG pipelines, vector databases, and context orchestration.
Good-to-Have
Exposure to multi-agent orchestration (MCP, A2A messaging, AgentBridge).
Experience with Responsible AI and model evaluation frameworks.
Familiarity with CI/CD pipelines for AI models and agents.
Background in BFSI, GRC, SOC, or enterprise SaaS systems.
Agentic AI Product Company
Work on next-gen AI platforms where agent reliability and observability define enterprise adoption.
A Fast-Growing Category Leader
Join one of the fastest-growing AI Foundries, powering mission-critical AI agents for global enterprises.
Career Mobility & Growth
Grow into roles such as AgentOps Lead, Reliability Engineer, or AI Systems Architect.
Global Exposure
Manage enterprise-scale AgentOps deployments across regulated industries worldwide.
Create Real Impact
Ensure that AI agents in production deliver measurable business outcomes.
Culture of Excellence
Our values — Agency, Taste, Ownership, Mastery, Impatience, and Customer Obsession — empower you to build, innovate, and own outcomes.
Responsible AI First
Contribute to trustworthy, explainable, and compliant AI agents that enterprises can rely on.
At XenonStack, we believe in shaping the future of intelligent systems. We foster a culture of cultivation built on bold, human-centric leadership principles, where deep work, simplicity, and adoption define everything we do.
Our Cultural Values
Agency – Be self-directed and proactive.
Taste – Sweat the details and build with precision.
Ownership – Take responsibility for outcomes.
Mastery – Commit to continuous learning and growth.
Impatience – Move fast and embrace progress.
Customer Obsession – Always put the customer first.
Our Product Philosophy
Obsessed with Adoption – Making AI agents reliable and enterprise-ready.
Obsessed with Simplicity – Turning complex agent operations into seamless, intuitive workflows.
Be part of our mission to accelerate the world’s transition to AI + Human Intelligence by ensuring AI agents are reliable, secure, and production-ready.