About the Role
Eightfold is building the next generation of Agentic AI products that help millions of people and enterprises make better talent decisions. We're looking for an AI Product Sustainability Engineer to help strengthen, scale, and evolve our existing AI-powered product lines used by enterprise customers globally.
This role focuses on the long-term sustainability, reliability, and evolution of AI-powered enterprise products operating at global scale. You will develop deep expertise in our products, architecture, customer deployments, and operational workflows to ensure our platforms remain scalable, resilient, and continuously improving.
You'll work closely with TM Engineering, Product, Customer Success, and AI/ML teams to solve complex production challenges, deliver customer-driven enhancements, improve operational health, and integrate new AI-native capabilities into existing products.
This is a hands-on software engineering role focused on owning systems in production and continuously improving them through strong engineering, product thinking, and operational excellence.
What You'll Do
- Develop a deep understanding of Eightfold's product architecture, customer workflows, and operational dependencies
- Investigate, debug, and resolve complex production issues across frontend, backend, APIs, integrations, workflows, and distributed systems
- Drive customer escalation resolution through deep technical investigation and cross-functional collaboration
- Deliver customer enhancement requests and product improvements with speed and high quality
- Build features and platform capabilities that improve robustness, scalability, usability, and customer experience
- Improve observability, monitoring, diagnostics, automation, and overall operational health of production systems
- Perform root-cause analysis and implement long-term fixes for recurring operational issues
- Participate in incident response, release readiness, operational reviews, and production quality initiatives
- Help operationalize newly launched AI and agentic capabilities into existing enterprise products
- Contribute to engineering best practices around scalability, reliability, maintainability, deployment quality, and operational excellence
- Build tooling and automation that improve engineering productivity and reduce operational overhead
What We're Looking For
- 2+ years of professional software engineering experience
- Strong experience building backend or full-stack applications
- Proficiency in Python and at least one other language (Java, Go, or TypeScript)
- Strong debugging and problem-solving skills in production environments
- Experience working with APIs, distributed systems, cloud infrastructure, and backend services
- Ability to troubleshoot complex issues across multiple system layers
- Strong software engineering fundamentals, including testing, debugging, performance optimization, and system design
- Ability to balance rapid execution with long-term engineering quality and platform health
- Strong ownership mindset with a focus on reliability, operational excellence, and customer impact
- Excellent communication and cross-functional collaboration skills
Nice to Haves
- Experience with sustaining engineering, production engineering, or site reliability engineering
- Familiarity with LLMs, agentic systems, orchestration frameworks, or workflow automation
- Exposure to RAG architecture design and evaluation
- Experience with observability and monitoring tools such as Grafana, Datadog, Splunk, or ELK
- Experience supporting enterprise SaaS products at scale
- Experience working in AI-first or fast-moving product organizations
Why Eightfold
- Work on enterprise-scale AI-native products used globally by leading organizations
- Solve complex real-world production and operational challenges
- Own systems deeply and influence how products evolve after launch
- Build scalable, reliable, and operationally excellent AI-powered systems
- Work in a fast-moving environment where engineering ownership and execution matter
Tech Stack
- Languages & Frameworks: Python, Flask, React, TypeScript, Webpack, NextJS
- Data & Infrastructure: MySQL, Solr, Apache Airflow, Apache Spark, Docker
- Cloud Services: AWS services including Aurora, S3, Redshift, CloudFormation, SNS, and SQS
- AI & ML: Extensive use of ML models, LLMs, retrieval systems, and agentic AI workflows