Job description:
Job Title: DevOps Engineer
About the Role
We are launching a cutting-edge AI training initiative that leverages real-world engineering expertise to build next-generation intelligent systems. As a DevOps Engineer, you will play a key role in designing, documenting, and translating real-life DevOps and backend incidents into high-quality benchmark scenarios. Your experience with outages, scalability challenges, secure deployments, and distributed systems will directly contribute to training advanced AI models.
Key Responsibilities
- Capture and document real-world DevOps and backend engineering incidents, converting them into structured benchmark tasks
- Design and contribute to secure, scalable, Kubernetes-native architectures
- Build and optimize CI/CD pipelines and infrastructure-as-code workflows
- Implement and maintain observability, monitoring, and alerting systems
- Work across identity and access management (IAM), infrastructure security, and backend services
- Collaborate with engineering teams to build realistic, production-grade workflows
- Contribute to best practices for secure and isolated (air-gapped/offline) environments
Key Requirements (Must Have)
- 5–15 years of hands-on DevOps / Platform Engineering experience
- Strong Kubernetes expertise (production-scale deployments, troubleshooting, scaling, security)
- Terraform OSS and Infrastructure as Code (IaC) experience
- CI/CD pipeline implementation (preferably GitLab CI)
- Docker and containerized environments
- Multi-cloud experience across AWS, Azure, and/or GCP
- Proficiency in Go, Python, or Java
- Experience designing and operating distributed systems
- Observability and monitoring expertise (Prometheus, Grafana, logging, alerting)
- IAM and infrastructure security experience, including Keycloak or similar identity platforms
- Client-facing experience with a proven track record of clearing customer/client technical interviews
- Strong communication and stakeholder management skills
- Ability to document and translate real-world production incidents into structured engineering scenarios
Preferred (Good to Have)
- Experience with air-gapped/offline deployment environments
- Exposure to Data Engineering platforms and pipelines (Airflow, Kafka, Spark, etc.)
- MLOps / AI infrastructure experience
- gRPC and Kubernetes-native application development
- Experience working on large-scale platform modernization or cloud migration initiatives
Additional Requirements
Candidates must demonstrate hands-on experience across at least five functional areas/tools, including:
- Identity & Access Management (IAM)
- Observability (Prometheus, Grafana)
- CI/CD Pipelines
- Keycloak
- GitLab CI
- Terraform OSS
- Kubernetes ecosystem tools
- ML / Distributed pipelines (nice to have)
What We Offer
- Opportunity to work on next-generation AI systems
- Exposure to complex, real-world engineering challenges
- Collaborative and high-performance work environment
- Career growth in advanced DevOps and AI-driven platforms
Job Type: Full-time
Pay: ₹50,000.00 - ₹100,000.00 per month
Work Location: In person