Please find the Job Description (JD) attached for your reference.
Senior DevOps Engineer
Location: Remote (must be available to work in the CST time zone)
Experience Required: 8+ Years
Department: Engineering
Employment Type: Full-Time
Role Overview
Ekshvaku Tech Innovations is looking for a highly experienced Senior DevOps Engineer who excels at building scalable, automated, and resilient infrastructure for modern distributed systems and AI-driven platforms.
You will lead DevOps strategy, modernize our infrastructure, and ensure high performance across multi-environment deployments. This role requires strong hands-on expertise, architectural thinking, excellent documentation discipline, and the ability to collaborate with cross-functional engineering teams.
Key Responsibilities
CI/CD & Automation
- Design, implement, and maintain scalable CI/CD pipelines for Prerel, QA, and Production.
- Continuously improve build, release, and deployment workflows.
- Reduce manual ops through automation and scripting.
Infrastructure Engineering
- Automate provisioning and configuration using Terraform, Ansible, and similar IaC tools.
- Architect, deploy, and maintain cloud infrastructure (AWS/GCP).
- Lead modernization initiatives: containerization, orchestration, microservices optimization.
- Implement high availability setups, DR strategies, and cost-optimized infrastructure.
AI/ML Operations
- Support AI/ML model training and deployment pipelines.
- Manage GPU orchestration and scalable model-serving environments.
- Work closely with AI and Data Engineering teams.
Monitoring, Security & Observability
- Implement centralized monitoring, logging, and alerting systems.
- Improve system reliability and incident response times.
- Ensure cloud security best practices (IAM, network security, zero-trust).
- Contribute to compliance efforts for HIPAA, SOC 2, and ISO 27001.
Documentation & Collaboration
- Maintain clear documentation, runbooks, and architecture diagrams.
- Collaborate with Engineering, Product, QA, and AI teams.
- Evaluate new DevOps tools and AI-powered automation capabilities.
Required Skills & Experience
- 8+ years in DevOps or Cloud Infrastructure roles.
- Strong expertise in GitHub Actions and CI/CD pipeline architecture.
- Deep understanding of AWS or GCP (VPC, IAM, security, autoscaling).
- Production-level experience with Docker & Kubernetes (EKS/GKE).
- Proficiency in scripting: Python, Bash, or Go.
- Strong experience with Terraform, Ansible, CloudFormation.
- Solid understanding of monitoring tools (Prometheus, Grafana, CloudWatch, ELK).
- Experience supporting AI/ML pipelines (MLflow, Kubeflow, SageMaker, Vertex AI).
- Knowledge of security best practices & compliance frameworks.
- Excellent communication and documentation skills.
Preferred Qualifications
- Experience with GPU workloads and ML orchestration.
- Knowledge of event-driven or serverless architectures (Lambda, Cloud Run).
- Exposure to GitOps tools (Argo CD, Flux).
- Background in AIOps (Dynatrace Davis, New Relic AI).
- AWS Certified DevOps Engineer - Professional or equivalent certification.
Success Indicators
- 90%+ reduction in manual deployment tasks within 6 months.
- Fully standardized infrastructure documentation across all environments.
- Reduced downtime and faster incident resolution through observability.
- Improved developer velocity and increased deployment frequency.
- Adoption of AI-assisted automation and predictive monitoring.