Manager

LatentView Analytics - Chennai, Tamil Nadu

Job details

Permanent
10 days ago

Qualifications

CI/CD
Cloud infrastructure
Cloud architecture
Kubernetes
DevOps
Encryption
Git
Bash (Unix shell)
AWS
Terraform
Continuous integration
Scripting
GitHub
Scalability
S3
RDS database
Jenkins
Python
Shell Scripting
Identity & access management

Full job description

Job Summary:

We are seeking a highly experienced Principal Cloud Infrastructure Engineer to lead the architecture, automation, and scalability of enterprise-grade cloud platforms. This role requires 10+ years of hands-on expertise in designing highly resilient AWS environments, building Infrastructure as Code (IaC) frameworks using Terraform, and managing large-scale Kubernetes ecosystems.

The ideal candidate will play a strategic role in strengthening platform reliability, cloud security, deployment automation, and operational excellence across the organization. You will work closely with engineering, platform, security, and architecture teams to establish scalable cloud-native solutions and drive infrastructure modernization initiatives.

Key Responsibilities

Cloud Infrastructure & Architecture

Architect, deploy, and manage highly available, scalable, and secure cloud infrastructure on AWS. Design enterprise-grade cloud environments leveraging services such as EKS, EC2, VPC, IAM, S3, RDS, Route53, CloudWatch, and Load Balancers. Drive cloud-native architecture standards and best practices for scalability, resiliency, and disaster recovery.

Infrastructure as Code (IaC)

Lead the implementation and governance of Infrastructure as Code using Terraform. Develop reusable Terraform modules, manage remote state strategies, and implement environment standardization using Terragrunt. Ensure infrastructure provisioning is automated, version-controlled, and compliant with enterprise standards.

Kubernetes & Container Platform Engineering

Design and manage production-grade Kubernetes (EKS/K8s) clusters for large-scale microservices platforms. Implement best practices for cluster scaling, workload orchestration, networking, ingress management, and security policies. Manage container deployment strategies using Helm, service mesh technologies (Istio), and GitOps methodologies.

CI/CD & Platform Automation

Build and optimize automated CI/CD pipelines enabling zero-downtime deployments and faster release cycles. Implement GitOps-based deployment strategies using tools such as Argo CD, Jenkins, and GitHub Actions. Automate operational processes, infrastructure provisioning, and platform maintenance tasks using Python and Bash scripting.

Observability, Reliability & Performance

Define and implement enterprise monitoring, alerting, logging, and observability frameworks. Ensure platform reliability through proactive monitoring using Prometheus, Grafana, ELK Stack, Datadog, or similar tools. Establish and maintain SLA/SLO-driven operational standards and incident response practices.

Security & Governance

Enforce security-first cloud infrastructure practices including IAM governance, least-privilege access, encryption, and network isolation. Conduct infrastructure security assessments, compliance reviews, and vulnerability remediation activities. Collaborate with security teams to implement enterprise compliance and governance standards.

Technical Leadership

Provide technical leadership and mentorship to DevOps, Cloud, and Infrastructure engineering teams. Lead architectural reviews, infrastructure modernization initiatives, and platform strategy discussions. Drive adoption of best practices across automation, reliability engineering, and cloud operations.

Technical Requirements
Primary Skills

AWS Cloud Architecture
Kubernetes (EKS/K8s)
Terraform & Infrastructure as Code (IaC)

Secondary Skills

Python / Bash Scripting
CI/CD Tools: GitHub Actions, Jenkins, Argo CD
Helm & Service Mesh (Istio)
Monitoring & Observability: Prometheus, Grafana, ELK, Datadog

Required Experience

10+ years of experience in Cloud Infrastructure, DevOps Engineering, Platform Engineering, or Site Reliability Engineering (SRE)
Strong experience managing enterprise-scale cloud-native environments and Kubernetes platforms
Proven expertise in automation, infrastructure scalability, and cloud security best practices

Job Snapshot

Updated Date

25-05-2026

Job ID

J_5279

Location

Chennai, Tamil Nadu, India

Experience

10 - 12 Years

Employee Type

Permanent
