Specialist - Cloud Engineer

LTM
Bengaluru, Karnataka

Job details

Internship

Qualifications

Cloud infrastructure
System administration
Azure
Operating systems
Kubernetes
Ansible
Data migration
Load balancing
Encryption
Google Cloud Platform
Windows
Bash (Unix shell)
AWS
Docker
Terraform
Splunk
Scripting
Computer networking
Communication skills
Python
AWS Certified DevOps Engineer – Professional
AWS Certified Solutions Architect – Professional
Identity & access management

Full job description

Role description

Job Description

Job Title: Cloud Site Reliability Engineer (SRE)

Position Overview

We are seeking a Cloud Site Reliability Engineer (SRE) to drive the reliability, scalability, and performance of our cloud-based infrastructure. The ideal candidate combines software engineering expertise with advanced systems operations skills to maintain highly available systems while reducing operational toil. This role involves automation, monitoring, capacity planning, incident response, and cloud platform management across a dynamic, distributed environment.

As a Cloud SRE, you will work closely with Engineering, Architecture, DevOps, and Security teams to ensure seamless service experiences for our customers while contributing to platform design and operational efficiency.

Position Requirements

Our engineers play a critical role in the success of our clients and are expected to communicate our recommended solutions effectively in a consultative role for each client. Therefore, a successful candidate will possess a high degree of self-management, personal accountability, strong communication skills, and teamwork. The ability to interact, engineer, and communicate collaboratively at the highest technical levels with customers, vendors, partners, and all members of staff is required.

Key Responsibilities

System Reliability & Availability: Design and maintain fault-tolerant, high-availability architectures across AWS, Azure, and GCP. Implement redundancy, load balancing, and automated failover strategies.

Cloud Infrastructure Management: Deploy, manage, and optimize cloud resources using IaC tools such as Terraform and Ansible.

Monitoring & Observability: Implement monitoring, alerting, and logging frameworks using Splunk, Azure Monitor, Dynatrace, AWS CloudWatch, or similar tools to detect and resolve issues proactively.

Incident Management: Lead real-time incident response, root-cause analysis, and postmortems to continuously improve uptime and resilience.

Capacity Planning & Scaling: Predict traffic patterns, optimize resource utilization, and enforce autoscaling and performance best practices.

Automation & Tooling: Develop scripts and internal tooling to automate routine tasks and reduce manual intervention. Languages may include Python, PowerShell, or Bash.

Security & Compliance: Collaborate with security teams to implement secure infrastructure practices, including encryption, role-based access, auditing, and vulnerability management.

Collaboration & Mentorship: Work across engineering and DevOps teams, providing guidance on reliability best practices and mentoring junior SREs.

Required Skills & Qualifications

Programming & Scripting: Proficiency in Python, PowerShell, Bash, or equivalent for automation and system management.

Cloud Platforms: Hands-on experience with AWS, Azure, or GCP; strong understanding of VPCs, IAM, serverless architectures, and managed Kubernetes services.

Containers & Orchestration: Experience with Docker and Kubernetes.

Infrastructure as Code (IaC): Proficiency in Terraform and Ansible.

Monitoring & Observability: Expertise with Splunk, Azure Monitor, Dynatrace, AWS CloudWatch, or similar tools.

Expert knowledge of, and practical experience with, cloud data migration tools.

Operating Systems: Advanced knowledge of Windows and Linux/Unix environments, with experience in system administration and networking fundamentals.

Incident Response: Strong problem-solving skills under pressure, with experience managing outages and mitigating risk.

Collaboration & Communication: Ability to articulate technical insights, coordinate across teams, and contribute to a blameless culture to resolve issues and drive consistent results.

Preferred Qualifications

Industry certifications such as AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer, or Azure DevOps Engineer.

Exposure to chaos engineering or resilience testing frameworks.

Prior experience in multi-cloud deployments or hybrid cloud environments.

Familiarity with service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for service reliability.

Gather feedback from the department on areas of improvement and provide solutions utilizing Azure.

Skills

Mandatory Skills: AWS Automation Services

About LTM
LTM is an AI-centric global technology services company and the Business Creativity partner to the world’s largest and most disruptive enterprises. We bring human insights and intelligent systems together to help clients create greater value at the intersection of technology and domain expertise. Our capabilities span integrated operations, transformation, and business AI, enabling new ways of working, new productivity paradigms, and new roads to value. Together with over 87,000 employees across 40 countries and our global network of partners, LTM, a Larsen & Toubro company, owns business outcomes for our clients, helping them not just outperform the market, but to Outcreate it.

Please also note that neither LTM nor any of its authorized recruitment agencies/partners charge any candidate registration fee or any other fees from talent (candidates) towards appearing for an interview or securing employment/internship. Candidates shall be solely responsible for verifying the credentials of any agency/consultant that claims to be working with LTM for recruitment. Please note that anyone who relies on the representations made by fraudulent employment agencies does so at their own risk, and LTM disclaims any liability in case of loss or damage suffered as a consequence of the same.

Recruitment Fraud Alert: https://www.ltimindtree.com/recruitment-fraud-alert/