Site Reliability Engineer (SRE)

INDUANSH PRIVATE LIMITED
Bengaluru, Karnataka

Job details

Full-time | Contractual / Temporary
₹80,000 - ₹1,00,000 a month
1 day ago

Qualifications

CI/CD
Cloud infrastructure
Azure
Go
Kubernetes
DevOps
AWS
Docker
Distributed systems
Terraform
Continuous integration
Scripting
Linux
Communication skills
Python

Full job description

Location: Bangalore
Experience: 8+ Years
Work Mode: Hybrid
Work Schedule: 7:30 AM – 5:00 PM
Joining Preference: Immediate Joiners Preferred

About the Role:

We are seeking an experienced Site Reliability Engineer (SRE) to drive reliability, scalability, and operational excellence across critical production platforms. The ideal candidate will have strong expertise in cloud infrastructure, Kubernetes, observability, automation, and incident management, with a focus on building highly available and resilient systems.

Key Responsibilities:

Service Reliability

Define and manage SLIs, SLOs, and Error Budgets.
Monitor platform health and proactively address reliability risks.
Improve service availability, scalability, and performance.

Incident Management

Participate in and lead on-call support rotations.
Manage production incidents, troubleshooting, and service recovery.
Conduct root cause analysis and postmortem reviews.
Drive improvements in MTTD and MTTR metrics.

Automation & Infrastructure

Automate deployments, scaling, remediation, and operational tasks.
Implement Infrastructure as Code (IaC) practices.
Reduce manual operational effort through scripting and automation.

Observability & Monitoring

Build and maintain monitoring, logging, and distributed tracing solutions.
Support rapid issue diagnosis and performance analysis.
Enable proactive capacity planning and system optimization.

Performance & Resilience

Conduct load testing and capacity planning.
Implement failover mechanisms, canary deployments, and resilience strategies.
Support reliability-focused platform improvements and testing initiatives.

Required Skills:

Strong programming or scripting experience in Python, Go, or similar languages.
Advanced Linux administration and troubleshooting skills.
Strong understanding of networking concepts and distributed systems.
Hands-on experience with Kubernetes, Docker, Terraform, and CI/CD pipelines.
Experience with cloud platforms: AWS, Azure, or GCP.
Expertise in observability tools such as Prometheus, Grafana, the ELK Stack, and OpenTelemetry.

Preferred Qualifications:

Experience implementing SLI/SLO/Error Budget frameworks.
Cloud certifications (AWS, Azure, or GCP).
Kubernetes or DevOps certifications.
Experience with Chaos Engineering and Resilience Testing.
Background in Platform Engineering, Production Operations, or Systems Engineering.

What We're Looking For:

Strong problem-solving and troubleshooting skills.
Experience leading production incident response.
Excellent communication and stakeholder management abilities.
Ability to work effectively in high-pressure production environments.
Passion for automation, reliability, and operational excellence.

Pay: ₹80,000 - ₹1,00,000 per month

Work Location: Hybrid remote in Bengaluru, Karnataka (Bengaluru Urban District)
