Job Description: Site Reliability Engineer (SRE)
Position: Site Reliability Engineer (SRE)
Experience Required: 5+ years
Location: First 2 months in Hyderabad, then Mumbai
Work Schedule: 24x7 support on a rotational roster, with 8 days off per month.
Key Responsibilities:
* Provide 24x7 operational support to ensure the availability and reliability of critical systems and applications.
* Monitor, troubleshoot, and resolve issues across infrastructure and applications using EFK/ELK stacks.
* Develop, deploy, and maintain Java Spring Boot-based applications for enhanced system reliability.
* Perform database querying and optimization to support application performance.
* Collaborate with cross-functional teams to implement automation solutions for incident management and system reliability improvements.
* Create and maintain technical documentation for processes and troubleshooting guides.
Required Skills:
* Hands-on experience with EFK/ELK stack for logging, monitoring, and visualization.
* Proficiency in Java Spring Boot for application development and support.
* Strong experience in database querying and performance optimization.
* Sound knowledge of system administration, including Linux/Unix environments.
* Familiarity with cloud platforms and containerization (e.g., Kubernetes, Docker) is a plus.
Qualifications:
* Bachelor’s degree in Computer Science, Engineering, or a related field.
* 5+ years of experience in a similar role, with a strong focus on system reliability and operational support.
Other Details:
* Candidates must be willing to relocate to Mumbai after the initial 2 months in Hyderabad.
* Candidates must be comfortable working in a 24x7 support environment with rotational shifts.
* Strong problem-solving skills and a proactive approach to incident resolution are expected.