Candidate Skill:
Linux, Shell Scripting, Python, MySQL, Oracle, Monitoring (Prometheus, Grafana, Splunk), AWS, Azure, CI/CD, Git, Jenkins, ServiceNow, Troubleshooting, RCA, Automation, Production Support.
Job Description:
We are looking for an experienced Ops Support L3 Engineer to manage and support production systems, ensuring high availability, reliability, and performance of business-critical applications. The candidate should possess strong analytical, troubleshooting, and automation skills with a deep understanding of system operations, incident management, and production support processes.
Key Responsibilities:
Provide Level 3 (L3) operational support for critical production systems and applications.
Investigate, troubleshoot, and resolve complex incidents, service requests, and performance issues.
Collaborate with development, QA, and DevOps teams to identify root causes and implement fixes.
Monitor system performance, automate repetitive tasks, and optimize production processes.
Maintain documentation, standard operating procedures (SOPs), and knowledge base articles.
Participate in on-call rotation and ensure timely escalation of critical issues.
Implement and maintain monitoring, alerting, and reporting tools.
Contribute to continuous improvement initiatives for operational efficiency and system stability.
Technical Skills:
Operating Systems: Linux, Unix, Windows Server.
Scripting: Shell, Python, PowerShell.
Databases: MySQL, Oracle, PostgreSQL, MongoDB (for basic troubleshooting and query tuning).
Monitoring Tools: Prometheus, Grafana, Nagios, Splunk, ELK Stack.
Cloud Platforms: AWS, Azure, or GCP (basic deployment and monitoring).
Version Control & CI/CD: Git, Jenkins.
Incident & Change Management: ServiceNow, Jira, or similar tools.
Networking: Basic understanding of DNS, load balancing, VPN, and firewalls.
Automation: Ansible, Terraform (preferred).
Preferred Qualifications:
Experience in production support for large-scale enterprise systems.
Exposure to DevOps tools and practices.
Strong analytical and root cause analysis (RCA) skills.
Ability to work in a 24x7 support environment if required.
Excellent communication and documentation skills.
Education:
Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
Skills (One Line):
Linux, Shell Scripting, Python, MySQL, Oracle, Monitoring (Prometheus, Grafana, Splunk), AWS, Azure, CI/CD, Git, Jenkins, ServiceNow, Troubleshooting, RCA, Automation, Production Support.