Network Operations Center (NOC) Engineer (Linux, Grafana, Docker)

Job details

Role & Responsibilities

Monitor production infrastructure, platform health, and application uptime across 24x7 environments, ensuring SLA adherence and rapid incident response.
Detect, triage, and escalate incidents in real time, coordinating with engineering and DevOps teams to drive resolution within defined RTO and RCA timelines.
Manage and respond to alerts from monitoring tools (Grafana, PagerDuty, Datadog, or equivalent), distinguishing signal from noise and reducing MTTR.
Execute routine operational tasks including deployments, configuration changes, log analysis, and scheduled maintenance activities.
Maintain and improve runbooks, escalation playbooks, and incident documentation to build institutional knowledge and reduce repeat issues.
Collaborate with the engineering team on observability improvements, including alert tuning, dashboard creation, and proactive capacity monitoring.
Participate in post-incident reviews and contribute to root cause analysis and preventive action planning.

Ideal Candidate

Strong NOC / Infrastructure Operations Engineer Profile with 24x7 monitoring and incident-response experience
Mandatory (Experience) – Must have 2+ years of experience in a NOC (Network Operations Center) Engineer, infrastructure operations, DevOps support, or product support role
Mandatory (Tech skill 1) – Must have a hands-on understanding of networking, Linux systems and cloud infrastructure
Mandatory (Tech skill 2) – Must have proficiency with monitoring and observability tools such as Grafana, Prometheus, Datadog, Graylog or similar
Mandatory (Tech skill 3) – Must have hands-on experience in real-time incident detection, triage and escalation, coordinating resolution within defined RTO/RCA timelines and reducing MTTR
Mandatory (Tech skill 4) – Must have working knowledge of containerization and orchestration technologies (Docker and Kubernetes preferred)
Mandatory (Tech skill 5) – Must be comfortable scripting in Bash, Python or similar for automation and operational tasks
Mandatory (Communication) – Must have strong communication skills, able to write clear incident updates and escalation notes under pressure
Mandatory (Note 1) – Must be comfortable with shift-based scheduling, including nights and weekends
Preferred (Education) – Bachelor's degree in Computer Science
Preferred (DevOps) – Familiarity with CI/CD pipelines and DevOps workflows.
Preferred (Process) – Exposure to runbooks, escalation playbooks, post-incident reviews and root-cause analysis.

Pay: ₹217,595.91 - ₹1,200,000.00 per year

Work Location: In person