Sholinganallur, Tamil Nadu
Job Summary
Key Skills & Requirements
Strong hands-on experience in Grafana administration, including dashboard development, alert configuration, notification policies, RBAC, user management, and data source integration.
Expertise in Grafana plugin installation, configuration, troubleshooting, upgrades, and performance optimization across enterprise-scale monitoring environments.
Experience designing and maintaining observability solutions using Grafana, Grafana Alloy, and OpenTelemetry frameworks.
Hands-on experience with Grafana Alloy configuration, telemetry collection pipelines, log/metric forwarding, relabeling, filtering, and performance tuning.
Strong knowledge of BindPlane administration, including collector deployment, gateway configuration, telemetry routing, load balancing, high availability, and troubleshooting.
Experience configuring and optimizing telemetry ingestion pipelines from on-premises and cloud-based infrastructure into centralized observability platforms.
Good understanding of Google Cloud Platform (GCP) services, with hands-on experience in GKE cluster administration, workload deployment, pod management, scaling, and troubleshooting.
Experience using Google Cloud Monitoring tools such as Metrics Explorer, Logs Explorer, dashboards, alerting policies, and observability best practices.
Strong Kubernetes administration skills, including deployments, services, ingress controllers, daemonsets, statefulsets, namespaces, resource management, and cluster troubleshooting.
Experience managing and monitoring Azure Kubernetes Service (AKS) environments and implementing observability solutions for containerized workloads.
Knowledge of Azure cloud services, networking concepts, identity management, and infrastructure monitoring.
Hands-on experience with Ansible for infrastructure automation, configuration management, deployment automation, and operational tasks.
Strong scripting and automation skills using Python and Shell Scripting for monitoring, API integrations, and operational efficiency improvements.
Experience integrating monitoring platforms with ServiceNow, REST APIs, webhook-based alerting, SQL, and third-party enterprise applications.
Strong understanding of Linux system administration, troubleshooting, process management, networking fundamentals, and performance analysis.
Ability to perform root cause analysis, capacity planning, performance optimization, and reliability improvements for large-scale monitoring platforms.
Experience supporting enterprise observability environments with thousands of monitored servers, applications, and cloud-native workloads.
Excellent analytical, troubleshooting, documentation, and stakeholder communication skills.
Cloud & Container Technologies
Google Cloud Platform (GCP)/Google Kubernetes Engine (GKE)
Kubernetes Administration
Azure Cloud/Azure Kubernetes Service (AKS)
Monitoring & Observability
Grafana
Grafana Alloy
OpenTelemetry
BindPlane
Cloud Monitoring
Log Management Solutions
Prometheus
Automation & Development
Python
Shell Scripting (Bash)
Ansible
REST APIs
Git/GitHub
Key Responsibilities