We are looking for skilled Hadoop Ecosystem Support Engineers to provide operational support and ensure the stability, performance, and availability of big data platforms. The ideal candidate will have hands-on experience managing and troubleshooting the Hadoop ecosystem, including HDFS, Hive, Spark, YARN, and related components.
This role focuses on production support, maintenance, and issue resolution.
Key Responsibilities
- Provide L3 production support for Hadoop ecosystem components (HDFS, Hive, Spark, YARN, Oozie, etc.).
- Monitor cluster health, performance, and resource utilization using tools such as Ambari, Cloudera Manager, or Grafana.
- Troubleshoot and resolve HDFS, Hive, and Spark job failures and performance issues.
- Perform root cause analysis (RCA) for recurring incidents and work with engineering teams to implement fixes.
- Manage user access, quotas, and security policies in Hadoop clusters.
- Conduct routine maintenance tasks such as service restarts, cluster upgrades, and patch management.
- Collaborate with data engineers and platform teams to ensure optimal cluster performance and reliability.
- Document support procedures, incident reports, and configuration changes.
Required Skills & Experience
- 3–8 years of experience supporting or administering Hadoop ecosystems in production.
- Strong hands-on knowledge of:
  - HDFS (file system management, data balancing, recovery)
  - Hive (query execution, metastore management, troubleshooting)
  - Spark (job monitoring, debugging, performance tuning)
  - YARN, Oozie, and ZooKeeper
- Experience with cluster management tools like Ambari, Cloudera Manager, or similar.
- Proficiency in Linux/Unix system administration and shell scripting.
- Strong analytical and problem-solving skills with a focus on incident management and RCA.
- Familiarity with Kerberos, Ranger, or other security frameworks within Hadoop.
Education
Regular MCA or Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent experience.
Nice to Have
- Exposure to cloud-based big data platforms (AWS EMR, Azure HDInsight, GCP Dataproc).
- Basic understanding of Python or Scala for log analysis and automation.
- Experience with Kafka, Airflow, or other data streaming and orchestration tools.
- Knowledge of ticketing systems (ServiceNow, JIRA) and ITIL processes.