We need a Senior Data Engineer with 10+ years of experience who is proficient in Spark, Scala/Java, and Hive and has extensive hands-on development experience in the big data ecosystem.
Key Responsibilities:
- Design, implement, and optimize high-performance data pipelines using Spark, Scala/Java, and Hive on platforms such as Cloudera Data Platform (CDP) or other Hadoop ecosystems.
- Take complete ownership of complex data engineering projects within the big data ecosystem, covering the entire lifecycle from initial design and development to deployment and ongoing maintenance.
- Develop robust and efficient Hive queries for extensive data analysis and reporting.
- Champion and enforce best practices and coding standards for new and existing Spark, Scala/Java, and Hive data flows so that they are robust, scalable, secure, and maintainable.
- Diagnose, troubleshoot, and resolve complex issues in Spark, Scala/Java, and Hive applications and in YARN resource management, and implement performance optimizations.
- Proactively collaborate with stakeholders, working closely with them to develop solutions with full commitment and accountability.
Technical Skills & Experience:
- Proven hands-on development expertise with Apache Spark.
- Strong programming proficiency in Scala and/or Java.
- In-depth knowledge and practical experience with Hive, including query optimization and data analysis.
- Experience with data platforms such as Cloudera Data Platform (CDP) is highly desirable.
Education:
- Bachelor's or Master's degree (or other university degree), or equivalent experience.