Responsibilities
Direct Responsibilities
- Design & develop robust ingestion, transformation, and enrichment pipelines with Python, PySpark, and SQL.
- Write and optimize complex SQL queries, analytical UDFs, and window functions for data aggregation and reporting.
- Collaborate with CEFS data architects, data scientists, and business analysts to translate functional requirements into technical specifications.
- Write unit and integration tests, and review code.
- Maintain CI/CD pipelines (Git, Jenkins, Docker) for automated build, test, and deployment of jobs.
- Monitor production workloads and troubleshoot performance bottlenecks, memory issues, and job failures.
- Document data lineage, pipeline design, and operational run‑books in Confluence/SharePoint.
- Keep up to date with the latest technologies and trends, and provide input, expertise, and recommendations.
Contributing Responsibilities
- Contribute towards innovation (e.g. AI/ML); suggest new technical practices for efficiency improvement.
- Participate in Agile ceremonies (sprint planning, daily standups, retrospectives) and help groom the backlog.
- Mentor junior engineers and champion best practices in Python coding, Spark optimization, and data‑engineering patterns.
- Evaluate emerging technologies and deliver proof‑of‑concepts for CEFS.
Technical & Behavioral Competencies
- Resourceful: able to quickly understand the complexities involved and propose a way forward.
- Good experience in technical analysis of n-tier applications with multiple integrations, using object-oriented, API, and microservices approaches.
- Strong knowledge of design patterns and development principles.
- Inclination towards, and prior experience of, working across SQL, Python, and ETL.
- Strong hands-on experience with Python (NumPy, pandas, Python frameworks), RESTful APIs, and MS-SQL or Oracle.
- PySpark - DataFrames, Spark SQL, Structured Streaming, performance tuning (partitioning, caching, broadcast joins).
- Advanced SQL – complex queries, stored procedures, query optimization.
- Good knowledge of and experience with Python packages such as pandas and NumPy for data cleaning, data wrangling, data analysis, and data visualization.
- Good experience in developing and maintaining code and scripts against the functional and technical specifications of all application components, including bug fixing and production support.
- Good knowledge of the Linux/Unix environment (basic commands, shell scripting, etc.), testing phases, documentation, and new frameworks.
- Some experience of working with build tools like Maven and DevOps tools like Bitbucket and Jenkins.
- Knowledge of Agile, Scrum, DevOps.
- Development experience in Data Engineering environment.
- Ability and willingness to learn and work on diverse technologies (languages, frameworks, and tools).
- Self-motivated, with good interpersonal skills and an inclination to constantly upgrade skills in new technologies and frameworks.
- Good communication and coordination skills.
Specific Qualifications:
- Good to have: knowledge of web application frameworks, preferably Flask.
Skills Referential (Required knowledge, skills and abilities)
Technical Skills:
Behavioral Skills:
- Ability to synthesize / simplify
- Ability to collaborate / Teamwork
- Attention to detail / rigor
- Ability to deliver / Results driven
Education Level: Bachelor’s degree or equivalent
Location: Chennai