Role: Data Engineer
Work Locations: Pune, Mumbai, Chennai, Bangalore
CTC: Up to 19 LPA
Experience: 6–10 Years
Domain: Cards & Payments
Background Verification: Pre-Onboarding
Job Summary
We are seeking a highly skilled Senior Data Engineer with strong expertise in PySpark, Apache Spark, Databricks, Airflow, and Cloud Data Platforms. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data processing pipelines, building robust ETL frameworks, and supporting data engineering initiatives in the Cards & Payments domain.
The candidate should have hands-on experience with distributed data processing systems, cloud-based data lakes, and workflow orchestration tools, and should ensure the scalability, performance, and reliability of data solutions.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using PySpark and Apache Spark.
- Build and optimize ETL/ELT workflows for processing large-scale datasets.
- Develop and manage data solutions using Databricks, including Jobs, Workflows, and Delta Lake.
- Create and maintain workflow orchestration processes using Apache Airflow.
- Design and optimize complex SQL queries, stored procedures, and data warehouse solutions.
- Develop and support cloud-based data lake architectures using Amazon S3 or equivalent cloud storage platforms.
- Deploy and manage Spark workloads on EMR Serverless or other managed Spark environments.
- Implement data quality checks, monitoring, and performance tuning across data pipelines.
- Work closely with business stakeholders, analysts, and cross-functional teams to understand data requirements and deliver solutions.
- Ensure adherence to data governance, security, and compliance standards.
- Support real-time and batch processing use cases within enterprise data platforms.
Mandatory Skills
- Strong hands-on experience with PySpark and Apache Spark internals.
- Experience working with Databricks, including Jobs, Workflows, and Delta Lake.
- Strong knowledge of Apache Airflow for workflow orchestration.
- Advanced SQL skills and experience with Data Warehouse (DWH) systems.
- Experience running Spark workloads on EMR Serverless or managed Spark platforms.
- Hands-on experience with cloud data lakes using Amazon S3 or equivalent storage solutions.
- Strong programming skills in Python.
- Experience with Big Data technologies and distributed computing frameworks.
Desired Skills
- Exposure to streaming frameworks such as Spark Structured Streaming and Apache Kafka.
- Experience with performance tuning and optimization of Spark applications.
- Knowledge of CI/CD practices for data engineering deployments.
- Familiarity with cloud-native data architectures and modern data platforms.
Preferred Experience
- 6–10 years of overall IT experience.
- Experience in the Cards & Payments domain is preferred.
- Strong understanding of data modeling, ETL frameworks, and data lake architectures.
- Excellent problem-solving, analytical, and communication skills.
Education
- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
Pay: Up to ₹1,900,000.00 per year
Application Question(s):
- How many total years of experience do you have?
- How many years of experience do you have with PySpark and Apache Spark?
- What is your current CTC and expected CTC?
- What is your notice period? Are you an immediate joiner?
- Are you comfortable working from Pune, Mumbai, Chennai, or Bangalore in a hybrid work model?
- Have you worked with EMR Serverless or any managed Spark platform?
- How strong are your SQL skills? Have you worked with Data Warehousing solutions?
- Can you explain your experience with Apache Airflow?
Work Location: In person