Senior AWS Glue Developer (PySpark) - Alight

Relevance Labs - India

Job details

Full-time

Qualifications

CI/CD
Big data
Git
SQL
AWS
Machine learning
Distributed systems
Continuous integration
GitHub
ETL
Agile
S3
Kafka
Redshift
DynamoDB
AI
Informatica
Communication skills

Full job description

Role Overview

We are seeking a highly skilled Senior AWS Glue Developer with deep expertise in PySpark, distributed data processing, and cloud-native ETL pipeline development. The ideal candidate will design, build, optimize, and maintain large-scale data ingestion and transformation pipelines on AWS, contributing to our enterprise data platform modernization and analytics initiatives.

Key Responsibilities

1. ETL Development & Data Engineering

Design, develop, and optimize AWS Glue ETL jobs using PySpark, Glue Studio, and Glue Workflows.
Build scalable batch and near-real-time ingestion pipelines using Glue, Lambda, and Step Functions.
Transform data for analytical, reporting, machine learning, and Lakehouse use cases.

2. Data Lake / Lakehouse Architecture

Develop pipelines targeting Amazon S3 data lakes and Iceberg tables.
Implement robust data quality, metadata, and governance layers (Glue Catalog, Lake Formation).
Optimize storage using Parquet, compression, and columnar formats.

3. Performance Optimization

Tune PySpark jobs for high performance (memory management, partition pruning, shuffle optimization).
Optimize Glue job parameters (worker type, DPUs, job bookmarks, concurrency).

4. CI/CD & DevOps Integration

Build automated deployments using GitHub Actions.

5. Cross-functional Collaboration

Partner with architects, leads, developers, and business teams to refine requirements.
Translate functional specifications into technical ETL and orchestration solutions.

Required Skills & Experience

Core Technical Skills

8–12+ years of experience as a Senior Data Engineer.
Strong expertise in PySpark and distributed computing.
Hands-on experience with:
- AWS Glue (Jobs, Workflows, Triggers, Crawlers)
- AWS Lambda
- AWS Step Functions
- Amazon S3
- Amazon Athena
- Glue Catalog / Lake Formation
- Redshift
- DynamoDB
Advanced SQL and optimization for big data workloads.

Big Data & Cloud

Experience with Kafka and Flink (nice-to-have).
Strong knowledge of ETL patterns, CDC frameworks, and event-driven pipelines.
Understanding of Medallion architecture and Lakehouse principles.

Soft Skills

Strong communication and documentation abilities.
Ability to lead development tasks and mentor junior engineers.
Ability to work in an agile, fast-paced environment.

Preferred Qualifications

Experience with Iceberg, Redshift, SQL Server, and DynamoDB.
Experience with Informatica Cloud (nice-to-have).
Interest in exploring AI tools for data integration.
