Design, develop, and maintain robust and scalable ETL pipelines.
Work with large datasets using big data technologies and frameworks.
Build and optimize data workflows using PySpark and dbt.
Develop and manage data models to support analytics and reporting.
Collaborate with data analysts, data scientists, and cross-functional teams to understand data requirements.
Ensure data quality, integrity, and performance across systems.
Deploy and manage data solutions on AWS cloud services.
Monitor and troubleshoot data pipelines and workflows.
Strong experience with data engineering concepts and practices.
Hands-on experience with big data technologies.
Proficiency in AWS services (e.g., S3, Glue, Redshift, EMR, Lambda).
Experience with dbt (data build tool) for data transformation.
Strong programming skills in PySpark.
Solid understanding of ETL processes and data pipeline architecture.
Experience working with relational and non-relational databases.
Experience with workflow orchestration tools (e.g., Airflow).
Knowledge of data warehousing concepts and data modeling.
Familiarity with CI/CD practices in data engineering.
Understanding of data governance and security best practices.