Strong hands-on experience in Python and PySpark for data processing and data engineering.
Experience in developing data solutions using PySpark, Spark SQL, and related frameworks/libraries.
Hands-on experience in building and maintaining ETL/ELT pipelines, data ingestion pipelines, and data transformation processes.
Experience in ingesting data from multiple sources, such as databases, files, APIs, cloud storage (for example, S3), and data lake platforms.
Experience working with structured and unstructured data.
Good understanding of data warehouse and data lake concepts and of data processing patterns.
Ability to develop scalable, reusable, and maintainable data processing components.
Experience in end-to-end data pipeline development, including source ingestion, transformation, validation, and target load.
Good knowledge of SQL for data analysis, transformation, validation, and basic performance tuning.
Ability to write clean, efficient, reusable, and scalable Python code.
Good understanding of data quality checks, testing, monitoring, documentation, and production support practices.
Awareness of security and data protection principles in data engineering solutions.