Position Summary:
We are seeking a talented Data Engineer to join our team. You will play a key role in designing, building, and maintaining data pipelines using a variety of technologies, with a focus on the Microsoft Azure cloud platform.
Responsibilities:
- Design, develop, and implement data pipelines using Azure Data Factory (ADF) or other orchestration tools.
- Write efficient SQL queries to extract, transform, and load (ETL) data from various sources into Azure Synapse Analytics.
- Utilize PySpark and Python for complex data processing tasks on large datasets within Azure Databricks.
- Collaborate with data analysts to understand data requirements and ensure data quality.
- Implement data governance practices to ensure data security and compliance.
- Monitor and maintain data pipelines for optimal performance and troubleshoot any issues.
- Develop and maintain unit tests for data pipeline code.
- Work collaboratively with other engineers and data professionals in an Agile development environment.
Must-Have Tech Skills:
- Databricks
- Fabric
- SQL
- PySpark
- Python
Must-Have Competencies:
- Pipelines: data ingestion, data quality, and transformation (Databricks, ADF)
- Consumption of REST APIs
- Optimization of PySpark jobs, SQL procedures, and views
Nice to Have (design and admin skills):
- Warehouse and lakehouse creation
- Data modeling
- Semantic model design
- Refresh management
- Logging and monitoring
- Row-level security (RLS) and permissions