Job Description: Experience working with cloud platforms such as AWS, GCP, or Azure.
Responsibilities:
- Design, build, and maintain scalable data pipelines and ETL/ELT workflows to ingest, transform, and process large volumes of structured and semi-structured data.
- Develop and optimize data models, tables, and transformations to support analytics, reporting, and downstream data consumption.
- Work with large datasets using SQL, PySpark, and modern data platforms such as Snowflake and Databricks to ensure efficient data processing.
- Build and manage data workflows using orchestration tools such as Apache Airflow, ensuring reliable and timely data delivery.
- Develop automation scripts in shell and Python to support data pipeline execution, monitoring, and operational efficiency.
- Monitor, troubleshoot, and optimize data pipelines to improve performance, scalability, and reliability across the data ecosystem.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and enable data-driven insights.
- Ensure adherence to data engineering best practices, including data quality checks, documentation, and pipeline governance.
Qualifications: Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field.