Role Overview
We are looking for a skilled GCP Data Engineer to design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP). The ideal candidate will have strong experience in data processing, cloud technologies, and modern data architecture.
Key Responsibilities
- Design, develop, and deploy batch and real-time data pipelines using GCP services
- Build and maintain data workflows using tools like Dataflow, Dataproc, and Cloud Composer (Airflow)
- Develop ETL/ELT processes for data ingestion, transformation, and loading
- Work with BigQuery for data warehousing and analytics
- Ensure data quality, integrity, and security across pipelines
- Optimize performance and cost of GCP resources
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements
- Implement monitoring, logging, and alerting using Cloud Logging and Cloud Monitoring
- Automate infrastructure provisioning using Infrastructure as Code (Terraform or Deployment Manager)
Required Skills
- Strong experience with Google Cloud Platform (GCP) services:
  - BigQuery
  - Cloud Storage
  - Dataflow
  - Pub/Sub
  - Dataproc
- Proficiency in programming languages:
  - Python (preferred)
  - SQL (advanced)
- Experience with ETL/ELT frameworks and data pipeline design
- Knowledge of data modeling concepts (star schema, normalization, etc.)
- Experience with workflow orchestration tools such as Apache Airflow or Cloud Composer
- Familiarity with CI/CD pipelines and version control (Git)