- We are looking for an experienced Data Reliability Engineer (DRE) to ensure the reliability, availability, performance, and quality of enterprise data platforms and pipelines.
- The role focuses on applying Site Reliability Engineering (SRE) principles to data systems, enabling stable and scalable analytics, reporting, and data science workloads.
- Data Platform Reliability
- Ensure high availability, performance, and reliability of data pipelines and platforms
- Monitor and maintain SLAs, SLOs, and SLIs for data systems
- Reduce data downtime, failures, and data quality incidents
- Implement proactive monitoring and alerting for data pipelines
- Observability & Monitoring
- Build and maintain monitoring, logging, and alerting for data workflows
- Track pipeline success/failure rates, latency, freshness, and volume
- Implement observability for batch and streaming data systems
- 5-8 years of experience in Data Engineering, Platform Engineering, or SRE roles
- Strong understanding of data pipeline reliability concepts
- Hands-on experience with modern data stacks, such as:
- Cloud data warehouses (Snowflake, BigQuery, Redshift, Synapse)
- Data lakes (ADLS, S3, GCS)
- Experience with ETL/orchestration tools:
- Airflow, ADF, dbt, Informatica, or similar
- Strong SQL and debugging skills
- Experience with monitoring, alerting, and observability tools
- Knowledge of Linux/Unix systems