- We are seeking an experienced Senior Data Reliability Engineer (DRE) with 9 years of expertise in ensuring the reliability, availability, scalability, and quality of enterprise data platforms.
- This role applies Site Reliability Engineering (SRE) principles to data systems and plays a critical leadership role in building resilient, observable, and highly available data pipelines supporting analytics, BI, and data science workloads.
- Data Platform Reliability and Stability
- Own end-to-end reliability, availability, and performance of enterprise data platforms
- Define and track SLAs, SLOs, and SLIs for data pipelines and platforms
- Proactively reduce data downtime, failures, and data quality incidents
- Design systems for fault tolerance, scalability, and resilience
- 9 years of experience in Data Engineering, Data Platform, SRE, or Reliability roles
- Strong experience applying SRE principles to data systems
- Hands-on experience with enterprise data platforms such as:
  - Snowflake, BigQuery, Redshift, Synapse
  - Data lakes (ADLS, S3, GCS)
- Expertise with data orchestration and ETL tools:
  - Airflow, ADF, dbt, Informatica, or similar
- Strong SQL and deep data debugging skills
- Experience with monitoring and observability tools:
  - Prometheus, Grafana, Datadog, CloudWatch, Azure Monitor
- Strong scripting skills (Python, Shell)
- Deep knowledge of Linux/Unix systems
- Strong analytical, troubleshooting, and leadership skills
- Excellent communication and stakeholder management abilities