Key Responsibilities
Pipeline Development & Automation
- Design, build, and maintain CI/CD pipelines that automate the deployment of data quality (DQ) rules and data services across environments.
- Optimize data pipelines and jobs for efficiency, scalability, and enterprise-grade reliability within Azure and Databricks.
- Implement best practices in version control, testing, and automated deployments.
Data Quality & Rule Development
- Develop, optimize, and maintain DQ rules in PySpark/Python to support self-serve capabilities.
- Design and implement profiling frameworks for rule generation and automated remediation.
- Ensure DQ frameworks align with enterprise standards, governance, and audit requirements.
UI & Self-Serve Integration
- Collaborate with front-end teams (Node.js/Angular) to enable rule configuration, validation, and monitoring via a low-code UI.
- Develop APIs and services to expose DQ results and outputs for dashboards and self-service tools.
Collaboration & Governance
- Partner with Data Stewards, GPOs, and business owners to translate requirements into technical solutions.
- Contribute to BRDs, SRDs, and design reviews for DQ rules and pipelines.
- Provide input on governance, compliance, and change management processes.
- Support periodic reviews and continuous improvement of pipelines, rules, and dashboards.
Required Skills & Qualifications
- Strong programming skills in PySpark, Python, and SQL.
- Hands-on experience with Databricks (clusters, notebooks, Delta Lake).
- Experience designing and implementing CI/CD pipelines (Azure DevOps / GitHub Actions / Jenkins).
- Familiarity with Node.js and Angular for UI integration.
- Solid understanding of data governance, DQ frameworks, and master data management (MDM) concepts.
- Strong analytical, problem-solving, and stakeholder engagement skills.
- Cloud expertise, preferably in Azure (Data Lake, Data Factory, and Synapse).
Education & Experience
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
- 4–8 years of relevant experience in data engineering, CI/CD, and DQ rule development.
- Proven expertise in Databricks, Azure data services, and low-code self-serve frameworks.