COMPANY PROFILE
Bain & Company is one of the world’s top management consulting firms, partnering with ambitious organizations across 65 cities in 40 countries to achieve extraordinary results. Global Business Services (GBS) is a network of five hubs across India, Poland, Malaysia, Mexico, and Portugal — a 1,000+ strong team supporting operations, finance, tech, data analytics, and more. Our mantra: "shared innovation, seamless execution."
JOB SUMMARY
The Senior Associate Data Engineer is a hands-on technical role within the NGSS team focused on building and maintaining data pipelines, data models, and cloud data infrastructure. You will work closely with senior engineers to deliver reliable data solutions that support Bain case teams and clients globally. This role suits a data engineering professional with around 2 years of experience looking to grow in a high-impact consulting environment.
KEY RESPONSIBILITIES
Data Engineering — Pipelines, Modeling & Quality (80%)
- Build and maintain ETL/ELT pipelines to ingest and transform data from multiple sources into cloud data warehouses using tools such as Azure Data Factory, dbt, or Airflow.
- Write efficient SQL and Python scripts for data extraction, transformation, and workflow automation.
- Implement error handling, alerting, and incremental load strategies to ensure pipeline reliability and resilience.
- Develop and run Python-based data workloads and notebooks on Databricks using PySpark for large-scale data processing and transformation.
- Develop and maintain dimensional data models (star/snowflake schema) on Snowflake or Azure Synapse; support schema design and query performance optimization.
- Implement automated data validation checks and monitor pipelines for failures, drift, and anomalies.
- Maintain documentation of data sources, transformation logic, and data lineage to support governance requirements.
Software Engineering & Cloud/DevOps (20%)
- Apply software engineering best practices (version control with Git, modular code design, code reviews, and unit testing) to pipeline and transformation development.
- Build and maintain data transformation workflows using dbt on cloud warehouses such as Snowflake or Azure Synapse; manage models, tests, and documentation within dbt projects.
- Develop and run Python-based workloads and notebooks on Databricks, leveraging PySpark for large-scale data processing.
- Deploy pipelines and infrastructure to Azure (ADF, ADLS, Databricks) using CI/CD pipelines (GitHub Actions).
- Monitor deployed solutions, troubleshoot incidents (L2/L3), and escalate per established service protocols.
KNOWLEDGE & SKILLS
Technical
- Proficient in SQL; solid Python skills for data engineering tasks (pandas, PySpark) and scripting.
- Hands-on experience with dbt for data transformation, testing, and documentation on cloud warehouses.
- Working knowledge of Snowflake or Azure Synapse for data warehousing — including schema design, query optimization, and role-based access.
- Experience with Databricks for large-scale data processing using PySpark and Python notebooks.
- Practical Azure experience: Azure Data Factory, Azure Data Lake Storage (ADLS), and Databricks on Azure.
- Familiarity with CI/CD pipelines (Azure DevOps or GitHub Actions) for automated testing and deployment of data workflows.
- Basic scripting with Bash or PowerShell for automation and environment setup.
Professional
- Clear communicator — able to document data flows and explain technical concepts to non-technical stakeholders.
- Analytical, organized, and self-motivated with strong attention to detail.
- Comfortable working in Agile/Scrum teams with full participation in sprint ceremonies.
EXPERIENCE & EDUCATION
- 1-3 years of hands-on experience in data engineering or data integration.
- Demonstrated experience building production ETL pipelines in a cloud environment.
- Bachelor’s or Associate’s degree in Computer Science, Information Systems, Engineering, Statistics, or equivalent.