Company Details:
We are trusted as one of the leading IT-enabled services providers, with a track record of consistently delivering workable and robust solutions. We achieve this through continual innovation, a commitment to quality, ongoing process refinement, and technological expertise. With the best software and hardware environments coupled with state-of-the-art communication facilities, our offices are fully equipped to work as virtual extensions of clients’ environments, providing 24×7 services.
- Founded in 1997 in Ahmedabad, one of the fastest-growing metros in India
- Branch offices in India, USA and Canada
- Multi-million US$ turnover with CAGR of 20%
- 1000+ certified and skilled professionals serving 300+ clients globally
- Offering end-to-end solutions to meet IT and ICT needs of clients
Designation: Lead Data Engineer
Experience: 8+ years
Work Location: Ahmedabad
KEY RESPONSIBILITIES:
1. Data Platform Architecture
- Design and architect scalable Lakehouse solutions using Databricks, Delta Lake, and Unity Catalog
- Build and maintain multi-layered data architecture (Bronze → Silver → Gold)
- Define standards for data modeling, partitioning, indexing, Z-ordering, and performance optimization
- Establish data governance, lineage, and metadata strategy
2. Data Pipeline Development
- Develop ETL/ELT pipelines using PySpark, Spark SQL, Scala, and Python
- Build batch and near real-time pipelines using Structured Streaming, Auto Loader, and Delta Live Tables (DLT)
- Implement advanced transformations, incremental processing, and SCD patterns
- Integrate pipelines with external systems (ADF/Glue/Kafka/Event Hub)
3. Databricks Platform Leadership
- Lead cluster configuration and optimization (job clusters, all-purpose clusters, Photon)
- Manage Databricks Workflows, Repos, Jobs, Secrets
- Improve platform reliability, observability, and cost optimization
- Set up and maintain Unity Catalog for access control and data governance
4. Cloud & DevOps
- Build CI/CD for Databricks using GitHub Actions, Azure DevOps, or GitLab
- Implement IaC using Terraform (clusters, jobs, catalogs, schemas, permissions)
- Manage deployments across Dev/Test/Prod environments
- Implement automated testing (unit, integration, quality checks)
5. Leadership & Collaboration
- Lead and mentor a team of Data Engineers
- Work closely with Data Architects, Data Scientists, Analysts, and Product Owners
- Translate business requirements into scalable technical solutions
- Ensure timely delivery with high-quality engineering standards
SKILLS AND EXPERIENCE
- 8+ years of Data Engineering experience
- 3+ years hands-on with Databricks (mandatory)
- Strong in PySpark / Spark SQL / Scala / Python
- Deep understanding of Delta Lake, ACID transactions, Z-ordering, and file optimization
- Experience with Azure (ADLS, ADF, Event Hub), AWS (S3, Glue, Kinesis), or GCP data services
- Strong experience with SQL (analytical, tuning)
- Knowledge of streaming data pipelines & event-based architecture
- Familiarity with Kafka, Event Hub, or Kinesis
- Strong understanding of Lakehouse, DataOps, and governance practices
DevOps & Tools
- Git branching strategies
- CI/CD pipelines for data workloads
- Terraform or similar IaC tools
- Logging & monitoring (Datadog, Azure Monitor, CloudWatch)
- Unit testing frameworks (pytest, dbx, assert libraries)
Job Type: Full-time
Pay: ₹600,000.00 - ₹1,500,000.00 per year
Work Location: In person