Azure Senior Data Lead

HCLTech - Bengaluru, Karnataka

Job details

8 days ago

Job Summary

Design and build optimization capabilities for Databricks - spanning Spark tuning, cluster right-sizing, job orchestration, DBU consumption, and Delta Lake storage.

Translate platform expertise into product features - detection rules, recommendation engines, and safe automated actions for production environments.

Build POCs to validate optimization ideas, demonstrate value, and support pre-sales engagements.

Partner cross-functionally with backend, AI/ML, and data engineering teams to ship features end-to-end.

Skill Requirements

Engineering experience, including hands-on experience with Databricks in production.

Apache Spark internals - Catalyst optimizer, Tungsten engine, AQE, DAG scheduler, shuffle behavior, partitioning, broadcast/sort-merge joins, data skew handling, and Spark 4.0 capabilities.

Databricks platform depth - Delta Lake (transaction log, OPTIMIZE, ZORDER, VACUUM, liquid clustering, schema evolution, time travel, CDC/merge), Lakeflow Declarative Pipelines, Unity Catalog (governance, lineage, fine-grained access), Photon engine, Databricks Workflows, Lakebase, and all cluster types (job, all-purpose, serverless SQL, serverless compute).

Databricks REST API & SDK - programmatic management of clusters, jobs, permissions, and workspace configuration.

Performance tuning - Spark UI interpretation, physical plans, shuffle/skew/spill diagnosis, join optimization, caching strategies, and Photon adoption decisions.

Cost optimization - DBU forecasting, cluster sizing, autoscaling policies, spot vs. on-demand trade-offs, instance pools, job-vs-all-purpose decisions, predictive optimization, serverless economics (Performance vs. Standard mode, serverless GPU, egress, DBU trade-offs).

Advanced Python & expert SQL; deep PySpark and Spark SQL internals.
Cloud platforms (AWS/Azure/GCP) - IAM, networking, storage (S3/ADLS/GCS), and the cloud-native services underpinning Databricks.
Experience with Docker, Kubernetes, Terraform, and modern CI/CD pipelines.
Strong fundamentals in data structures, algorithms, distributed systems, and large-scale system design.

MLflow, Mosaic AI ecosystem (Agent Framework, Agent Bricks, AI Gateway, Vector Search), feature stores, Databricks SQL Warehouses, or Databricks Asset Bundles.

FinOps practices and cost-attribution models for data platforms.
Observability tools - Prometheus, Grafana, OpenTelemetry, Datadog.
Contributions to open-source Spark/Delta/Databricks projects

Other Requirements

Databricks certifications a plus

BS/MS in Computer Science, Engineering, or related field
