Data engineer

Technology Next
Remote

Job details

Full-time | Contractual / Temporary
₹80,000 - ₹1,10,000 a month
3 hours ago

Qualifications

CI/CD
Performance tuning
Data modeling
Azure
Big data
Spark
Git
Google Cloud Platform
SQL
AWS
Terraform
Continuous integration
Scrum
Unity Catalog
ETL
Agile
S3
Apache
Kafka
Redshift
Data warehouse
Python

Full job description

Data Engineer – Databricks & PySpark

Location: Remote
Experience: 6–8 Years
Employment Type: Contract / Full-Time

Job Summary

We are seeking an experienced Data Engineer with strong expertise in Databricks and PySpark to design, develop, and optimize scalable data pipelines and cloud-based data platforms. The ideal candidate should have hands-on experience with big data technologies, cloud services, ETL/ELT processes, and modern data engineering practices.

Required Skills & Experience

6–8 years of experience in Data Engineering, Data Warehousing, and Big Data solutions.
Strong hands-on experience with Databricks, PySpark, and Apache Spark.
Expertise in building and maintaining large-scale ETL/ELT pipelines.
Strong proficiency in Python and SQL.
Experience with Delta Lake, Unity Catalog, and Databricks Workflows.
Hands-on experience with cloud platforms such as Azure, AWS, or Google Cloud Platform (GCP).
Experience with cloud storage solutions: Azure Data Lake Storage (ADLS), Amazon S3, or Google Cloud Storage.
Knowledge of data ingestion tools and frameworks.
Experience with Azure Data Factory (ADF), AWS Glue, or similar ETL orchestration tools.
Strong understanding of Data Lake, Data Warehouse, and Lakehouse Architecture.
Experience with Apache Kafka, Azure Event Hubs, or other streaming technologies.
Knowledge of CI/CD pipelines, Git, and DevOps practices.
Familiarity with workflow orchestration tools such as Apache Airflow.
Experience working with structured, semi-structured, and unstructured data.
Understanding of data modeling, partitioning, performance tuning, and optimization techniques.
Experience implementing data quality, governance, and security best practices.

Key Responsibilities

Design, develop, and optimize scalable data pipelines using Databricks and PySpark.
Build and maintain data ingestion, transformation, and processing frameworks.
Develop batch and real-time data processing solutions.
Implement Delta Lake and Lakehouse architecture best practices.
Collaborate with Data Scientists, Analysts, and Business stakeholders to deliver data solutions.
Optimize Spark jobs for performance, scalability, and cost efficiency.
Create and maintain data models, data marts, and enterprise data warehouses.
Implement monitoring, logging, and troubleshooting processes for data platforms.
Ensure data quality, governance, security, and compliance standards are maintained.
Participate in code reviews, architecture discussions, and technical design sessions.

Preferred Qualifications

Experience with Azure Databricks is highly preferred.
Knowledge of Snowflake, Redshift, BigQuery, or Synapse Analytics.
Experience with Infrastructure as Code (Terraform).
Databricks, Azure, AWS, or GCP certifications are a plus.
Experience in Agile/Scrum environments.

Mandatory Technologies

Databricks, PySpark, Apache Spark, Python, SQL, Delta Lake, Data Lake, ETL/ELT, Cloud Platform (Azure/AWS/GCP), Airflow, Git, CI/CD, Kafka, Data Warehousing.

Pay: ₹80,000 - ₹1,10,000 per month

Experience:

Big data: 5 years (Preferred)
Data warehouse: 4 years (Preferred)

Work Location: Remote
