Overview
We are looking for a Data Engineer to design, build, and maintain the scalable data pipelines and infrastructure that power analytics, machine learning, and business intelligence across the organization. You will work closely with data scientists, analysts, and product teams to keep data reliable, high-quality, and accessible across multiple domains.
Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured data from multiple sources.
- Build and optimize data models and data warehouses/lakes to support analytics and reporting needs.
- Ensure data quality, integrity, and consistency across production datasets and downstream consumers.
- Develop and maintain batch and real-time data processing systems.
- Work with stakeholders (product managers, analysts, data scientists) to translate business requirements into robust data solutions.
- Implement monitoring, logging, and alerting systems for data pipelines.
- Optimize query performance and data storage efficiency for large-scale datasets.
- Collaborate with ML teams to support feature engineering and data availability for models.
- Enforce data governance, security, and compliance standards.
Required Skills & Qualifications
- 3–7 years of experience in Data Engineering, Analytics Engineering, or Backend/Data Systems roles
- Strong proficiency in SQL and Python
- Experience with distributed data processing frameworks such as Spark or Flink
- Hands-on experience with data warehouses/lakes (e.g., Snowflake, BigQuery, Redshift, Databricks)
- Experience building ETL/ELT pipelines using orchestration and transformation tools such as Airflow, dbt, or similar
- Strong understanding of data modeling, schema design, and relational databases
- Experience working with cloud platforms (AWS / GCP / Azure)
- Familiarity with API integration and data ingestion from multiple sources
- Understanding of data quality, testing, and validation frameworks
Preferred Qualifications
- Experience with streaming systems (Kafka, Kinesis, etc.)
- Exposure to MLOps or feature stores
- Experience with RAG pipelines, vector databases, or search systems (Elasticsearch, Pinecone, etc.)
- Experience in high-scale environments (data systems at terabyte scale or larger)
What You Will Work On
- Large-scale data pipelines powering product analytics and machine learning systems
- Real-time and batch data processing infrastructure
- Data platforms supporting business-critical reporting and decision-making
- Optimization of data systems handling multi-terabyte datasets
Pay: ₹1,500,000.00 - ₹2,000,000.00 per year
Benefits:
- Flexible schedule
- Paid time off
- Work from home
Work Location: Remote