At Allstate, great things happen when our people work together to protect families and their belongings from life’s uncertainties. And for more than 90 years, our innovative drive has kept us a step ahead of our customers’ evolving needs, from advocating for seat belts, air bags, and graduated driving laws to being an industry leader in pricing sophistication, telematics, and, more recently, device and identity protection.
Job Description
Allstate’s Data & Analytics Technology organization is seeking a Data Engineer to design, build, and operate scalable, reliable, and high-performing data pipelines that support enterprise analytics, reporting, and advanced data use cases. In this role, you will focus on building robust batch and streaming data solutions using Apache Spark and modern cloud data platforms. Experience with Microsoft Fabric is a strong plus.
You will work closely with analytics engineers, data scientists, product teams, and platform partners to transform raw, complex data into trusted, analytics-ready datasets. This role plays a critical part in enabling data-driven decision-making by ensuring data quality, performance, scalability, and operational excellence across the data platform.
Primary Skill: Data Engineer, Apache Spark, ETL, Python
Location: Bangalore
Shift: 1 pm to 9:30 pm
Work from office (Hybrid)
- Design, build, and maintain scalable batch and streaming data pipelines using Apache Spark and cloud-native data technologies.
- Develop and optimize ETL/ELT workflows to ingest, transform, and curate data from diverse source systems into analytics-ready datasets.
- Implement data modeling and transformation logic to support reporting, dashboards, and downstream analytical and machine learning workloads.
- Build and manage data processing workloads within modern lakehouse platforms, including Microsoft Fabric / OneLake (preferred).
- Ensure data quality, reliability, and consistency by implementing validation checks, monitoring, and reconciliation processes.
- Optimize Spark jobs for performance, cost efficiency, and scalability across large and complex datasets.
- Manage and evolve data schemas while handling schema drift and upstream source changes.
- Develop reusable frameworks, libraries, and standardized patterns to improve data engineering productivity and consistency.
- Implement CI/CD pipelines for data workloads to enable automated testing, deployment, and rollback.
- Monitor data pipelines and jobs, troubleshoot failures, and resolve performance or data quality issues.
- Partner with analytics engineers, BI developers, and data scientists to understand data requirements and deliver curated datasets.
- Collaborate with platform, security, and governance teams to ensure data security, compliance, and proper access controls.
- Contribute to Agile delivery processes, including sprint planning, design reviews, and continuous improvement initiatives.
Required Qualifications
- Strong experience as a Data Engineer building and operating production data pipelines.
- Hands-on experience with Apache Spark for large-scale data processing.
- Proficiency in Python, SQL, and data transformation best practices.
- Experience with cloud-based data platforms and storage (e.g., Data Lakes, Lakehouse architectures).
- Familiarity with Microsoft Fabric, OneLake, or similar analytics platforms (strong plus).
- Experience designing and optimizing data models for analytical workloads.
- Understanding of distributed data processing concepts, performance tuning, and fault tolerance.
- Experience with CI/CD, version control, and infrastructure-as-code concepts.
- Strong problem-solving skills and ability to troubleshoot complex data issues.
- Excellent communication skills and ability to collaborate across technical and non-technical teams.
- 4+ years of experience in data engineering or an equivalent role (preferred).
Preferred / Nice‑to‑Have Skills
- Experience with real-time or event-driven data processing.
- Familiarity with data governance, metadata management, and data quality frameworks.
- Exposure to orchestration tools and workflow management systems.
- Experience supporting analytical, reporting, or machine learning use cases.
Primary Skills
Apache Spark, Data Engineering, Data ETL, ETL Tools, Python (Programming Language)
Recruiter Info
Hiral Parag Rughani
[email protected]
About Allstate
Joining our team isn’t just a job — it’s an opportunity. One that takes your skills and pushes them to the next level. One that encourages you to challenge the status quo. One where you can shape the future of protection while supporting causes that mean the most to you. Joining our team means being part of something bigger – a winning team making a meaningful impact.
The Allstate Corporation is one of the largest publicly held insurance providers in the United States. Ranked No. 84 in the 2023 Fortune 500 list of the largest United States corporations by total revenue, The Allstate Corporation owns and operates 18 companies in the United States, Canada, Northern Ireland, and India. Allstate India Private Limited, also known as Allstate India, is a subsidiary of The Allstate Corporation. The India talent center was set up in 2012 and operates under the corporation's Good Hands promise. As it innovates operations and technology, Allstate India has evolved beyond its technology functions to be the critical strategic business services arm of the corporation. With offices in Bengaluru and Pune, the company offers expertise to the parent organization’s business areas including technology and innovation, accounting and imaging services, policy administration, transformation solution design and support services, transformation of property liability service design, global operations and integration, and training and transition.