Talentica Software, started by industry veterans and ex-IITB grads, is a product engineering company that helps tech-native enterprises and startups turn their ideas into market-leading products. We deliver innovative, high-quality products at an accelerated pace by combining the product mindset of our human experts with the power of AI. Over the last 22 years, the company has worked with more than 200 startups, most of them based in the US, and has contributed to many successful exits.
In 2022, Great Place to Work® recognized Talentica Software as India's Great Mid-Size Workplace.
What we're looking for?
We are seeking a Senior Data Engineer with strong experience in building modern, scalable data platforms. You will work closely with customers, product teams, and engineering stakeholders to design and implement enterprise-grade data analytics solutions.
This role requires deep expertise in data engineering, data modeling, and lakehouse architectures, along with the ability to lead technical discussions, mentor engineers, and drive best practices.
What you’ll do?
Build enterprise-grade data analytics platforms from scratch.
Partner with customers on their product vision, roadmap, and goals, and produce the high-level and low-level technical designs.
Build modern data analytics platforms using Apache Iceberg v2/v3, data lakehouse or Medallion architecture, AWS SageMaker Studio/Microsoft Fabric/Databricks, Python, Spark, AWS Glue, Kafka, and Debezium.
Design and implement metadata-driven, multi-source data integration pipelines for relational and non-relational databases and file systems, processing gigabytes of data.
Implement PHI-compliant data processing with automated data anonymization and PHI data stripping.
Implement data quality frameworks and validation pipelines to grade data quality.
Perform data modeling and database design to ensure sub-minute query performance.
Collaborate with tech experts, share learnings, and conduct research and POCs to build reusable solutions.
Mentor peer engineers, lead code reviews, conduct architecture discussions, publish Architecture Design Review documents, and drive technical excellence.
Lead research, POCs, and prototypes to build reusable solutions and cool products in different domains like healthcare, media, IoT, e-commerce, mobile, networking, and lots more.
To be successful in this role, you should have
Qualification: BE/BTech in Computer Science from a recognized university.
Experience: 6 to 8 years of hands-on experience in building data platforms and data engineering solutions
Skills
Strong proficiency in Python and PySpark for large-scale data processing
Strong expertise in data modeling and schema design, including:
Star and Snowflake schemas
Normalized schemas (1NF, 2NF, 3NF)
Slowly Changing Dimensions (SCD Type 1/2/3)
Data Vault (Hubs, Links, Satellites)
Denormalized/Wide-table patterns
Strong SQL skills with experience in relational databases (PostgreSQL preferred). Proficient in joins (self, natural, left, right, inner, and outer) and in analytical operations, including window functions such as RANK used with PARTITION BY.
Experience optimizing large-scale pipelines: partitioning data by date/key, using columnar file and table formats (Parquet/Iceberg), implementing incremental processing with checkpoints, tuning Spark with broadcast joins and proper executor sizing, leveraging data skipping with Z-ordering, and processing in parallel with distributed compute (Spark/Glue/EMR).
Experience with AWS cloud services (S3, Glue, Athena, EMR, EC2, SageMaker)
Experience working in a fast-paced Agile environment, with strong attention to detail and commitment to quality
Hands-on experience with the AWS SageMaker Studio, Microsoft Fabric, or Databricks platform across areas such as Workspace, Clusters, SQL, Workflows, Catalog, and Monitoring
Hands-on experience in Apache Iceberg or similar lakehouse technologies
Deep expertise in data warehouse and lakehouse architectures (Medallion and others) using Databricks and open-source stacks
Knowledge of data quality frameworks and validation techniques
Must-Have Skills
Strong experience with Data Lakehouse/Medallion architectures
Hands-on data engineering experience using AWS SageMaker Studio (or Microsoft Fabric/Databricks)
Experience with AWS Glue, EMR, Athena, and S3
Expertise in Apache Iceberg (v2/v3)
Strong programming skills in Python, Apache Spark, and SQL
Good-to-Have Skills
Experience with Debezium (CDC) and Apache Kafka
Ability to lead data modeling and schema design initiatives
Experience authoring Architecture Design Review (ADR) documents
What you’ll find here?
A culture of innovation: We only take up projects that challenge us to innovate. Our customers come to us for our technology expertise.
Endless learning opportunities: Continuous learning is baked into our DNA. You’ll always have the chance to learn new things and stay on top of the latest trends.
Talented peers: Work alongside engineers from IITs, NITs, BITS, and other premier institutions.
Work-life balance: We value work-life balance and offer flexible schedules, including remote work options, so you can thrive both professionally and personally.
A great culture: Our employees love working here! 82% recommend Talentica to their friends, according to Glassdoor. Join us, and you’ll see why!
Recognition & rewards: We don’t just work hard, we celebrate success. Your contributions won’t go unnoticed. We’ll make sure you're recognized for the amazing work you do.
Ready to Make an Impact?
Fill out the lead form below, and we’ll get in touch!