Project Role : Data Platform Engineer
Project Role Description : Assists with the data platform blueprint and design, encompassing the relevant data platform components. Collaborates with the Integration Architects and Data Architects to ensure cohesive integration between systems and data models.
Must have skills : Data Engineering
Good to have skills : Python (Programming Language), Generative AI
Minimum 5 year(s) of experience is required
Educational Qualification : 15 years full time education
Summary:
As a Data Platform Engineer, a typical day involves contributing to the development and refinement of the data platform blueprint and design, ensuring that all components align effectively. This role requires close collaboration with Integration Architects and Data Architects to maintain seamless integration between various systems and data models. The position demands active engagement in coordinating efforts across teams to support the overall data infrastructure, fostering a cohesive environment where data solutions are thoughtfully planned and executed to meet organizational needs.
Roles & Responsibilities:
- Expected to be an SME who collaborates with and manages the team to deliver.
- Responsible for team decisions.
- Engage with multiple teams and contribute to key decisions.
- Provide solutions to problems for the immediate team and across multiple teams.
- Lead the implementation of data platform strategies to ensure scalability and reliability.
- Facilitate knowledge sharing and mentorship within the team to support professional growth.
- Coordinate cross-functional efforts to align data platform initiatives with business objectives.
Professional & Technical Skills:
- Must Have Skills: Proficiency in Data Engineering, Python (Programming Language), Generative AI.
- Strong experience with Python and SQL for data processing, pipeline development, and Generative AI.
- Strong expertise in designing and managing data pipelines and workflows.
- Hands-on with Azure data services: Data Lake (ADLS), Data Factory (ADF), Synapse, Purview.
- Familiarity with Databricks (Delta Lake, Spark jobs, notebooks).
- Experience with vector databases and embedding pipelines for GenAI/RAG.
- Knowledge of data governance, security, and compliance frameworks.
- CI/CD experience (Azure DevOps, GitHub Actions) and containerization (Docker/Kubernetes basics).
- Exposure to LangChain/LangGraph, CrewAI, or similar agent orchestration frameworks.
- Experience with Azure OpenAI, Cognitive Search, and AI Studio.
- Familiarity with data quality frameworks (Great Expectations, Deequ).
- Knowledge of streaming (Kafka/Event Hubs) and real-time data products.
- Performance tuning for large-scale embeddings and distributed compute.
- Experience with data storage solutions and data modeling techniques.
- Familiarity with cloud-based data platforms and integration tools.
- Ability to optimize data processing for performance and scalability.
Additional Information:
- The candidate should have a minimum of 5 years of experience in Data Engineering.
- This position is based at our Pune office.
- 15 years of full-time education is required.