- Key Responsibilities
- Data Architecture for AI
- Architect AI data foundations, including ingestion, transformation, enrichment, and serving layers
- Design data architectures supporting RAG, embeddings, feature stores, and training data pipelines
- Define standards for data quality, lineage, versioning, and governance for AI workloads
- Ensure data platforms support scalability, performance, and low-latency AI use cases
- Data Quality Assurance
- Architect data validation and testing frameworks for AI and analytics systems
- Enable automated validation for data correctness, drift, bias, and completeness
- Define test strategies for data migration, data transformation, and AI readiness
- Collaborate with QE teams to embed data assurance into pipelines and platforms
- Platform Integration
- Integrate data platforms with AI services and analytics tools
- Define secure access patterns for data used in training, inference, and evaluation
- Enable observability for data pipelines and AI data consumption
- Guide teams on best practices for AI-enabled BI and data-driven systems
- Core Platforms, Frameworks, and Tooling
- LLM and foundation model platforms (e.g., AWS Bedrock, Azure OpenAI, Vertex AI)
- Agentic AI and orchestration frameworks (LangChain, LangGraph, CrewAI, AutoGen, Google ADK, or equivalent)
- CI/CD and MLOps tooling for AI pipelines (GitHub Actions, Azure DevOps, Jenkins)
- Data ingestion and processing platforms (Spark, Kafka, cloud-native ETL/ELT frameworks)
- Data quality and validation frameworks (Great Expectations, Amazon Deequ, custom reconciliation frameworks)
- Feature stores and embedding pipelines (Feast, embedding generation pipelines, vector databases)
- Data drift, bias, and consistency monitoring tools (Evidently, statistical data quality monitors)
- Metadata, lineage, and governance platforms (DataHub, Apache Atlas, cloud data catalogs)
- AI-enabled analytics and Generative BI platforms (Power BI with Copilot, semantic layers, NLQ-enabled BI)
- Cloud-native data platforms and storage (object storage, distributed query engines, data lakehouses)
- Client Orientation and Leadership
- Partner with product and engineering teams to identify Data for AI opportunities and shape roadmaps
- Support client workshops, RFPs, and solution presentations
- Mentor engineers on AI/ML and Gen AI best practices and emerging technologies
- Translate complex AI concepts into business-friendly narratives
- Must-Have Qualifications
- 13 years of experience in software engineering, with 3 years in AI and strong architecture ownership
- Strong expertise in data engineering, data quality, and data governance
- Experience supporting AI use cases such as RAG, feature engineering, and model training
- Proficiency with data platforms, cloud services, and distributed data systems
- Solid understanding of QE practices related to data validation and testing
- Good-to-Have Skills
- Experience with Generative BI or AI-assisted analytics
- Knowledge of metadata management, lineage tools, and data observability
- Exposure to AI ethics and bias in datasets
- Cloud data certifications
Technology->Machine Learning->Generative AI->retrieval augmented generation (rag),Technology->Data Engineering->Databricks,Technology->Data Engineering->Palantir Foundry,Technology->Data Management->Data Architecture->Data Architecture - Data Modeling,Technology->Embedded Software->Matlab,Technology->Agile Testing->Agile Testing - ALL->CD/CI,Technology->Integration->Confluent Kafka,Technology->Big Data - Data Processing->Spark