Founded in 1976, CGI is among the world's largest independent IT and business consulting services firms. With 94,000 consultants and professionals globally, CGI delivers an end-to-end portfolio of capabilities, from strategic IT and business consulting to systems integration, managed IT and business process services, and intellectual property solutions. CGI works with clients through a local relationship model complemented by a global delivery network that helps clients digitally transform their organizations and accelerate results. CGI's reported revenue for Fiscal 2024 is CA$14.68 billion, and CGI shares are listed on the TSX (GIB.A) and the NYSE (GIB). Learn more at cgi.com.
Job Title: Databricks Solution & AI Architect
Position: AC
Experience: 10+ years of experience
Job location: Bangalore / Chennai
Position ID: J0526-2137
Work Type: Hybrid
Employment Type: Full Time / Permanent
Qualification: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
We are looking for a Senior Databricks Architect with deep expertise in Lakehouse platform design, data engineering at scale, and AWS cloud integration. This role leads solution design and delivery of enterprise data products, with hands-on capability to implement AI/GenAI features where they add clear business value.
The right candidate owns the full data platform: from medallion architecture and Unity Catalog governance to performance-tuned pipelines and stakeholder-facing analytics. AI capability is a meaningful differentiator, not the primary mandate — applied where data readiness and ROI are established.
Primary Responsibilities
1. Core Platform & Architecture (Primary Focus)
- Lead end-to-end design of Databricks Lakehouse solutions — medallion architecture (Bronze/Silver/Gold) optimised for performance, cost, and downstream consumption.
- Design and build production-grade data pipelines using PySpark, Databricks SQL, Auto Loader, and Delta Live Tables.
- Architect event-driven and batch ingestion frameworks with strong SLA management, data quality rules, and lineage tracking.
- Define Data Product blueprints: domain boundaries, ownership, contracts, metadata standards, and discoverability.
- Lead Unity Catalog implementation — fine-grained permissions, data lineage, row/column security, and cross-workspace governance.
- Own AWS integration: S3 bucket design, IAM roles, VPC networking, and secure credential management for Databricks workspaces.
- Drive DevOps practices using Git, Databricks CLI/SDK, Repos, and CI/CD pipelines for infrastructure and pipeline deployments.
- Optimise Spark workloads using Photon, Serverless compute, and cluster tuning for cost efficiency.
- Build and publish Power BI dashboards — DAX measures, data models, DirectQuery/Import mode — for business stakeholders.
- Lead architecture reviews, performance tuning, and code quality standards across engineering teams.
- Evaluate new Databricks platform features and integrate them into the technical roadmap.
2. AI / GenAI Enablement (Supporting Focus)
- Implement Databricks AI/BI Genie spaces to enable natural language querying for non-technical business users.
- Design and deploy RAG pipelines using Databricks Vector Search integrated with Delta Lake for real-time data freshness.
- Productionise ML models and AI agents using MLflow, Feature Store, and Mosaic AI Model Serving.
- Support AI use case discovery workshops — identifying where data is AI-ready and scoping feasibility of RAG or Agentic workflows.
- Implement LLMOps basics: monitoring, evaluation loops, and CI/CD for AI model assets.
- Apply Unity Catalog governance to AI assets — model permissions, vector index access controls, and audit trails.
Technical Skills Required
Databricks & Data Engineering
- PySpark, SQL, Python — production pipeline mastery
- Delta Lake, Auto Loader, Delta Live Tables
- Unity Catalog — governance, lineage, security
- Databricks Workflows & SQL Warehouses
- Medallion architecture & Data Mesh principles
- Performance tuning: Photon, Serverless, caching
- Databricks CLI/SDK, Repos, CI/CD integration
- Data Product design: contracts, versioning, SLAs
- Power BI: DAX, DirectQuery/Import, data models
AWS Cloud & DevOps
- S3, IAM, VPC, Glue, Lambda, Step Functions
- CloudWatch — monitoring, alerts, observability
- Terraform — workspace, clusters, UC policies, jobs
- Git-based CI/CD for infra and pipelines
AI / GenAI (Working Knowledge)
- Mosaic AI, MLflow, Feature Store, Model Serving
- Databricks Vector Search & RAG pipeline design
- AI/BI Genie — space setup and query tuning
- LLM integration basics (Bedrock / Model Serving)
- AI code assistants: Copilot, Gemini, Claude Code
Qualifications
- 8–12 years of professional experience in Data Engineering and cloud architecture.
- Minimum 4 years of hands-on experience with Databricks at production scale.
- Demonstrated delivery of Data Products on cloud Lakehouse platforms (AWS preferred).
- Proven experience leading RFP technical tracks, scoping complex projects, and producing effort estimations.
- 1–2 years of exposure to productionising AI/ML solutions on Databricks is a strong advantage.
Soft Skills
- Strong ownership mindset — takes end-to-end accountability for platform reliability and data quality.
- Excellent communication: able to translate complex architecture into clear business value for C-suite and product stakeholders.
- Engineering leadership: mentors teams, drives code standards, and influences architectural decisions at an enterprise level.
- Pragmatic approach to AI — champions use cases where data readiness exists; avoids over-engineering.
- Continuous innovation mindset with focus on automation, performance optimisation, and platform reusability.
CGI is an equal opportunity employer. In addition, CGI is committed to providing accommodation for people with disabilities in accordance with provincial legislation. Please let us know if you require reasonable accommodation due to a disability during any aspect of the recruitment process and we will work with you to address your needs.