Job Summary
Architect role for an experienced data and analytics professional with deep expertise in GenAI concepts, the Databricks ecosystem, Python, PySpark, and Amazon S3 to design secure, scalable hybrid solutions. The architect will shape AI and BI capabilities using AI/BI Genie and Databricks SQL, ensuring robust data foundations and workflows that drive measurable business outcomes for global stakeholders.
Responsibilities
Design end-to-end data and analytics architectures that use Databricks SQL, Databricks Delta Lake, PySpark, and Amazon S3 to deliver secure, scalable, and high-performing analytical platforms for business teams across regions.
Lead solution design for GenAI-enabled analytics use cases by defining data flows, model strategies, and orchestration patterns that connect AI/BI Genie, Databricks Workflows, and downstream reporting tools.
Develop detailed reference architectures and blueprints that standardize the use of Databricks Workflows, Delta Lake patterns, and Python-based transformations to accelerate future project deliveries.
Collaborate with product owners and business stakeholders to translate analytical pain points into clear architectural options that balance cost, performance, data quality, and time to market.
Guide development teams on best practices for PySpark job design, code structure, data partitioning, and performance optimization so that batch and streaming workloads remain reliable and efficient.
Establish robust data governance practices by defining naming standards, security models, access patterns, data catalog usage, and quality checks across the Databricks and Amazon S3 landscape.
Review solution designs and production implementations to identify risks in scalability, resilience, data integrity, and security, then propose concrete remediation actions and technical improvements.
Create architecture decision records and technical documentation that clearly explain design choices, tradeoffs, and operational guidelines for teams working in a hybrid model across different locations.
Partner with platform engineering and cloud operations teams to define resource sizing, monitoring, alerting, and cost optimization strategies for Databricks clusters and Amazon S3 storage.
Drive adoption of GenAI fundamentals by defining safe usage patterns, prompt design guidelines, and integration approaches that connect models with curated Delta Lake datasets.
Provide architectural oversight during development, testing, and deployment phases so that implemented solutions remain aligned with the approved target architecture and non-functional requirements.
Mentor engineers and data professionals on modern lakehouse concepts, including Delta Lake design, time travel, schema evolution, and data lifecycle management at enterprise scale.
Evaluate new capabilities in the Databricks, AI/BI Genie, and Python ecosystems, and recommend a pragmatic roadmap that improves productivity and business value without adding unnecessary complexity.
Work closely with information security and compliance teams to ensure that data solutions adhere to corporate controls, regulatory obligations, and privacy expectations across all environments.
Coordinate with project management to estimate effort, identify dependencies, and plan technical milestones that support predictable delivery of analytics products.
Engage with business stakeholders to showcase prototypes and reference solutions that demonstrate how responsible use of GenAI and advanced analytics can improve decisions and create positive societal outcomes.
Drive continuous improvement by capturing lessons learned from each implementation and updating architectural standards, patterns, and reusable components to benefit future projects.
Ensure that hybrid collaboration practices, including documentation, code reviews, and design workshops, enable effective contribution from team members regardless of location or schedule.
Qualifications
Possess twelve to sixteen years of progressive experience in data engineering, analytics, or architecture, with substantial exposure to large-scale enterprise environments.
Demonstrate expert-level proficiency in Databricks SQL, PySpark, and Delta Lake, including performance tuning, data modeling, and orchestration using Databricks Workflows.
Show strong hands-on experience with Python for data engineering, automation, and integration tasks in cloud-based data platforms.
Bring solid experience working with Amazon S3 for data lake design, secure storage policies, lifecycle management, and integration with analytics engines.
Exhibit practical knowledge of GenAI fundamentals and AI/BI Genie, including applying generative capabilities to analytics discovery, explanation, and user interaction patterns.
Display familiarity with modern BI and reporting ecosystems and how they integrate with Databricks and curated data models to serve self-service analytics consumers.
Demonstrate a thorough understanding of data governance concepts, including metadata management, data quality, stewardship, and role-based access models.
Possess strong communication and stakeholder management skills to explain complex architectural concepts in clear language tailored to technical and non-technical audiences.
Show experience working in hybrid work models using collaborative tooling, documentation standards, and remote-friendly practices that maintain high delivery quality.
Display a proven ability to mentor and guide technical teams through architectural reviews, code assessments, and structured feedback that elevate overall engineering practices.