Key Responsibilities
Architectural Leadership: Design end-to-end distributed systems for data fusion, ensuring high availability (HA), fault tolerance, and seamless horizontal scaling.
Data Integration & Streaming: Architect low-latency data pipelines using Kafka for real-time stream processing and Spark for complex batch transformations.
Search & Discovery: Implement high-performance indexing and retrieval layers using Solr or Elasticsearch to power complex multi-dimensional queries.
AI Strategy: Integrate Generative AI and Agentic AI frameworks into the data lifecycle, enabling autonomous data cleaning, metadata generation, and intelligent insight discovery.
System Performance: Conduct deep-dive performance tuning across the stack, from JVM (Java/J2EE) optimization to Linux kernel-level tweaks and RDBMS query execution plans.
Security & Compliance: Ensure the platform adheres to rigorous security standards, including encryption of data at rest and in motion, RBAC, and privacy-preserving data fusion techniques.
Production Excellence: Lead the transition from POC to production, implementing rigorous monitoring, CI/CD, and site reliability engineering (SRE) practices for large-volume systems.
Technical Requirements
Core Engineering
Languages: Expert-level proficiency in Java/J2EE and a strong grasp of the Java memory model and concurrency.
Big Data & Streaming: Hands-on experience with Apache Spark (Scala/Python/Java) and Apache Kafka (architecting producers, consumers, and Kafka Streams applications).
Storage & Search: Advanced knowledge of RDBMS (PostgreSQL/Oracle) and NoSQL, alongside expert-level implementation of Elasticsearch or Solr.
Environment: Strong Linux/Unix administration skills, including shell scripting and system performance monitoring tools.
Emerging AI & Modern Paradigms
GenAI & LLMs: Understanding of RAG (Retrieval-Augmented Generation) architectures and fine-tuning strategies.
Agentic AI: Experience designing autonomous agents that can interact with data environments to perform complex tasks.
Vibe Coding: Familiarity with high-level, intent-based development and leveraging AI-augmented coding environments to accelerate the development lifecycle.
Systems Design
Proven track record of managing large volumes of data (petabyte scale).
Deep understanding of reliability engineering (circuit breakers, backpressure, retries).
Experience with distributed systems concepts and patterns (CAP theorem, consensus algorithms, sharding).