We are seeking an AI Engineer to set up and manage secure on-premises AI infrastructure, with a strong focus on Security Operations Center (SOC) use cases. The role includes building RAG pipelines, working with vector databases, developing custom models, and creating intelligent tool-integrated chatbots/agents to enhance threat intelligence, incident response, and SOC automation.
Design, deploy, and maintain on-premises AI infrastructure for hosting large language models.
Build end-to-end RAG (Retrieval-Augmented Generation) pipelines tailored for SOC workflows.
Work with vector databases to store and retrieve embeddings of security data efficiently.
Develop and fine-tune custom AI models (including domain-specific fine-tuning using LoRA/QLoRA).
Build intelligent chatbots and AI agents with tool integration (function calling, API integrations, external tool orchestration).
Set up high-performance inference servers (vLLM, TensorRT-LLM, Hugging Face TGI, Ollama, etc.).
Containerize and orchestrate AI workloads using Docker, Kubernetes, and Helm.
Integrate AI chatbots/agents with SOC tools (SIEM, EDR, ticketing systems, threat intelligence platforms, etc.).
Optimize models for GPU clusters with quantization, distillation, and performance tuning.
Ensure security, monitoring, scalability, and compliance of all on-prem AI systems.
Collaborate with SOC analysts and with cybersecurity and DevOps teams.
Education: Bachelor’s or Master’s in Computer Science, AI/ML, Cybersecurity, or equivalent.
Strong experience in building RAG pipelines (LangChain, LlamaIndex, Haystack, etc.).
Practical knowledge of vector databases (Chroma, Weaviate, pgvector, Milvus, Qdrant, FAISS, etc.).
Experience in developing custom models (fine-tuning open-source LLMs for specific domains).
Proven experience in building chatbots and AI agents with tool integration (function calling, ReAct agents, LangGraph, tool orchestration, API integrations).
Strong Python skills with PyTorch and Hugging Face Transformers.
Hands-on experience with GPU acceleration (CUDA, TensorRT, ONNX) and on-prem deployment.
Solid knowledge of Docker, Kubernetes, and Linux environments.