What awaits you? / Job Profile
Conversational AI is our external agent platform, enabling intelligent, scalable interactions across digital channels.
We are seeking a Fullstack GenAI Developer to join our BMW Techworks teams of highly skilled specialists responsible for designing and developing GenAI solutions.
Our services are primarily built and operated on the Azure Cloud Platform. In this role, you will work hands-on, contributing to architecture, implementation, and operational excellence while actively sharing knowledge with other team members.
This position is ideal for a GenAI Developer who has hands-on experience with Python, FastAPI, LLM APIs, MCP, the A2A protocol, LangChain, and LangGraph.
If you are enthusiastic about the latest advancements in multimodal AI, bring energy and ambition to your work, and are comfortable working hands-on with complex systems, this role offers the opportunity to make a meaningful impact.
What should you bring along?
- 3 to 6 years of backend software development experience
- 2 to 3 years of hands-on experience building LLM/GenAI applications
Key responsibilities
- Design and implement scalable backend APIs for GenAI applications.
- Build multi-agent AI systems using graph-based orchestration.
- Develop Retrieval-Augmented Generation (RAG) pipelines.
- Implement MCP-compliant tool servers and A2A-enabled agent communication.
- Integrate LLMs with internal and external tools securely.
- Build responsive frontend applications for AI interactions.
- Deploy AI systems to cloud environments with proper monitoring and observability.
- Optimize AI systems for latency, scalability, and cost efficiency.
- Implement evaluation frameworks and guardrails for production-grade AI reliability.
Must-have technical skills
Backend & Core Engineering
- Advanced proficiency in Python
  - Async programming (async/await)
  - Type hints, Pydantic
  - Performance optimization
- Strong experience building APIs using FastAPI
- RESTful API design and WebSocket implementation
- Solid understanding of OOP and system design principles
Generative AI & LLM Frameworks
- Hands-on experience with:
  - LangChain
  - LangGraph (graph-based multi-agent orchestration)
  - Tool-calling agents
  - Memory systems
  - ReAct / Plan-Execute patterns
- Strong prompt engineering skills
- Experience building RAG systems:
  - Embeddings
  - Vector databases (e.g., Pinecone, Weaviate, FAISS)
  - Chunking and retrieval optimization
Agent Protocols & Interoperability
- Practical experience with:
  - MCP (Model Context Protocol)
  - A2A (Agent2Agent communication)
  - Tool servers
  - Secure agent-to-agent communication
  - Context management strategies
- Multi-agent workflow orchestration using graph-based execution
Integration
- Integration of frontend with backend APIs
- Streaming responses (SSE/WebSockets)
- Authentication flows (JWT/OAuth)
Data & Infrastructure
- Redis (caching/session management)
- Vector database integrations
- Docker-based containerization
- Cloud deployment (AWS / Azure)
- Logging, monitoring, and tracing for AI systems
Good-to-have technical skills
- Fine-tuning LLMs (LoRA / PEFT)
- Model serving frameworks (vLLM, TGI)
- Experience with open-source LLMs (Llama, Mistral)
- Event-driven architecture
- Message brokers (Kafka, NATS)
- Workflow engines (Temporal, Prefect)
- Prompt injection mitigation techniques
- Role-based access control (RBAC)
- AI system evaluation frameworks