About Adobe's Firefly Group
Join a team where ambition meets brand new ideas! Adobe's Firefly Group offers an outstanding opportunity to be at the forefront of innovative technology. As leaders in digital experiences, we are determined to transform how companies interact with customers across every screen. Our mission is to empower engineers and researchers by providing world-class cloud solutions for ML workflows, using the latest technology and frameworks. If you are passionate about crafting flawless digital experiences and are eager to compete in a dynamic environment, this is the place for you.
Position Summary
The AI Platform team builds the compute infrastructure that enables AI workloads at scale — spanning job scheduling, resource management, and the control plane that ASML engineers at Adobe depend on daily. We sit at the intersection of distributed systems and machine learning infrastructure, building the foundational services that power model training across Adobe Firefly.
We are seeking a seasoned Senior ML Platform Engineer to join our ambitious team in Noida/Bangalore. This role is ideal for a technical leader who can own and evolve complex scheduling and compute orchestration systems, work closely with ASML engineers as the primary consumers of the platform, and drive architectural decisions that meaningfully improve developer productivity and system reliability.
What You'll Do
- Lead the design, implementation, and maintenance of the core services of the AI compute control plane: job scheduling, cluster management, resource quota enforcement, and compute lifecycle management
- Own the control plane services that manage GPU/CPU workload orchestration — from job submission through execution, monitoring, and teardown
- Design reliable, fault-tolerant worker services and supervisor patterns for long-running compute workloads
- Build and evolve the data layer that tracks job state, cluster state, and resource ownership across the platform
- Partner closely with ASML engineers to deeply understand their workflows and translate requirements into robust platform capabilities
- Develop and maintain Python SDKs and CLIs that ML engineers use to interact with the platform — prioritizing developer experience and reliability
- Drive end-to-end ownership of features — from API design and data modelling through deployment and production operations
- Establish observability standards (metrics, tracing, alerting) for scheduling and compute systems
- Lead incident response and root cause analysis for production issues in compute orchestration
- Mentor junior and mid-level engineers on system design, scheduling patterns, and platform engineering best practices
What We're Looking For
- B.Tech / M.Tech degree in Computer Science from a premier institute
- 9+ years of proven experience in backend platform engineering, distributed systems, or infrastructure software
- Strong computer science fundamentals — particularly in distributed systems, concurrency, and system design
- Experience building or operating job scheduling, workflow orchestration, or compute management systems (e.g. Argo, Airflow, Ray, Slurm, or similar)
- Proficiency in Python and/or Java, with strong async programming skills
- Experience designing and operating services backed by relational databases (PostgreSQL preferred) at scale
- Deep understanding of cloud platforms, with a preference for AWS; familiarity with Azure or GCP is a plus
- Proven track record of working directly with internal engineering customers (ML engineers, researchers) to shape platform roadmap
- Strong problem-solving skills with the ability to own ambiguous, complex systems independently
- Experience with Kubernetes at the workload/scheduling layer (not just operations)
Good to Have
- Hands-on experience with ML training workflows, distributed training frameworks (PyTorch, TensorFlow), or GPU resource management
- Familiarity with gRPC/protobuf or event-driven architectures
- Experience building developer-facing internal platforms consumed by ML or research teams
- Prior work in an AI platform, MLOps, or compute infrastructure role
- Understanding of the ML lifecycle: experiment tracking, model versioning, and training pipelines
About Adobe
Adobe empowers everyone to create through innovative platforms and tools that unleash creativity, productivity, and personalized customer experiences. Adobe’s industry-leading offerings, including Adobe Acrobat Studio, Adobe Express, Adobe Firefly, Creative Cloud, Adobe Experience Platform, Adobe Experience Manager, and GenStudio, enable people and businesses to turn ideas into impact, powered by AI and driven by human ingenuity.
Our 30,000+ employees worldwide are creating the future and raising the bar as we drive the next decade of growth. We’re on a mission to hire the very best and believe in creating a company culture where all employees are empowered to make an impact. At Adobe, we believe that great ideas can come from anywhere in the organization. The next big idea could be yours.
Let’s Adobe together
At Adobe, we believe in creating a company culture where all employees are empowered to make an impact. Learn more about Adobe life, including our values and culture, focus on people, purpose and community, Adobe for All, comprehensive benefits programs, the stories we tell, the customers we serve, and how you can help us advance our mission of empowering everyone to create.
Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.
Adobe aims to make our Careers website and recruiting process accessible to all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email [email protected] or call +1 408-536-3015.
AI Use Guidelines for Interviews:
Our interviews are designed to reflect your own skills and thinking. The use of AI or recording tools during live interviews is not permitted unless explicitly invited by the interviewer or approved in advance as part of a reasonable accommodation. If these tools are used inappropriately or in a way that misrepresents your work, your application may not move forward in the process.
At Adobe, we empower employees to innovate with AI, and we look for candidates eager to do the same. As part of the hiring experience, we provide clear guidance on where AI is encouraged during the process and where it’s restricted during live interviews. See how we think about AI in the hiring experience.