Job Description - Senior Azure DevOps / Cloud Engineer
Role: Senior Azure DevOps / Cloud Engineer
Location: Mumbai - onsite
Experience: 7-10 years overall, with 5+ years hands-on in Azure DevOps and cloud infrastructure
Role summary
We are hiring a senior Azure DevOps / Cloud engineer to embed with a large retail enterprise’s technology operations team. The role is hands-on and client-facing: you will own CI/CD and infrastructure automation, design secure and cost-efficient cloud architecture, drive incident response, and support InfoSec and audit requirements. You should be able to design under pressure, justify your decisions to senior stakeholders, and communicate clearly and concisely. This is a senior, design-and-own role. You will be expected to make architecture trade-offs, set security posture, manage cost, and explain your reasoning to technical leadership.
Must-have skills and experience
• Azure (primary cloud):
  - 5+ years building and operating workloads on Microsoft Azure. Azure is the primary estate, so deep fluency in Azure services and terminology is essential (AKS, Azure VMs, Blob Storage, ACR, Azure DevOps Pipelines, Azure Application Gateway, Key Vault, Defender for Cloud).
• CI/CD and automation:
  - Designing and operating multi-stage CI/CD pipelines (Azure DevOps Pipelines / GitLab CI): build, containerize, deploy across dev / staging / production.
  - GitOps with ArgoCD for continuous delivery and configuration synchronisation.
• Kubernetes (AKS):
  - Production AKS operation and architecture: cluster design, node pools, node selectors / taints / tolerations, resource requests and limits, HPA / cluster autoscaler / Karpenter, policy engines (Kyverno / OPA).
  - GPU workload scheduling: pinning GPU pods to GPU node pools, GPU resource quotas, isolating GPU vs non-GPU workloads for cost control.
  - Helm charts, OCI artifact management.
• Infrastructure as Code:
  - Terraform: reusable modules and templates with security and encryption baked in by default (encryption at rest, private networking, RBAC, IP whitelisting).
• Incident management and troubleshooting:
  - P1 incident handling end to end: detection, triage, RTO targets, runbooks, stakeholder communication and escalation.
  - Distinguishing infrastructure issues from application issues using metrics and logs.
  - Systematic triage for app-down and performance degradation: DNS, load balancer, TLS/SSL, security groups, service logs, API latency, caching, database response, CDN.
• Cloud cost optimisation (FinOps):
  - Proven cost reduction with measurable outcomes. Right-sizing, GPU/VM SKU selection, log-volume control, storage tiering and compression, data-transfer optimisation.
  - Budget alerts, monthly cost reviews, forecasting, and cost allocation / chargeback to resource owners.
• Security and compliance:
  - Defender for Cloud, perimeter and host security (firewall, EDR), VPN / private networking, network security groups, secrets management (Azure Key Vault / external secrets), RBAC, TLS/SSL lifecycle.
  - Hands-on with compliance and audit: PCI DSS, SOC 2, ISO 27001. Ability to design proactively to audit requirements rather than reacting to findings, and to align security posture with InfoSec teams with sound justification.
• Architecture and design:
  - Able to design a greenfield deployment end to end (network, security, deployment architecture) for a new application, including AI/ML workloads needing GPU instances, and walk senior stakeholders through the trade-offs.
• Observability:
  - Prometheus, Grafana, sidecar-based monitoring, centralized logging, alerting.
Soft skills (essential for this role)
- Clear, concise, structured communication.
- Genuine depth of understanding across the tools and services listed - able to explain decisions in detail.
- Intellectual honesty about what you have and have not done.
- Composure when discussing complex architecture and security with senior technical and InfoSec stakeholders.
- Client-facing maturity suitable for embedding in a large enterprise’s operations team.
Good to have
- Data pipeline experience (Blob -> Snowflake or similar).
- Retail / BFSI / regulated-domain exposure.
- Team leadership or mentoring of junior DevOps engineers.
- Scripting (Python / Bash) for automation and drift detection.
About the company
Pace Wisdom Solutions is a deep-tech product engineering and consulting firm with offices in San Francisco, Bengaluru, and Singapore. We specialize in designing and developing bespoke software solutions that solve niche business problems.
We engage with our clients at various stages:
- Scoping business requirements, starting from the idea stage.
- Designing and architecting the right solution and defining tangible milestones.
- Setting up dedicated and on-demand tech teams for agile delivery.
- Taking accountability for successful deployments to ensure efficient go-to-market implementations.
Pace Wisdom has been working with Fortune 500 enterprises and growth-stage startups and SMEs since 2012. We also work as an extended tech team for our clients, and at times we have played the role of a virtual CTO. We believe in building lasting relationships that go beyond business and in adding value in every engagement.