Monitor infrastructure performance, capacity, and reliability using Prometheus, Grafana, and related observability tools.
Implement security hardening, compliance controls, and patch management across cloud and on-premises environments.
Automate infrastructure provisioning and configuration using IaC tools such as Terraform and Ansible.
Manage password rotation and credential updates for OpenStack services (Keystone, Nova, Neutron, Glance, Cinder, Horizon), ensuring policy compliance and minimal service disruption.
Plan and execute security patches, hotfixes, and upgrades across OpenStack controller and compute nodes, including validation and rollback readiness.
Perform vulnerability assessment and remediation through security scanning, risk prioritization, and corrective actions.
Configure and support Azure Single Sign-On (SSO) using Microsoft Entra ID (formerly Azure AD), including application onboarding, role mapping, and access management.
Troubleshoot SSO-related issues such as authentication failures, federation problems, certificate renewals, and identity synchronization.
Implement and manage Conditional Access Policies, Multi-Factor Authentication (MFA), and identity governance controls.
Troubleshoot complex infrastructure issues, perform Root Cause Analysis (RCA), and ensure timely incident resolution.
Support Disaster Recovery (DR) and business continuity initiatives.
Maintain technical documentation, including architecture diagrams, automation workflows, and SOPs.
Collaborate with global teams to standardize, automate, and optimize infrastructure operations.
Hands-on experience in password rotation, credential management, patching, and vulnerability remediation across OpenStack and Linux environments.
Experience with Azure SSO administration using Microsoft Entra ID (formerly Azure AD) and authentication protocols such as SAML, OAuth, and OIDC.
Knowledge of Conditional Access, MFA, and identity/access management best practices.
Experience with cloud platforms such as Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP).
Proven expertise in Infrastructure as Code (IaC) using Terraform and Ansible.
Scripting skills in Bash, Python, or PowerShell for automation and orchestration.
Strong Linux administration skills (RHEL preferred), including package management, troubleshooting, and log analysis – Good to have.
Experience with monitoring and logging tools such as Grafana, Prometheus, ELK, and Loki.
Good understanding of networking concepts (DNS, DHCP, routing, firewalls, load balancing) – Good to have.
Familiarity with Docker and Kubernetes – Good to have.
Strong analytical, troubleshooting, problem-solving, and communication skills.