Druva, the autonomous data security company, puts data security on autopilot with a 100% SaaS, fully managed platform to secure and recover data from all threats. The Druva Data Security Cloud ensures the availability, confidentiality, and fidelity of data - providing customers with autonomous protection, rapid incident response, and guaranteed data recovery. The company is trusted by its more than 6,000 customers, including 65 of the Fortune 500, to defend business data in today’s ever-connected world. Amidst a rapidly evolving security landscape, Druva offers a $10 million Data Resiliency Guarantee ensuring customer data is protected and secured against every cyber threat.
We are seeking an exceptional Sr Cloud Operations Engineer as we enhance the support model for our SaaS platform. If you are eager to work in a fast-paced, large, and complex environment built on new technologies, care about ensuring cloud uptime, and enjoy being a team player who works effectively with other members of a global team, this position might be for you.
Roles and Responsibilities:
- Champion the resolution of issues affecting production system availability across the large microservices environment on AWS that hosts Druva's SaaS platform services.
- Develop and maintain the overall cloud architecture, ensuring it aligns with the company's long-term goals and scaling requirements.
- Define and enforce best practices, standards, and procedures for cloud infrastructure provisioning, deployment, and management.
- Mentor and guide the CloudOps team in adopting advanced AWS services, features, and best practices.
- Collaborate with cross-functional teams, including DevOps, Security, Escalations, and Engineering, to ensure alignment and seamless integration of cloud operations with other aspects of the business.
- Lead the implementation and maintenance of advanced monitoring, logging, and alerting systems for proactive detection and resolution of infrastructure issues.
- Oversee and optimize the incident management process, ensuring efficient and effective response to production incidents.
- Collaborate with the security team to ensure compliance with security policies and industry standards.
- Drive the implementation of advanced cost optimization and resource utilization strategies to maximize efficiency and cost-effectiveness.
- Represent the CloudOps team in architectural and technical decision-making forums, ensuring the team's interests and concerns are addressed.
- Stay abreast of emerging technologies, trends, and best practices in cloud computing and evangelize their adoption within the organization.
- Contribute to the development and implementation of automation scripts, tools, and frameworks to streamline cloud operations and enhance team productivity.
- Foster a culture of continuous learning, knowledge sharing, and mentorship within the CloudOps team.
Must Have Skills:
- Strong working experience with Linux (e.g., Red Hat, CentOS) environments.
- Experience developing and implementing automation solutions, such as scripts that run specific tasks on systems, to streamline processes, reduce repetitive tasks, and eliminate human error.
- Strong knowledge of network infrastructure.
- Experience with cloud providers such as AWS, Google Cloud Platform, or Microsoft Azure.
- Strong knowledge of Docker containerization concepts, along with experience with the Docker CLI.
Good To Have Skills:
- SQL / Database knowledge.
- Experience with application performance monitoring and alerting tools such as Grafana, Splunk, PagerDuty, and CloudWatch.
- Knowledge of cloud security best practices and experience implementing security measures.
- Cloud certification such as AWS Certified Solutions Architect – Professional.
Additional Information:
- Troubleshoot and resolve issues related to cloud infrastructure, ensuring high availability and performance.
- Must possess the ability to work independently in a fast-paced, dynamic environment.
- Must possess strong analytical and technical documentation skills.
- Must possess the ability to effectively present information and respond to questions.
- Ability to learn new technologies quickly with minimum support and guidance.
- Ability to think outside the box to generate creative solutions to problems.