We are seeking a hands-on System Administrator (Apps, DevOps & AI Infrastructure) to manage production systems, deployment pipelines, cloud/server infrastructure, mobile app release operations, and AI-related infrastructure/tooling. The role requires strong operational ownership, security awareness, automation skills, and availability for round-the-clock support and incident response.
The environment includes web and mobile applications, source control and CI/CD workflows, cPanel-based hosting, Linux servers, cloud services, and AI-enabled workloads/services.
Key Responsibilities
- Mobile App Release & Store Operations
  - Manage Android and iOS application release processes for Google Play Console and Apple App Store Connect.
  - Handle signing keys, provisioning profiles, certificates, build distribution, release notes, staged rollouts, and compliance requirements.
  - Coordinate with development teams to troubleshoot submission, review, and deployment issues.
- GitHub & CI/CD Administration
  - Administer GitHub repositories, access controls, branch protection rules, secrets management, and organization policies.
  - Design, maintain, and optimize CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI, Bitbucket Pipelines, or similar).
  - Automate build, test, deployment, rollback, and release workflows.
  - Implement environment promotion strategies (dev → staging → production).
- Server & Hosting Management
  - Provision, secure, monitor, patch, and maintain Linux servers, VPS/cloud instances, containers, and related infrastructure.
  - Manage cPanel/WHM hosting environments, domains, DNS, SSL/TLS certificates, email routing, backups, and resource optimization.
  - Support web applications, APIs, databases, background workers, and scheduled jobs.
  - Perform capacity planning, performance tuning, and disaster recovery preparation/testing.
- AI Infrastructure & Operations
  - Support deployment and operation of AI-enabled services, inference workloads, vector databases, model-serving APIs, and related infrastructure.
  - Manage API keys, service accounts, usage quotas, logging, and cost controls for AI platforms.
  - Coordinate with engineering teams on GPU/cloud resource allocation and environment setup where applicable.
- Monitoring, Security & Reliability
  - Implement centralized logging, metrics, alerting, uptime monitoring, and incident management.
  - Maintain backup/restore procedures and recovery documentation.
  - Enforce least-privilege access, MFA, secrets management, patching, vulnerability remediation, and audit readiness.
  - Participate in post-incident reviews and preventive action planning.
- 24×7 Support & Operational Ownership
  - Participate in 24×7 on-call support rotation for production incidents.
  - Respond to critical outages, deployment failures, store release issues, SSL/DNS problems, and security events within defined SLAs.
  - Provide clear status communication during incidents and maintenance windows.
Required Skills & Experience
- 3–7+ years of experience in systems administration, DevOps, SRE, hosting operations, or an equivalent role.
- Hands-on experience with GitHub administration and CI/CD pipeline implementation.
- Experience deploying and maintaining applications on Linux servers and/or cloud platforms (AWS, Azure, GCP, DigitalOcean, etc.).
- Strong knowledge of cPanel/WHM, DNS, SSL/TLS, backups, email deliverability basics, and web hosting operations.
- Experience with Google Play Console and Apple App Store Connect release workflows.
- Understanding of AI service operations, API integrations, model-serving environments, or AI infrastructure tooling.
- Proficiency in shell scripting and automation (Bash, Python, PowerShell, or similar).
- Familiarity with containers/orchestration (Docker; Kubernetes is a plus).
- Strong troubleshooting, documentation, and communication skills.
Pay: ₹20,000.00 - ₹50,000.00 per month
Benefits:
- Flexible schedule
- Paid sick time
- Paid time off
- Provident Fund
Work Location: Hybrid remote in Lucknow, Uttar Pradesh