Job Description:
• Lead and mentor a distributed team of DevOps, SRE, and Database engineers.
• Architect and operate secure, scalable, and cost-efficient Azure Cloud environments.
• Implement and optimize CI/CD pipelines, infrastructure as code (IaC), and observability platforms.
• Champion AIOps and AI-driven tooling (e.g., GitHub Copilot, Azure DevOps AI, intelligent alerting) to improve developer productivity and operational efficiency.
• Establish and enforce SRE practices — SLIs/SLOs, incident response, on-call processes, and postmortems.
• Oversee performance, scalability, and reliability of PostgreSQL, MySQL, SQL Server, Azure Cosmos DB, and Redis databases in production.
• Partner cross-functionally with product and engineering teams to align infrastructure with business priorities.
• Drive cost optimization, disaster recovery, and security compliance initiatives.
Requirements:
• 10+ years of experience in DevOps, Infrastructure, or SRE roles, including 3+ years of leadership experience managing multiple teams.
• Deep hands-on expertise with Azure Cloud, including networking, identity, security, and monitoring services.
• Proficiency in Kubernetes, Docker, Terraform, Azure DevOps, and CI/CD ecosystems.
• Proven experience managing relational and NoSQL databases at scale.
• Experience building observability stacks with Prometheus, Grafana, ELK, or Azure Monitor.
• Strong problem-solving, communication, and mentoring skills.
• Track record of integrating AI tools to reduce toil and improve operational insights.
Nice to Have:
• Experience in multi-cloud environments (AWS or GCP).
• Familiarity with AIOps, MLOps, or GenAI-assisted automation.
• Experience working in regulated or enterprise-scale environments (e.g., finance, healthcare).
• Prior success in high-growth startups or scaling SaaS platforms.
Benefits:
• Medical, dental, and vision plans for you and your family
• 401(k) with company match
• Generous flexible PTO program and paid holidays
• Professional development opportunities