Job Summary
Our partner combines global reach with local expertise. They deliver local services and operate Service Centers and Integration Centers across Europe, South Africa, Asia, and the Americas, supporting users in over 70 countries.
What they do:
They help their customers to source, transform and manage their IT solutions and infrastructure to deliver digital transformation, enabling users & their businesses.
Their Ambition:
Strongly recommended by customers for the way they help them achieve their goals;
The preferred route to market for technology providers;
People want to join them and stay with them, proud of their reputation, as they learn, earn and have fun;
Trusted as an agile & innovative provider of digital technology around the world.
Job Description:
Our Cloud Services (CS) team is looking for a Private Cloud & Container Platform SRE Lead to lead a team of SREs, Operations Engineers and third-party providers, ensuring that our Private Cloud operates effectively, improves constantly, is automated as far as possible, and that toil is eliminated over time. You’ll define service level indicators (SLIs) and service level objectives (SLOs) for the platform, lead the team to meet or exceed them, and manage service level agreements (SLAs).
The role-holder will provide a voice to Private Cloud consumers, identifying and supporting the resolution of impediments and issues as well as providing transparency of service status, metrics, and performance.
This is a leadership-level role that blends deep domain and engineering expertise with a passion for coaching and developing people in a “player-coach” model, while continually developing yourself.
You’ll provide feedback into the Product Owner backlogs, and therefore into the development of the product roadmap, in order to improve the public cloud service. You will also own the SRE-related stories in that backlog, with a particular focus on remediating operational risk.
What you’d get involved with:
Owning overall operations for our Private Cloud, including incident, problem and availability management, on-call, manual intervention, and removal of toil
Driving operational standards, developing the service levels and controls to support better consumer outcomes
Leading the teams that provide incident support and acting as the point of escalation for service incidents
Partnering with the Engineering Lead to assign the SREs to the engineering scrums
Defining and prioritizing requirements for each cloud platform and the Operations Scrum teams
Supporting consumers through the Production Readiness Review quality gate
Implementing service dashboards and reporting using native tooling wherever possible
Implementing improvements to the service levels and controls of the cloud platforms
Managing the relationship with, and performance of, Cloud Service Providers through service reviews, etc.
What’s needed to be considered for this job?
Experience managing an enterprise-scale IT service in Financial Services, other large enterprises, or highly regulated environments.
Experience using DevOps tooling and associated delivery/support processes.
Understanding of agile delivery methodologies and DevOps, with experience managing a squad within that framework.
Essential:
Must have leadership experience, ideally including line management and developing teams to excel and increase productivity
Experience working with a broad set of Cloud platforms and/or Private Cloud services and products
Experience in leading operational 24x7 support services, including incident management and defining and maintaining service levels
Managing third-party suppliers that deliver critical services
Delivering consumer outcomes to support greater cloud adoption
Strong negotiation skills to influence technical and leadership decisions, achieving the right consumer outcomes and meeting operational needs
A solid understanding of cloud security in regulated environments
Experience in managing risks and controls across technical platforms
Desirable:
Experience developing metrics to demonstrate performance improvement – SLIs, SLOs, etc.
Experience implementing dashboards for multi-cloud management