Responsible for defining, planning, measuring, and improving all aspects of the availability of IT services, and for ensuring that all IT infrastructure, processes, tools, roles, etc. are appropriate for the agreed availability targets. Will work with a team of service continuity-focused managers to develop a practice of business-aligned goals for our IT services and ensure that those goals are actively met and exceeded. Will be responsible for building technical proposals for improvement initiatives coming out of the Continuity team. Will participate in business planning and strategy sessions, and participate fully in the Change Control process with respect to Service Availability review.
What we look for:
- Degree in Information Technology or related field
- 10+ years in a related technology position (Incident, Problem, or Service Manager, Operations Lead, etc.)
- Demonstrable experience with various monitoring, performance, or capacity tools
- Strong knowledge of Server, Storage, Network, Middleware, Application, and Cloud technologies
- Ability to work with remote colleagues
- Strong knowledge of ITIL processes, particularly Incident and Problem Management
- Ability to communicate at all organizational levels
- Clear and concise business and technical writing
- Self-motivated, with the ability to embrace change and adjust approach
- Strong ability to lead by example and coach the team
What you will do:
- Use knowledge of the technical environment to ensure that services and components are designed and delivered to meet their availability targets
- Provide a holistic view of the environment and make recommendations to improve overall service
- Analyze and define IT Services to meet agreed-upon Service Levels
- Design, develop, and manage appropriate monitoring and reporting of services, and ensure that all IT services are designed and managed to meet agreed-upon availability targets
- Ensure that Availability Testing is reviewed and maintained across our IT Services
- Work with various technology teams to identify and implement lessons learned from each event
- Provide leadership and training on Service Continuity functions to direct reports and peers