The Incident and Problem Management (IPM) Specialist plays a key role in ensuring that incidents and problems are properly addressed and that events are accurately documented from the onset of a major incident through restoration of service. The specialist focuses on ensuring that proper root cause analysis occurs after every major incident and on managing and resolving issues raised by the IT Service Operations team. The position requires frequent interaction with other sections of the organization and with third-party providers. The specialist must be able to build relationships quickly and drive escalations in high-pressure scenarios. After an incident, the IPM Specialist provides input into processes and planning to help drive internal improvements.
Duties and Responsibilities
- Initiate and lead the IT Governance Critical Incident Management calls/tickets, gathering required resources to remediate the issue as quickly as possible.
- Provide business impact updates to stakeholders and leadership as required (in verbal and written form).
- Prepare status updates and incident reports.
- Capture incident details and update all necessary tools and documents (e.g., the KEDB).
- Provide required notifications and updates on all Operational Critical Incidents within established service levels.
- Participate in Post Mortem meetings and work with IT Service Operations and Change Management (CM) Specialists to drive technical teams to define root cause.
- Coordinate with support supervisors, technical experts and developers to ensure swift resolution of Operational Critical Incidents.
- Ensure that Incident and Problem Management KPIs are recorded and their targets are met.
- Ensure that the detection, initial diagnosis and prioritization of all incidents are effectively and consistently applied.
- In cooperation with IT Service Operations and/or CM Specialists, conduct post-incident analysis and ensure that the root cause of each incident is accurately captured and that appropriate preventive actions are identified and tracked.
- Make recommendations for the service process improvement plan.
- Design and continually improve processes and metrics.
- Work closely with CM Specialists on major incidents to correlate them to requested changes.
- Proactively identify and remediate major incident causes (major incident avoidance).
- Participate in Incident Management process improvements.
- Document incident details for input into Problem Management's root cause analysis process, management reporting metrics and follow-up major incident reviews.
- Use ITIL knowledge to review incidents during audits and ensure that the Incident Management process is being followed.
- Drive problem analysis to conclusion to determine the root cause of any Severity 1 or High Impact incident, as well as of any recurring incident.
- Maintain the knowledge base and/or troubleshooting guides for support.
Skills and Qualifications
- Operational experience within a mission critical environment
- Understanding of Service Level Agreements and their application
- Good analytical skills, structured and methodical approach
- Ability to work independently and make decisions where necessary
- Experience interacting with all levels of management
- Experience participating in Incident Post Mortems and/or Root Cause Analysis investigations
- Ability to meet challenging targets within tight deadlines
- Experience creating and improving processes (primarily Incident Management)
- ITIL Foundation Certified preferred
- Experience in writing process plans and documentation
- Experience in quality testing and in applying technical skills to product/system testing
- Exceptional communication skills, both written and verbal
- Bachelor’s degree in IT Management, Industrial Engineering, or related field