Problem and Incident Management is a paired IT operations discipline. Incident Management restores normal service as quickly as possible after a disruption (reactive). Problem Management identifies and permanently eliminates the root causes of incidents to prevent recurrence (proactive). Together they form a closed-loop system: incidents generate data that feeds problem investigations, and problem resolutions reduce future incident volume. Within the ISACA CISA framework (Domain 4: Information Systems Operations and Business Resilience), this competency governs how organizations detect, classify, respond to, learn from, and permanently fix service disruptions across all IT systems supporting business operations.
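The closed loop described above can be sketched in a few lines: recurring incidents are grouped, and any group that repeats often enough becomes a candidate for a problem investigation. This is only an illustration, not a method prescribed by ISACA. The `Incident` fields, the `(service, symptom)` grouping key, and the threshold of three recurrences are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    incident_id: str
    service: str   # affected service (hypothetical field)
    symptom: str   # normalized symptom label (hypothetical field)


def problem_candidates(incidents, threshold=3):
    """Return (service, symptom) pairs that recur at least `threshold` times.

    Each returned pair is a candidate for a problem investigation (RCA).
    The default threshold is an illustrative assumption, not a standard value.
    """
    counts = Counter((i.service, i.symptom) for i in incidents)
    return sorted(key for key, n in counts.items() if n >= threshold)


incidents = [
    Incident("INC-1", "payments", "db-timeout"),
    Incident("INC-2", "payments", "db-timeout"),
    Incident("INC-3", "email", "queue-backlog"),
    Incident("INC-4", "payments", "db-timeout"),
]
print(problem_candidates(incidents))  # [('payments', 'db-timeout')]
```

In practice the grouping key would more likely come from configuration items or categorization fields in the ticketing system; the point here is only that incident records are the input to problem identification.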
Where it stops · what it isn't
- IS: Incident Management, the end-to-end lifecycle of a service disruption from detection to closure, including triage, escalation, workaround, and resolution
- IS: Problem Management, covering root cause analysis (RCA), Known Error Database (KEDB) maintenance, permanent fix identification, and change request initiation to prevent recurrence
- IS: Post-Incident Review (PIR) / blameless post-mortem, a structured retrospective to extract learning and prevent repeat incidents
- IS: Incident prioritization and severity classification frameworks (P1/P2/P3 or Critical/Major/Minor) aligned to Business Impact Analysis
- IS: Stakeholder communication protocols and escalation paths during active incidents
- IS NOT: IT Change Management. Problem Management initiates change requests, but the change itself (CAB approval, deployment) is a separate process
- IS NOT: IT Service Level Management. SLAs define the performance targets Incident Management must meet; SLA design and negotiation is a separate competency
- IS NOT: Business Continuity / Disaster Recovery. BCP/DR addresses catastrophic, extended outages; Incident Management handles routine operational disruptions
- IS NOT: Security Incident Response. Cybersecurity incident response (forensics, containment, legal notification) is a distinct specialization governed by separate frameworks (NIST SP 800-61), though overlap exists for security-related operational incidents
- IS NOT: Monitoring and alerting tooling configuration. These are enabling capabilities, not the management process itself
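The severity classification item in the list above (P1/P2/P3 aligned to Business Impact Analysis) is often implemented as an impact-and-urgency matrix. The sketch below shows one possible shape for it. The three-level scale, the score thresholds, and the `BIA_IMPACT` table with its service names are hypothetical; a real organization would derive them from its own BIA.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical BIA output: business impact rating per service.
BIA_IMPACT = {
    "payments": Level.HIGH,
    "email": Level.MEDIUM,
    "intranet-wiki": Level.LOW,
}


def classify_priority(impact: Level, urgency: Level) -> str:
    """Map impact and urgency to P1/P2/P3 (thresholds are assumptions)."""
    score = impact + urgency  # ranges from 2 to 6
    if score >= 5:
        return "P1"
    if score == 4:
        return "P2"
    return "P3"


def priority_for(service: str, urgency: Level) -> str:
    """Look up impact from the BIA table; unknown services default to MEDIUM."""
    impact = BIA_IMPACT.get(service, Level.MEDIUM)
    return classify_priority(impact, urgency)


print(priority_for("payments", Level.HIGH))      # P1
print(priority_for("email", Level.MEDIUM))       # P2
print(priority_for("intranet-wiki", Level.LOW))  # P3
```

Keeping the BIA ratings in a separate table from the scoring rule means that a BIA refresh changes priorities without touching the classification logic.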
Connected concepts in the graph
Every cubelet sits in a knowledge graph. Here's what this one connects to.
PART OF: Information Systems Operations and Business Resilience (ISACA CISA Domain 4)
REQUIRES: IT Change, Configuration, and Patch Management (permanent fixes must follow CAB process); Business Impact Analysis (BIA informs incident prioritization and severity classification)
ENABLES: Systems Availability and Capacity Management (incident data informs capacity and availability planning); System and Operational Resilience (mature incident and problem management strengthens resilience posture)
RELATED TO: IT Service Level Management (SLAs define response time targets for Incident Management); IT Asset and Configuration Management (CMDB provides asset context for incident triage)
CONSTRAINS: Post-Incident Review process (blameless post-mortem standards govern how retrospectives are conducted)