Incident Report Templates
Incident reports are essential for engineering teams to document, analyze, and learn from production issues. A good incident report template standardizes how your team captures what happened, why it happened, and what you'll do to prevent it from happening again. Below you'll find free, copy-paste incident report templates in markdown, including a basic incident report template, a post-incident review, an incident log template, a sample incident report template for security events, and more. Each template follows industry best practices from Google SRE, PagerDuty, and Atlassian, and works with GitHub, Jira, Confluence, or any markdown-compatible tool.
Basic Incident Report Template
A quick-fill template for any type of incident. Covers all essential fields in a straightforward format.
# Incident Report
## Incident Details
| Field | Value |
|-------|-------|
| **Incident ID** | INC-XXXX |
| **Date/Time Detected** | YYYY-MM-DD HH:MM (UTC) |
| **Date/Time Resolved** | YYYY-MM-DD HH:MM (UTC) |
| **Duration** | X hours Y minutes |
| **Reported By** | @name |
| **Severity** | P1 / P2 / P3 / P4 |
| **Status** | Investigating / Mitigated / Resolved |
## Summary
_One or two sentences describing what happened._
## Affected Systems
- [ ] Service / system name
- [ ] Service / system name
## Impact
- **Users affected:** _Number or percentage_
- **Revenue impact:** _Estimated or N/A_
- **SLA breach:** Yes / No
## Timeline
| Time (UTC) | Event |
|------------|-------|
| HH:MM | Issue first detected |
| HH:MM | Alert fired / team notified |
| HH:MM | Investigation started |
| HH:MM | Root cause identified |
| HH:MM | Fix deployed |
| HH:MM | Incident resolved |
## Root Cause
_Describe the underlying cause of the incident._
## Resolution
_Describe how the incident was resolved._
## Action Items
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| Action item 1 | @owner | YYYY-MM-DD | Open |
| Action item 2 | @owner | YYYY-MM-DD | Open |
## Lessons Learned
- _What went well?_
- _What could be improved?_
- _What will we do differently next time?_
Post-Incident Review (Postmortem) Template
A comprehensive blameless postmortem template inspired by PagerDuty and Google SRE best practices. Use this after an incident is resolved to conduct a thorough review.
# Post-Incident Review
## Metadata
| Field | Value |
|-------|-------|
| **Incident ID** | INC-XXXX |
| **Date** | YYYY-MM-DD |
| **Authors** | @author1, @author2 |
| **Status** | Draft / In Review / Complete |
| **Severity** | P1 / P2 / P3 / P4 |
## Executive Summary
_A brief, non-technical summary of the incident, its impact, and outcome. Suitable for sharing with leadership._
## Impact
- **Duration:** X hours Y minutes
- **Users affected:** _Number or percentage_
- **Revenue impact:** _Estimated dollar amount or N/A_
- **Support tickets filed:** _Count_
- **SLA breach:** Yes / No
- **Data loss:** Yes / No
## Detection
- **How was the incident discovered?** _Monitoring alert / Customer report / Team member_
- **Time to detect (TTD):** _Minutes from start to detection_
- **Monitoring gaps:** _Were there alerts that should have fired but didn't?_
## Timeline
| Time (UTC) | Author | Event |
|------------|--------|-------|
| HH:MM | @name | Monitoring alert fired for elevated error rates |
| HH:MM | @name | On-call engineer acknowledged and began investigation |
| HH:MM | @name | Identified root cause as failed database migration |
| HH:MM | @name | Rolled back migration, service recovering |
| HH:MM | @name | All systems nominal, incident resolved |
## Root Cause
_Detailed technical explanation of what caused the incident. Be specific: link to commits, configs, or dashboards where relevant._
## Contributing Factors
- _Factor 1: e.g., Missing integration test for migration path_
- _Factor 2: e.g., Alerting threshold was too high to catch gradual degradation_
- _Factor 3: e.g., Runbook was outdated_
## What Went Well
- _e.g., On-call responded within 5 minutes_
- _e.g., Rollback procedure worked as expected_
- _e.g., Communication to stakeholders was timely_
## What Didn't Go Well
- _e.g., Took 30 minutes to identify the root cause_
- _e.g., No alert fired for the initial degradation_
- _e.g., Runbook didn't cover this failure mode_
## Action Items
| Action | Owner | Due Date | Priority | Status |
|--------|-------|----------|----------|--------|
| Add integration test for migration rollback | @owner | YYYY-MM-DD | High | Open |
| Lower alerting threshold for error rate | @owner | YYYY-MM-DD | High | Open |
| Update runbook with this failure scenario | @owner | YYYY-MM-DD | Medium | Open |
| Schedule blameless review with team | @owner | YYYY-MM-DD | Medium | Open |
## Appendix
- [Link to monitoring dashboard](#)
- [Link to relevant logs](#)
- [Link to related pull requests](#)
- [Link to Slack thread](#)
Incident Log Template
A real-time, timeline-focused log for tracking events during an active incident. Use this as your running record while the incident is in progress.
# Incident Log
## Incident Overview
| Field | Value |
|-------|-------|
| **Incident ID** | INC-XXXX |
| **Status** | Open / Investigating / Mitigated / Resolved |
| **Severity** | P1 / P2 / P3 / P4 |
| **Start Time** | YYYY-MM-DD HH:MM (UTC) |
| **End Time** | YYYY-MM-DD HH:MM (UTC) or _Ongoing_ |
## Roles
| Role | Name |
|------|------|
| **Incident Commander** | @name |
| **Communications Lead** | @name |
| **Operations Lead** | @name |
| **Scribe** | @name |
## Running Timeline
_Update this log in real time as events unfold._
| Time (UTC) | Author | Action / Update |
|------------|--------|-----------------|
| HH:MM | @name | Incident declared – elevated error rates on checkout service |
| HH:MM | @name | Paged on-call SRE team |
| HH:MM | @name | Identified spike in database connection timeouts |
| HH:MM | @name | Scaling up database read replicas |
| HH:MM | @name | Error rates returning to normal |
| HH:MM | @name | Incident resolved – monitoring for recurrence |
## Current Status
_What is the latest known state? Updated as the incident progresses._
- **Impacted services:** _List_
- **Current mitigation:** _Describe_
- **Estimated time to resolution:** _If known_
## Communication Log
| Time (UTC) | Channel | Message |
|------------|---------|---------|
| HH:MM | #incidents Slack | Incident declared, investigation underway |
| HH:MM | Status page | Updated status to "Degraded Performance" |
| HH:MM | Email to customers | Notified affected customers of service disruption |
| HH:MM | #incidents Slack | Incident resolved, all systems operational |
## Escalation History
| Time (UTC) | Escalated To | Reason |
|------------|-------------|--------|
| HH:MM | @team-lead | P1 severity requires management awareness |
| HH:MM | @database-team | Need DBA expertise for connection pool issues |
## Post-Incident
- [ ] Hand off to post-incident review owner
- [ ] Schedule postmortem meeting
- [ ] Update status page to "Resolved"
- [ ] Notify stakeholders of resolution
Security Incident Report Template
A security-specific incident report with breach assessment, data exposure fields, and compliance notification requirements.
# Security Incident Report
> **Classification: Confidential**
## Incident Details
| Field | Value |
|-------|-------|
| **Incident ID** | SEC-XXXX |
| **Date/Time Detected** | YYYY-MM-DD HH:MM (UTC) |
| **Date/Time Contained** | YYYY-MM-DD HH:MM (UTC) |
| **Reported By** | @name |
| **Severity** | Critical / High / Medium / Low |
| **Status** | Detected / Contained / Eradicated / Recovered |
## Summary
_Brief description of the security incident._
## Attack Vector
- **Type:** _e.g., Phishing, SQL Injection, Credential Stuffing, Insider Threat, Supply Chain_
- **Entry point:** _e.g., Public API endpoint, employee email, third-party dependency_
- **Indicators of compromise (IOCs):** _IP addresses, hashes, domains, etc._
## Data Exposure Assessment
| Question | Answer |
|----------|--------|
| Was personal data accessed? | Yes / No / Unknown |
| Was personal data exfiltrated? | Yes / No / Unknown |
| Type of data exposed | _e.g., emails, passwords, payment info, PII_ |
| Number of records affected | _Count or estimate_ |
| Data encrypted at rest? | Yes / No |
| Data encrypted in transit? | Yes / No |
## Affected Systems
- [ ] System / service name – _description of impact_
- [ ] System / service name – _description of impact_
## Affected Users / Data
- **Internal users affected:** _Count_
- **External users affected:** _Count_
- **Data categories:** _e.g., PII, financial, health, credentials_
## Timeline
| Time (UTC) | Event |
|------------|-------|
| HH:MM | Suspicious activity first observed |
| HH:MM | Security team alerted |
| HH:MM | Investigation initiated |
| HH:MM | Threat contained |
| HH:MM | Eradication complete |
| HH:MM | Recovery complete |
## Containment Actions
- [ ] _e.g., Blocked malicious IP addresses_
- [ ] _e.g., Disabled compromised accounts_
- [ ] _e.g., Isolated affected systems from network_
## Eradication Steps
- [ ] _e.g., Patched vulnerability_
- [ ] _e.g., Rotated all affected credentials_
- [ ] _e.g., Removed malicious code / backdoor_
## Recovery Steps
- [ ] _e.g., Restored systems from clean backups_
- [ ] _e.g., Re-enabled services with enhanced monitoring_
- [ ] _e.g., Verified system integrity_
## Evidence Preserved
| Evidence Type | Location | Collected By |
|--------------|----------|-------------|
| Server logs | _path/location_ | @name |
| Network captures | _path/location_ | @name |
| Disk images | _path/location_ | @name |
## Notification Requirements
| Requirement | Deadline | Status |
|-------------|----------|--------|
| Internal leadership notification | Within 1 hour | Done / Pending |
| Legal team notification | Within 4 hours | Done / Pending |
| Regulatory notification (GDPR, etc.) | Within 72 hours | Done / Pending |
| Affected user notification | Per legal guidance | Done / Pending |
## Root Cause
_Detailed explanation of how the security incident occurred._
## Preventive Measures
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| Preventive action 1 | @owner | YYYY-MM-DD | Open |
| Preventive action 2 | @owner | YYYY-MM-DD | Open |
Root Cause Analysis (5 Whys) Template
A deep-dive root cause analysis template using the 5 Whys methodology. Best for incidents where the surface cause is known but the underlying root cause needs systematic investigation.
# Root Cause Analysis – 5 Whys
## Incident Reference
| Field | Value |
|-------|-------|
| **Incident ID** | INC-XXXX |
| **Date of Incident** | YYYY-MM-DD |
| **RCA Author** | @name |
| **RCA Date** | YYYY-MM-DD |
## Problem Statement
_Clearly define the problem. Be specific about what happened, when, and what the impact was._
> Example: "On 2025-03-15 at 14:32 UTC, the checkout service returned 500 errors for 45 minutes, affecting approximately 12,000 users and resulting in an estimated $50,000 in lost revenue."
## The 5 Whys
### Why #1: Why did the problem occur?
_Answer:_
---
### Why #2: Why did [answer to Why #1] happen?
_Answer:_
---
### Why #3: Why did [answer to Why #2] happen?
_Answer:_
---
### Why #4: Why did [answer to Why #3] happen?
_Answer:_
---
### Why #5: Why did [answer to Why #4] happen?
_Answer:_
---
## True Root Cause
_Based on the 5 Whys analysis, state the true root cause._
## Corrective Actions
_Actions to fix the immediate issue and address the root cause._
| Action | Owner | Due Date | Priority | Status |
|--------|-------|----------|----------|--------|
| Corrective action 1 | @owner | YYYY-MM-DD | High | Open |
| Corrective action 2 | @owner | YYYY-MM-DD | Medium | Open |
## Preventive Actions
_Actions to prevent this class of problem from happening again._
| Action | Owner | Due Date | Priority | Status |
|--------|-------|----------|----------|--------|
| Preventive action 1 | @owner | YYYY-MM-DD | High | Open |
| Preventive action 2 | @owner | YYYY-MM-DD | Medium | Open |
## Verification Plan
_How will you verify that the corrective and preventive actions are effective?_
- [ ] _e.g., Run chaos engineering test to simulate the failure mode_
- [ ] _e.g., Verify new alert fires within 2 minutes of threshold breach_
- [ ] _e.g., Review in 30 days to confirm no recurrence_
Lightweight Incident Report Template
A minimal incident report template designed for startups and small teams. Covers the essentials without overhead.
# Incident Report
**Date:** YYYY-MM-DD
**Reported by:** @name
**Severity:** P1 / P2 / P3 / P4
## What happened?
_Describe the incident in plain language. What did users experience?_
## When did it happen?
- **Started:** YYYY-MM-DD HH:MM (UTC)
- **Detected:** YYYY-MM-DD HH:MM (UTC)
- **Resolved:** YYYY-MM-DD HH:MM (UTC)
- **Total duration:** X hours Y minutes
## Who was affected?
_How many users were impacted? Which services or features were degraded?_
## How bad was it?
| Severity | Definition |
|----------|-----------|
| **P1 – Critical** | Complete outage or data loss |
| **P2 – Major** | Significant feature unavailable, workaround exists |
| **P3 – Minor** | Small number of users affected, low impact |
| **P4 – Low** | Cosmetic issue or minor inconvenience |
**This incident was a:** P1 / P2 / P3 / P4
## What caused it?
_Describe the root cause. Link to the relevant commit, deploy, or config change if applicable._
## How did we fix it?
_Describe the resolution. What was the fix? Was it a rollback, hotfix, or config change?_
## What will we do to prevent it?
- [ ] Action item – **Owner:** @name – **Due:** YYYY-MM-DD
- [ ] Action item – **Owner:** @name – **Due:** YYYY-MM-DD
- [ ] Action item – **Owner:** @name – **Due:** YYYY-MM-DD
What Is an Incident Report Template?
An incident report template is a predefined document structure that helps engineering teams consistently record the details of a production incident. Instead of starting from scratch every time something breaks, a template gives your team a clear framework to capture what happened, when it happened, who was involved, what the impact was, and what you'll do to prevent it from recurring.
Standardized incident reports improve your team's response process in several ways: they reduce the cognitive overhead during stressful outages, ensure critical details aren't forgotten, and create a searchable archive of past incidents that makes pattern recognition easier. Over time, this archive becomes one of your most valuable engineering resources: a living record of how your systems fail and how your team learns.
Most engineering organizations use different templates for different purposes: an incident log for real-time tracking during the incident, an incident report or postmortem for the post-incident analysis, and sometimes a specialized template for security events or root cause analysis.
How to Write an Incident Report
Writing an effective incident report is a skill that improves with practice. Follow these steps to produce a thorough, actionable report:
1. **Document immediately.** Start capturing details as soon as the incident is detected. Use an incident log template to record events in real time; memory fades quickly, and timestamps become unreliable after the fact.
2. **Classify the severity.** Assign a severity level (P1 through P4) based on user impact, scope, and urgency. This determines the response cadence and who needs to be notified.
3. **Build a detailed timeline.** Reconstruct the sequence of events from detection through resolution. Include timestamps, who took each action, and what the result was. Pull from monitoring tools, chat logs, and deploy histories.
4. **Identify the root cause.** Go beyond the surface symptoms to find the underlying cause. Techniques like the 5 Whys can help you dig deeper. Be specific: "human error" is never a root cause.
5. **Define concrete action items.** Every action item should have an owner, a due date, and a clear definition of done. Avoid vague actions like "improve monitoring"; instead write "add alert for error rate above 5% on checkout service."
6. **Share with the team.** Circulate the report to all stakeholders. The goal is organizational learning, not blame. A blameless culture encourages honest reporting and leads to better prevention.
7. **Follow up on actions.** Schedule a review 2-4 weeks after the incident to verify that all action items have been completed. Incomplete follow-through is the most common failure mode of incident management.
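The timeline arithmetic behind these steps (total duration, time to detect) is easy to get wrong by hand. As a minimal sketch, assuming the `YYYY-MM-DD HH:MM` UTC format used in the templates above (the function and field names are illustrative, not part of any standard):

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%d %H:%M"  # matches the template's "YYYY-MM-DD HH:MM (UTC)" fields

def incident_metrics(started: str, detected: str, resolved: str) -> dict:
    """Return time-to-detect and total duration, in minutes, from UTC timestamps."""
    parse = lambda s: datetime.strptime(s, TS_FORMAT).replace(tzinfo=timezone.utc)
    t0, td, tr = parse(started), parse(detected), parse(resolved)
    return {
        "time_to_detect_min": int((td - t0).total_seconds() // 60),
        "duration_min": int((tr - t0).total_seconds() // 60),
    }

print(incident_metrics("2025-03-15 14:32", "2025-03-15 14:40", "2025-03-15 15:17"))
# {'time_to_detect_min': 8, 'duration_min': 45}
```

Computed values like these can be pasted straight into the Duration and TTD fields of the templates.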
Incident Severity Levels (P1-P4)
A consistent severity classification helps your team prioritize response efforts and set expectations for stakeholders. Here is a common framework used by engineering organizations:
| Severity | Definition | Examples | Response Time | Communication |
|---|---|---|---|---|
| P1 – Critical | Complete outage or data loss affecting all users | Site down, data breach, payment processing failure | Immediate (within 15 min) | All-hands page, exec notification, status page update |
| P2 – Major | Significant feature unavailable, workaround may exist | Search broken, uploads failing, degraded performance | Within 30 minutes | On-call team paged, status page update |
| P3 – Minor | Small number of users affected, limited impact | Single region degraded, non-critical feature broken | Within 4 hours | Team Slack notification |
| P4 – Low | Cosmetic issue or minor inconvenience | UI glitch, slow non-critical page, minor logging error | Next business day | Ticket created, addressed in normal sprint |
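One practical use of a severity table like this is encoding it so tooling can flag missed response windows. A minimal sketch, assuming the response times shown above and treating "next business day" as roughly 1440 minutes (a simplifying assumption):

```python
# Response windows in minutes, mirroring the severity table above.
# The P4 value (next business day ~ 1440 min) is an approximation.
RESPONSE_SLA_MIN = {"P1": 15, "P2": 30, "P3": 240, "P4": 1440}

def ack_within_sla(severity: str, minutes_to_ack: int) -> bool:
    """True if on-call acknowledgment met the severity's response window."""
    return minutes_to_ack <= RESPONSE_SLA_MIN[severity]

print(ack_within_sla("P1", 12))   # True: within the 15-minute window
print(ack_within_sla("P2", 45))   # False: missed the 30-minute window
```

A check like this makes "Response Time" a measurable field in the postmortem rather than an aspiration.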
Incident Report vs Incident Log
Teams often confuse incident reports with incident logs, but they serve different purposes at different stages of the incident lifecycle:
| | Incident Log | Incident Report (Postmortem) |
|---|---|---|
| When | During the incident | After the incident is resolved |
| Purpose | Real-time coordination and tracking | Post-incident analysis and learning |
| Author | Scribe or incident commander (updates continuously) | Incident owner (written after resolution) |
| Content | Running timeline, status updates, decisions made | Root cause, impact analysis, action items, lessons learned |
| Tone | Quick, factual, shorthand | Detailed, reflective, analytical |
Best practice is to use both: start the incident log when the incident is declared, then use it as the primary input for writing the post-incident report. The Incident Log Template and Post-Incident Review Template above are designed to work together.
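Because the two templates share the same `| Time (UTC) | Author | ... |` timeline layout, the log's rows can be carried straight into the postmortem. This illustrative parser (a sketch; real logs may need escaping and column validation) turns markdown timeline rows into structured events:

```python
def parse_timeline(markdown_rows: list[str]) -> list[dict]:
    """Extract events from a 3-column markdown timeline table.

    Skips the header row and the |---| separator row; everything else
    is treated as an event. Illustrative only.
    """
    events = []
    for row in markdown_rows:
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        if len(cells) == 3 and cells[0] != "Time (UTC)" and not set(cells[0]) <= {"-"}:
            events.append({"time": cells[0], "author": cells[1], "event": cells[2]})
    return events

log = [
    "| Time (UTC) | Author | Action / Update |",
    "|------------|--------|-----------------|",
    "| 14:32 | @alice | Incident declared |",
    "| 15:17 | @alice | Incident resolved |",
]
print(parse_timeline(log))
```

From the structured events, the postmortem's timeline table can be regenerated verbatim, so the two documents never drift apart.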
Linking Incidents to Pull Requests
A large proportion of production incidents trace back to code changes: a merged pull request that introduced a bug, a configuration update that wasn't tested, or a dependency upgrade with breaking changes. Connecting your incident reports to the specific PRs that caused them dramatically improves your ability to understand what went wrong and why.
When your team can quickly trace an incident to the exact pull request that triggered it, root cause analysis becomes faster and more accurate. Over time, this linkage also reveals patterns: which types of changes are riskiest, which services are most fragile, and where your code review process has gaps.
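The correlation step can be sketched mechanically: given PR records like those returned by GitHub's "list pull requests" REST endpoint (`number`, `title`, and `merged_at` match the real schema; the two-hour lookback window is an arbitrary assumption), filter for PRs merged shortly before the incident started:

```python
from datetime import datetime

def suspect_prs(prs: list[dict], incident_start: str, lookback_hours: int = 2) -> list[dict]:
    """Return PRs merged within `lookback_hours` before the incident start.

    Illustrative sketch: assumes ISO-8601 timestamps; unmerged PRs
    (merged_at is None) are skipped.
    """
    start = datetime.fromisoformat(incident_start)
    def merged_in_window(pr):
        if not pr.get("merged_at"):
            return False
        merged = datetime.fromisoformat(pr["merged_at"].replace("Z", "+00:00"))
        return 0 <= (start - merged).total_seconds() <= lookback_hours * 3600
    return [pr for pr in prs if merged_in_window(pr)]

prs = [
    {"number": 481, "title": "Bump connection pool size", "merged_at": "2025-03-15T13:50:00Z"},
    {"number": 479, "title": "Docs update", "merged_at": "2025-03-14T09:00:00Z"},
    {"number": 482, "title": "WIP refactor", "merged_at": None},
]
print([pr["number"] for pr in suspect_prs(prs, "2025-03-15T14:32:00+00:00")])
# [481]
```

The surviving candidates are exactly what belongs in the postmortem's "link to related pull requests" appendix.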
This is where PR visibility tools become critical. If your team uses Slack for incident coordination, getting real-time notifications about pull request activity (merges, reviews, CI status) means you can correlate deploys with incidents as they happen, not hours later during the postmortem. PullNotifier sends GitHub pull request notifications directly to Slack, giving your team instant visibility into code changes that may be causing production issues.
FAQ
How quickly should an incident report be completed?
Aim to complete the initial incident report within 24-48 hours of resolution for P1 and P2 incidents. For lower-severity incidents, within one week is acceptable. The sooner you write the report, the more accurate the details will be. Start with the incident log you maintained during the event and expand from there.
What's the difference between an incident log and an incident report?
An incident log is a real-time running record maintained during the incident; it captures timestamps, actions, and decisions as they happen. An incident report (or postmortem) is written after the incident is resolved and provides a more reflective analysis including root cause, impact assessment, and preventive action items. See the comparison section above for a detailed breakdown.
Should incident reports be blameless?
Yes. Blameless postmortems are a cornerstone of effective incident management. When people fear blame, they hide information and avoid reporting incidents, which makes your systems less safe, not more. Focus on systemic causes and process improvements rather than individual mistakes. The question should always be "how did our systems allow this to happen?" not "who caused this?"
What severity level is a P1 incident?
A P1 (Priority 1) incident is the most severe classification, indicating a critical issue with widespread user impact. This typically means a complete service outage, data loss, a security breach, or a failure in a revenue-critical system. P1 incidents require immediate response, all-hands mobilization, and executive communication. See the severity levels guide for the full P1-P4 framework.
Who should write the incident report?
The incident report is typically written by the incident owner or commander, the person who led the response. However, it should be a collaborative effort reviewed by everyone involved. The author compiles the timeline, analysis, and action items, then shares a draft for team feedback before finalizing. Some organizations rotate this responsibility to build the skill across the team.
Can I use these templates with GitHub, Jira, or Confluence?
Yes. All templates on this page are written in standard markdown, which is natively supported by GitHub Issues, GitHub Discussions, Jira (with markdown mode), Confluence, Notion, Linear, and most other project management tools. Simply copy any template and paste it into your tool of choice.
