edition e1.0.0
Updated October 9, 2023

There are many ways to document these plans; stick with what works for your internal culture and documentation style. Rather than define the document template, we will look at the sections you need to include and why they are important.
As with many of the subjects we have discussed in this book, the fact that something is an incident doesn’t mean the world is ending. Not every security incident is critical, and that’s OK.
Important: Before you dig into the steps you need to take to respond to an incident, define the levels of criticality associated with incidents. As we mentioned when we discussed risk, defining these upfront allows you to prioritize and plan your actions based on likely impact, rather than your emotional response to a stressful situation.
Here is a set of example levels. They may not work for your organization, so take a look at each and see what you need to adopt or adapt.
Each level should have a name, a description, and a definition of the impact the incident is having. This detail makes it easier to determine the level of an incident when it arises.
Level | Description | Example |
---|---|---|
Mission Critical | A serious event impacting large numbers of users for extended periods. This would include a compromise that would cause large-scale financial and reputational damage. Issues should be immediately escalated and addressed as a high priority. Business continuity actions and communications should be made ready. External specialists and law enforcement may need to be involved for security incidents. | • Database compromise. • Entire site outage affecting entire customer base, a large site, or the entire organization for an extended period (24 hours or more). • A critical or high severity vulnerability is made public. |
Business Critical | An incident affecting a large number of customers across a wide range of activities. Issues are not remediated in half a working day (4 hours). For security incidents, this includes high risk vulnerabilities that have a high chance of exploitation (publicly known or received from a third party). Issues should be escalated and addressed. | • Issue affecting a number of customers, or a whole branch. • Private vulnerability disclosure or high potential of coverage in mainstream media. • CVSS 7 or above. |
Business Operational | An incident that affects a small group of customers and may affect their ability to complete activities. The issue is present for a short period of time. Issues should be escalated and prioritized. | • Issue affecting a small number of customers, a whole team, or isolated to a small number of data sets. • CVSS 5 or above issue in the software architecture. • Any issue that can be handled exclusively in working hours. |
Administrative | An incident that causes increased resource usage, mild customer discomfort, or confusion to a very small subset of customers. For security events, this would be a low-level security risk with a low likelihood of being exploited. No immediate action is required. | • Support issue. • Incident affecting only one customer/user or one data set, such as individual compromised accounts. |
Once you have your levels defined, they will become a guide for all incident responders during the initial stages of the incident response process.
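The levels above can be encoded so that different responders reach the same classification from the same facts. The following is a minimal sketch in Python, not a definitive implementation. The `Incident` fields, the `classify` function, and the customer-share thresholds (90% and 25%) are assumptions made for illustration; only the level names and the CVSS cut-offs come from the example table.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Example levels from the table, ordered from least to most severe."""
    ADMINISTRATIVE = 1
    BUSINESS_OPERATIONAL = 2
    BUSINESS_CRITICAL = 3
    MISSION_CRITICAL = 4


@dataclass
class Incident:
    # Every field here is a hypothetical input chosen for illustration.
    customers_affected: int
    total_customers: int
    cvss: Optional[float] = None        # CVSS base score, if one applies
    database_compromised: bool = False
    publicly_disclosed: bool = False    # vulnerability has been made public


def classify(incident: Incident) -> Level:
    """Return the highest level whose example criteria the incident meets."""
    share = incident.customers_affected / max(incident.total_customers, 1)
    cvss = incident.cvss or 0.0

    # Assumed threshold: 90% of customers counts as the "entire customer base".
    if (incident.database_compromised
            or share >= 0.9
            or (cvss >= 7 and incident.publicly_disclosed)):
        return Level.MISSION_CRITICAL
    # Assumed threshold: 25% of customers counts as a "large number".
    if cvss >= 7 or share >= 0.25:
        return Level.BUSINESS_CRITICAL
    if cvss >= 5 or incident.customers_affected > 1:
        return Level.BUSINESS_OPERATIONAL
    # A single affected customer or a low-risk issue.
    return Level.ADMINISTRATIVE
```

Checking the most severe criteria first means an incident that matches several rows is always assigned the highest applicable level. Tune the thresholds to your own organization rather than adopting these numbers as-is.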
As well as knowing how serious an incident is by defining its classification, we also need to define and simplify the roles we each play during incident response. Assigning and defining roles makes sure everyone knows what to do and avoids people all covering the same tasks (or all ignoring them and assuming someone else has it covered).
The following table is a set of typical incident response roles, their aim, and a brief summary of their responsibilities during an incident.
Remember, this definition stage isn’t about perfection, it’s about assigning responsibilities and removing ambiguity.
Role | Description | Responsibilities |
---|---|---|
Incident Response Owner | Owns this incident response plan at management level, along with its associated risks. | • Update and maintain this document. • Arrange for regular tests of this process. |
Incident Lead | Controls and leads activities for a specific incident. | • Lead the incident response team. • Coordinate response activities. • Manage prioritization during incident response. |
Deputy | Supports the Incident Lead and manages communications for a specific incident. | • Manage communications with internal and external stakeholders. • Support the incident lead. |
Scribe | Records incident details for later reference. | • Records the timeline of events during incident response. • Collects evidence to be used during post-incident review (screenshots, copies of log files, etc.). |
Comms Lead | Coordinates communication between team members | • Ensures that all team members are kept appropriately informed during the progress of the incident. • Escalates issues to the IT Manager when required. |
Privacy Officer | Manages privacy issues within your organization | • Must be informed of any incidents that involve a breach of private data. • Will liaise with the Privacy Commissioner if required. |
Incident response and management require a number of coordinated roles to work efficiently. To ensure that your company is able to respond quickly, incident-specific roles such as “incident lead” and “deputy” should be filled by people currently serving on the on-call roster, which is rotated regularly.
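One way to picture filling incident-specific roles from a rotating roster is the short Python sketch below. It is an illustration under stated assumptions: the weekly rotation period, the `assign_roles` function, and the roster names are hypothetical, and only the role names come from the table above.

```python
from typing import Dict, List

# Incident-specific roles filled from the on-call roster (names from the roles table).
ON_CALL_ROLES = ["Incident Lead", "Deputy"]


def assign_roles(roster: List[str], week: int) -> Dict[str, str]:
    """Assign each on-call role to a different person, rotating once per week.

    The weekly period is an assumption; use whatever rotation your team follows.
    """
    if len(roster) < len(ON_CALL_ROLES):
        raise ValueError("roster is too small to give each role a different person")
    start = week % len(roster)
    return {
        role: roster[(start + offset) % len(roster)]
        for offset, role in enumerate(ON_CALL_ROLES)
    }
```

For example, with a hypothetical roster of `["ana", "ben", "chen"]`, week 0 assigns `ana` as Incident Lead and `ben` as Deputy, and the assignment shifts by one person each week. Rejecting a roster that is too small keeps one person from silently holding both roles.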
While we all like to think our companies are unique, we all secretly know that’s not the truth. There is something that makes your organization special, something your customers love, but many bits of how our companies operate are shared with other organizations around the world.
Identifying these common scenarios allows you to plan for them happening. In incident response we would normally create specific playbooks (as we discussed previously when we talked about policy, standards, and processes) to capture the specific actions our team needs to take if such an incident arises.
Here is a list of the most common scenarios. Feel free to use these as a suggested starting point for your organization’s scenario playbooks.
Scenario | Risks and Considerations |
---|---|
Lost computing device (laptop) | • Loss of sensitive information • Unauthorized device or systems access |
Account compromise (team member) | • Loss of confidential company data • Loss of data integrity • Attacker gains access to other systems or accounts |
Account compromise (single customer) | • Loss of confidential customer data • Loss of data integrity for individual customer • Security incident via support channel |
Account compromise (multiple customers) | • Loss of confidential customer data • Loss of data integrity for many customers • Security incident via support channel • Potential media interest |
Unauthorized systems access detected | • Loss of confidentiality/integrity |
Ransomware | • Systems disruption • Loss of data |
Virus detected | • Systems disruption • Loss of data |
File corruption or data loss | • Loss of data • Potential privacy breach • Potential systems availability issues |
Distributed Denial of Service Attack (DDoS) | • Loss of systems availability • Increase in support volume • Potential media interest |
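If you keep your scenario list in a machine-readable form, you can ask it simple questions, such as which playbooks need a media-handling step. The sketch below shows one possible shape in Python. It covers only a subset of the rows above, and the `SCENARIO_RISKS` name and `scenarios_with_risk` helper are hypothetical.

```python
from typing import Dict, List

# A subset of the scenario table: scenario name -> risks and considerations.
SCENARIO_RISKS: Dict[str, List[str]] = {
    "Lost computing device (laptop)": [
        "Loss of sensitive information",
        "Unauthorized device or systems access",
    ],
    "Account compromise (multiple customers)": [
        "Loss of confidential customer data",
        "Loss of data integrity for many customers",
        "Security incident via support channel",
        "Potential media interest",
    ],
    "Ransomware": [
        "Systems disruption",
        "Loss of data",
    ],
    "Distributed Denial of Service Attack": [
        "Loss of systems availability",
        "Increase in support volume",
        "Potential media interest",
    ],
}


def scenarios_with_risk(risk: str) -> List[str]:
    """Return the scenarios that list the given risk, in table order."""
    return [name for name, risks in SCENARIO_RISKS.items() if risk in risks]
```

Calling `scenarios_with_risk("Potential media interest")` on this subset returns the multiple-customer account compromise and the denial of service scenarios, which are the two playbooks here that would need a communications plan.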
So far we have defined the classification levels of our incidents based on their severity and impact, defined roles for our team to play, and looked at common scenarios that affect many companies around the world.
To make sure we turn all this definition and planning into action, we first need to understand how we would know if an incident was happening and what information sources would give us early warning.
We call these our incident notification sources and they are the places we need to be monitoring and connecting with frequently if we want to know something is happening as quickly as possible. Remember, you can’t respond to an incident until you know about it, so this is a pretty crucial step.
The following are some simple examples of incident notification sources. Remember that these sources are spread throughout your company, so it won’t always be your engineering or security team that is the first to know something bad is happening.
Type | Description |
---|---|
Alerting and Logs | One or more alerts have been received from an organizational or systems monitoring tool. |
Customer/User | A customer or user has contacted the organization to report an issue, suspicious behavior, or other concern. |
Responsible Disclosure | An individual or group has contacted the organization to report a security vulnerability under the auspices of responsible disclosure. |
Third Party Notification | A notification has been received from any other third-party source, such as vulnerability notification sources or social media. |
Your company may have information sources, metrics, or contact points beyond this list. Make sure you document each of those information sources and that the people who respond to or monitor them are aware of what they need to do should they encounter security messages or alerts in that channel.
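A simple check can flag notification sources that nobody has been assigned to monitor. The Python sketch below is one way to do it, assuming you record an owner for each source. The source types come from the table above; the `unowned_sources` helper and the team names in the usage example are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Source(Enum):
    """Incident notification source types from the table."""
    ALERTING_AND_LOGS = "Alerting and Logs"
    CUSTOMER_USER = "Customer/User"
    RESPONSIBLE_DISCLOSURE = "Responsible Disclosure"
    THIRD_PARTY_NOTIFICATION = "Third Party Notification"


def unowned_sources(owners: Dict[Source, str]) -> List[Source]:
    """Return every source with no named owner (missing or blank entry)."""
    return [source for source in Source if not owners.get(source, "").strip()]


# Hypothetical ownership record; the team names are placeholders.
owners = {
    Source.ALERTING_AND_LOGS: "engineering on-call",
    Source.CUSTOMER_USER: "support team",
    Source.RESPONSIBLE_DISCLOSURE: "security team",
}

for source in unowned_sources(owners):
    print(f"No one is assigned to monitor: {source.value}")
```

In this example the third-party notification channel has no owner, so it is the one source reported. Adding your company's extra sources to the enumeration extends the check to cover them as well.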
While the steps outlined as examples in our overview of the incident response process are a good starting point, each incident scenario will have its own set of recommended actions and priorities. Creating documented playbooks for common incident scenarios can help you respond quickly and minimize the disruption of these events.
In this section, we will take a look at some common examples your company may face. You can use these as the basis for your playbooks or add new scenarios that are specific to your company or operating environment.