How to build a cyber incident response team

This post is authored by Heimdal’s Valentin Rusu – Machine Learning Research Engineer and overall cybersecurity guru here at Heimdal.

As an incident response manager himself, Valentin regularly coordinates security responses for companies of all shapes and sizes – including many of the examples discussed in this post.

He explains everything you need to know about building and prepping your incident response team.

Picture this: You’re responsible for IT at a fairly established mid-sized eCommerce company.

One day, the absolute worst happens: a ransomware attack. If it’s successful, you could lose terabytes of vital business information – including personal customer data.

The pressure is on. It’s still possible to contain or avoid any damage, but you’ll have to isolate the ransomware before it spreads.

Would you know how to respond to this situation?

Every second counts in incident response

Nobody wants to be in this situation. But the reality is that organizations face crises like this all the time.

For incident response teams like the one I lead at Heimdal, this is just another day in the office.

I’ve seen just how crucial a few seconds can be in situations like this. With such a small margin for error, being prepared really matters.

Much of the cybersecurity work you do will focus on preventing this situation from happening in the first place.

But there’s no way to completely eliminate the risk.

That’s why it’s so important to have an incident response team and plan in place – so the response can be as quick and efficient as possible should disaster ever strike. So how do we do that?

In this article, I explain the basics of constructing an incident response team, as well as defining roles, policies, and playbooks.

The goal is to ensure everybody involved in the response can recognize the warning signs of an attack and knows how to respond when one occurs.

Here’s what you need to know.

Should you outsource or build your own cyber incident response team?

The first decision you’ll have to make is whether or not you’re going to have an in-house incident response team.

For the vast majority of businesses, this should be a fairly easy decision, as it is usually simpler and more cost-effective to outsource, for a few key reasons:

Expertise – An incident response team requires a high level of cybersecurity expertise. Such talent can be difficult and expensive to hire;
Availability – You will also need to ensure 24/7 availability, requiring multiple specialists working shifts. This increases the total number of staff you need;
Overheads – Outsourcing your incident response team removes costs like salaries, equipment, and other expenses of running a full-time team.

That being said, there are some good reasons to construct your own team.

Large companies like IBM and Amazon will generally employ their own since the sheer scale of their systems and services makes outsourcing impractical.

This is also the case for companies like KPMG, which might be smaller, but have complex, highly customized IT environments.

We also employ our own full-time team at Heimdal; for a security company like us, it’s pretty much a requirement.

The good news is, unless you’re Amazon, KPMG, or Heimdal, outsourcing your incident response team is almost certainly the best choice for you.

This makes it easier, as you won’t be responsible for building and managing the team yourself.

That being said, it’s still important to know who’s on the team, how it works, and how they’re going to respond in a crisis.

Who do you need on your incident response team?

Of course, the most important part of your incident response is going to be the people. This is going to involve two separate but related groups:

1. Incident response team

This is the primary line of defense against more complex security incidents. The team’s role is highly specialized and proactive, focusing on identifying, assessing, and responding to security breaches or threats.

2. Support & monitoring team

The support team, structured in three levels of response, is often more operational and customer-focused.

They generally work in shifts, are available 24/7, and focus on the immediate effects of security incidents on users and systems, working quickly to restore normal service.

In reality, there can be significant crossover between the two teams, and often they’ll work alongside each other for a particular response.

Here’s what you need to know about each one:

Incident response team

Whether you’re building your own incident response team or outsourcing, the basic setup is going to be the same.

These are the roles it’s important to include:

1. Security analyst – A security specialist with expertise across networking, malware, interface, and applications.

They are generally capable of understanding the most common warning signs for attacks and executing common responses. Unlike malware analysts, coding skills aren’t required to be a security analyst;

2. IT specialist – An IT specialist is similar to an IT admin.

Generally, they’ll spend most of their time on routine IT tasks like configuring Windows protocols, managing hardware and servers, setting up user accounts etc.

They’re not security experts per se, but they’ll be the go-to authority on how a company’s IT environment is configured and the endpoints within it – so they’re pretty invaluable in a crisis;

3. Malware analysts – These are the absolute subject matter experts.

They will almost always have coding skills and a deep understanding of malware tactics. Malware analysts are experts in penetration testing, having been trained with capture-the-flag exercises.

These essentially teach malware specialists to analyze and infiltrate IT environments in the same ways as a hacker would. They can also be referred to as ethical hackers;

4. Incident manager – This is the role I play in security response at Heimdal.

The incident manager generally has the same skillset as security analysts, but can also manage people and projects. This makes them ideal to coordinate the team and overall response;

5. Comms officer – Successful attacks will have a significant effect on your brand reputation.

Comms officers are responsible for communicating with customers, clients, and vendors where necessary. They communicate relevant information and, where possible, limit the reputational damage of a successful attack;

6. Legal advisors – These ensure the response follows legal guidelines.

This is hugely important as different countries will have varying rules about, e.g., allowing remote third-party access to IT systems, responding to ransomware threats, and documenting evidence from a successful security incident.

This is a standard setup that most organizations or MSSPs will use. Some organizations might add other roles as necessary, depending on their specific needs and IT environment.

Monitoring and support team

As well as the incident response team, you will also need to ensure there’s a permanent, 24/7 support team in place. This team’s job is to implement preventative controls and spot the warning signs of a potential attack.

Generally, businesses should aim to have three levels of monitoring support, with each having different responsibilities and levels of expertise.

This structure is generally the same whether the team is in-house or outsourced:

Level 1: This involves front-line support staff or help desk personnel.

They’re not security experts and will spend much of their time on preventative tools and controls, such as installing antivirus software, configuring secure network access, and applying least-privilege principles.

Much of what they do will be defined by clear company frameworks and playbooks (see more below);

Level 2: This is staffed by more experienced and specialized security experts.

They will do much of the ongoing monitoring and damage minimization, using tools like SIEM, SOAR, or XDR (more on this below).

They are responsible for quickly detecting and containing breaches, before restoring operations;

Level 3: These are the most specialized experts, generally involving malware and reverse engineering specialists.

They deal with in-depth incident analysis and develop strategies to prevent future attacks.

Highly unusual or critical attacks are generally escalated to the third level after level 2 specialists have implemented basic responses like isolating or switching off machines.

It’s worth pointing out that specific members of the incident response team might also work in ongoing support. The distinction is less about who they are and more about the role they’re playing.

Crucially, a support team is always on, while the incident response team will form in the moments after a security incident has been detected.

It’s common to have security analysts and incident managers working in levels 2 and 3 of the support team.

When defining these teams, it’s also important to lay down clear policies for when, how, and to whom incidents should be escalated.

For example, a data loss of a terabyte should be escalated straight to the incident response team (and probably the CEO).

But a suspected phishing attack might be less serious. In that case, level 1 or 2 support staff can follow established processes to isolate the machine or reset the password.

The playbooks we discuss in the next section will come in handy here.
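To make the escalation idea concrete, here is a minimal sketch of how such rules could be written down as a routing function. It is an illustration only: the `Incident` fields, the `Route` names, and the use of one terabyte as the trigger are assumptions taken from the example above, not a standard policy.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LEVEL_1 = "level 1 support"
    LEVEL_2 = "level 2 support"
    INCIDENT_RESPONSE_TEAM = "incident response team"


@dataclass
class Incident:
    kind: str                  # e.g. "phishing", "data_loss", "ransomware"
    data_lost_bytes: int = 0   # estimated volume of lost data
    confirmed: bool = False    # False while the incident is only suspected


# Hypothetical threshold based on the example in the text: one terabyte.
MAJOR_DATA_LOSS_BYTES = 10**12


def route_incident(incident: Incident) -> Route:
    """Return who should handle the incident under this illustrative policy."""
    if incident.data_lost_bytes >= MAJOR_DATA_LOSS_BYTES:
        # Large data loss goes straight to the incident response team.
        return Route.INCIDENT_RESPONSE_TEAM
    if incident.kind == "phishing" and not incident.confirmed:
        # Suspected phishing: front-line staff follow the established process.
        return Route.LEVEL_1
    # Assumed default: everything else is triaged by level 2.
    return Route.LEVEL_2
```

For example, `route_incident(Incident("phishing"))` returns `Route.LEVEL_1`, while an incident with `data_lost_bytes=10**12` is routed to the incident response team. A real policy would have many more rules, but keeping them in one place makes it obvious when, how, and to whom an incident is escalated.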

How to set up your incident response team for success

Choosing the right people for your incident response team is important, but the process doesn’t stop there. It’s also vital to make sure everybody in that team understands what their role is and how to respond should an incident occur.

The last thing you want in this situation is confusion about who’s doing what or the best way to mitigate the attack.

There are four main steps to getting this right:

1. Select the right technology tools

When suspected attacks occur, speed is everything. The right technology can make your response quicker and more effective.

That’s why it’s so important to understand what technology is available and how best to use it. Generally, there are three types of common incident response tools:

1. SIEM (Security information and event management) – This is useful for real-time data analysis and log management. Generally, SIEMs are used for visibility, real-time monitoring, and alerts. A SIEM is an organization’s first port of call for detecting potential security incidents, using signals like users logging in from a new location.
Choose this for: Comprehensive network visibility;

2. SOAR (Security orchestration, automation, and response) – Ideal for automating responses to common threats and orchestrating complex workflows. SOARs act on common security signals, such as those you might define in playbooks or identify in your SIEM.
Choose this for: Automating and streamlining detection and response;

3. XDR (Extended detection and response) – A newer tool that’s becoming an increasingly popular way to improve security responsiveness. Its extended functionality combines data from various security products to offer a more integrated and detailed view of the data and risks in your environment.
Choose this for: A more unified and informed response to security incidents, through advanced threat detection, investigation, and response.

As you can tell, each of these tools is designed for a slightly different job, so it’s not necessarily a case of choosing the best one. Instead, organizations will generally use a combination of tools to create a layered response.

One common combination is to use an SIEM for visibility and detection alongside a SOAR for automated response – and tools will generally be designed to integrate for this purpose.

An XDR can also be used as either a standalone product or to extend the functionality of an existing SIEM or SOAR. Ultimately, it all depends on the specific needs and technology in your organization.
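The SIEM-plus-SOAR combination can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the API of any real product: the `LoginMonitor` class, the event fields (`user`, `country`), and the two response actions are all hypothetical. The detection rule plays the SIEM role (a login from a location not seen before), and the list of actions plays the SOAR role (automated responses to that signal).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

Action = Callable[[dict], str]


def require_password_reset(event: dict) -> str:
    # Hypothetical automated action; a real one would call an identity system.
    return f"password reset required for {event['user']}"


def notify_level_2(event: dict) -> str:
    # Hypothetical automated action; a real one would open a ticket or page staff.
    return f"level 2 notified: {event['user']} logged in from {event['country']}"


class LoginMonitor:
    """Flags logins from a new country and runs the configured responses."""

    def __init__(self, actions: List[Action]) -> None:
        self._known: Dict[str, Set[str]] = defaultdict(set)
        self._actions = actions

    def handle(self, event: dict) -> List[str]:
        user, country = event["user"], event["country"]
        seen = self._known[user]
        # The first login only builds the baseline; it is not flagged.
        is_new_location = bool(seen) and country not in seen
        seen.add(country)
        if not is_new_location:
            return []
        # SOAR-style step: run every automated action for this alert.
        return [action(event) for action in self._actions]
```

Calling `handle` with a user’s usual country returns an empty list, while the first login from a different country returns the results of both actions. Real tools add far richer signals and workflows, but the split between detecting and responding is the same.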

2. Define playbooks

Once you have the team in place, the next step is to define policies and playbooks. The goal here is to ensure that everybody in the IRT and support team can effectively recognize the warning signs of an attack and how they need to respond.

These playbooks should clearly lay out the required remediation or mitigation steps for common attacks, as well as when to escalate and to whom. This is particularly useful for the first and second levels of support, who will be dealing with more predictable and regular attacks.

The challenge here is that it’s nearly impossible to design an effective policy for every possible incident. There are over 20 different types of ransomware attacks, just for starters.

At this point, you need to be careful not to make the advice too detailed or prescriptive; the more detail you include, the more likely it is that the detail won’t be relevant to the specific attack being faced. As ever with cybersecurity, the solution is a delicate balancing act.

Here’s an example of what your playbook might involve for, e.g., a suspected ransomware attack:

Analyze relevant .log files;
Stop the computer, isolate the machine, and unplug it from the network;
Escalate to the third level, where the ransomware can be run in a safe environment for investigation.

This is a fairly standard playbook for ransomware and is roughly the level of detail you should go into at this stage.
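If you want playbooks that can be versioned and rendered as checklists, one option is to store them as structured data. The sketch below encodes the ransomware example above; the `Step` fields and the choice of which support level owns each step are assumptions for illustration, and your own playbooks may assign them differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    action: str
    owner: str                         # support level responsible (assumed)
    escalate_to: Optional[str] = None  # set when the step hands the incident over


RANSOMWARE_PLAYBOOK: List[Step] = [
    Step("Analyze relevant .log files", owner="level 2"),
    Step("Stop the computer, isolate the machine, and unplug it from the network",
         owner="level 2"),
    Step("Hand over for investigation in a safe environment",
         owner="level 2", escalate_to="level 3"),
]


def render_checklist(playbook: List[Step]) -> str:
    """Turn a playbook into a numbered, human-readable checklist."""
    lines = []
    for number, step in enumerate(playbook, start=1):
        suffix = f" (escalate to {step.escalate_to})" if step.escalate_to else ""
        lines.append(f"{number}. [{step.owner}] {step.action}{suffix}")
    return "\n".join(lines)
```

Printing `render_checklist(RANSOMWARE_PLAYBOOK)` gives three numbered lines, the last ending in "(escalate to level 3)". Keeping each step to one short action matches the level of detail recommended above.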

3. Develop your compliance policy

Alongside your playbooks, you should also develop a compliance policy. This should clearly set out:

How and when to contact legal authorities;
Any post-incident documentation and reporting obligations;
Paying (or more likely, not paying) ransoms;
Whether (and when) analysts can take remote control of 3rd party machines.

There are many confusing and overlapping legal standards to be aware of, and there will be different regulations depending on the specific jurisdiction you’re operating in.

For example, in Denmark, all data breaches have to be reported to the Centre for Cyber Security. It’s also illegal to pay a ransom in the EU, but in the US it’s more of a grey area.

These can be confusing and knotty issues, which is why it’s so important to discuss this with a legal professional when designing your policies – well in advance of an attack. This way, there’s no ambiguity about the correct response.

4. Create incident reporting and documentation templates

The last stage is to develop documentation templates for cyber incident response. In the aftermath of an attack, you’ll generally have to share particular information with customers, stakeholders, law enforcement authorities, and the wider cybersecurity community.

It helps to ensure that documentation can be quickly assembled once the incident is resolved or mitigated. Generally, security analysts will have clear reporting documentation and templates to work from. This ensures information is recorded and presented in a way that clearly demonstrates compliance while providing the right information to relevant stakeholders.

This documentation is usually filled out by level 2 support staff, sometimes involving information given to them by level 3 malware specialists. It should include:

Summary;
Incident overview;
Findings;
Timelines;
Type of attack.

You will also need to associate the incident with a specific code from MITRE ATT&CK, a globally recognized repository of tactics and techniques used in security incidents. This is the common language of security experts around the world.

Virtually every type of attack will have its own MITRE ATT&CK code, along with a description and suggestions on common mitigation tactics. The code for phishing, for instance, is T1566. You can see the full list of identified tactics on the MITRE ATT&CK home page.
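A reporting template can also be expressed as a simple data structure so that no section is forgotten. The following is a minimal sketch: the `IncidentReport` class and its field names are assumptions based on the list above. The check only confirms that the code has the shape of an ATT&CK technique ID (such as T1566, or T1566.001 for a sub-technique); it does not verify that the technique exists.

```python
import re
from dataclasses import dataclass
from typing import List

# Shape of an ATT&CK technique ID: "T" plus four digits,
# optionally followed by a three-digit sub-technique suffix.
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")


@dataclass
class IncidentReport:
    summary: str
    incident_overview: str
    findings: List[str]
    timeline: List[str]
    attack_type: str
    attack_technique_id: str

    def __post_init__(self) -> None:
        # Reject reports whose technique code is malformed.
        if not TECHNIQUE_ID.match(self.attack_technique_id):
            raise ValueError(
                f"not a valid technique ID format: {self.attack_technique_id!r}"
            )


# Illustrative values only; not taken from a real incident.
report = IncidentReport(
    summary="Suspected phishing email reported by a user",
    incident_overview="One mailbox affected; credentials reset",
    findings=["Link led to a credential-harvesting page"],
    timeline=["09:02 email received", "09:15 reported to level 1"],
    attack_type="Phishing",
    attack_technique_id="T1566",
)
```

Because every field is required, a report cannot be created with a section missing, and a code like "phishing" in place of "T1566" raises a `ValueError` immediately.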

Set up your cyber incident response team for success

Every organization hopes for the best-case scenario: never getting attacked at all. But the next best thing is having a thorough plan in place to identify, contain, and neutralize the threat – thus reducing its damage to an absolute minimum. Or ideally, none at all.

The steps I’ve discussed in this blog might not always be straightforward – but they can make a huge difference to the overall effectiveness of your response. When it comes to cybersecurity, it really does pay to be prepared for the worst.

FAQs: Building a cyber incident response team

What is a cyber incident response team?

An incident response team is a specialized group tasked with preparing for, detecting, and responding to cybersecurity incidents. They assess threats, mitigate damages, and aid recovery, ensuring organizational resilience against cyberattacks. Their focus is on maintaining security and minimizing the impact of breaches.

Who does a computer incident response team involve?

This team typically includes security analysts, IT specialists, malware analysts, incident managers, comms officers, and legal advisors. Each member plays a distinct role, including monitoring and responding to threats, managing IT infrastructure, analyzing malware, coordinating response efforts, communicating with customers and vendors, and ensuring legal compliance during and after security incidents.

What are the key differences between an in-house and outsourced incident response team?

In-house teams are usually found in larger companies with specific security needs, offering direct control and dedicated resources. Most companies will find it simpler and more cost-effective to outsource, due to the reduction in overheads and complexity of building the team.

Valentin Rusu

Machine Learning Research Engineer

Valentin is a Machine Learning Research Engineer at Heimdal with a Ph.D. in Artificial Intelligence. He is eager to share his knowledge of deep learning, computer vision, and natural language processing. His interesting blog posts show that he is willing to teach, making complicated advances in AI easy for a wide audience to understand.
