Development of Incident Response
Playbooks and Runbooks for Amazon Web
Services Ransomware Scenarios
Master’s Degree Programme in Information and Communication Technology
Department of Computing, Faculty of Technology
Master of Science in Technology Thesis
EIT Digital Master School
Cyber Security
Author:
Samuele Gandini
Supervisors:
Seppo Virtanen
Petri Sainio
University of Turku
September 2023
The originality of this thesis has been checked in accordance with the University of Turku quality assurance system
using the Turnitin OriginalityCheck service.
UNIVERSITY OF TURKU
Department of Computing, Faculty of Technology
Samuele Gandini: Development of Incident Response Playbooks and Runbooks
for Amazon Web Services Ransomware Scenarios
Master of Science in Technology Thesis, 105 p.
September 2023
In today’s digital landscape, enterprises encounter myriad cybersecurity challenges
that jeopardize their critical digital assets.
Modern cyber threats have evolved drastically, adapting to the proliferation of cloud
technologies that drive organizations towards platforms like AWS that offer conve-
nience, cost-reduction, and reliability. However, this transition introduces new secu-
rity risks because threat actors are motivated to craft and deploy advanced malware
explicitly targeting the cloud.
Ransomware emerged as one of the most impactful and dangerous cyber threats, still
in 2023, encrypting data and demanding payment (usually in untraceable tokens)
for the decryption key. Confidentiality, integrity, and availability of cloud assets
stand perpetually vulnerable, and sometimes, unprepared businesses suddenly hit
by ransomware cannot find a way out. Besides financial loss and operation disrup-
tion, the breach of sensitive information compromises trust, leading to reputational
damage that’s hard to mend.
Corporations are urged to develop robust defensive strategies to identify, contain,
and recover from ransomware and other cloud threat exploitation.
Traditional cybersecurity approaches must rapidly reshape to manage emerging men-
aces. Hence, they require new specialized and well-structured incident response plans
to become the bedrock of the security tactics.
This thesis dives into the complexities of designing and implementing accurate inci-
dent response Playbooks and Runbooks, focusing on handling the common danger
of ransomware, especially within Amazon Web Services (AWS).
This research journey is strictly connected to the real-world context, resulting from
a six-month internship within Bynder, a digital asset management leader company.
This experience culminated in conceptualizing the step-by-step procedures against
ransomware incidents in cloud infrastructures, improving communication, and coor-
dinating actions during high-pressure situations.
Keywords: Cybersecurity, Incident Response, Cloud Computing, Ransomware, Ama-
zon Web Services (AWS), Cloud Security, Playbooks, Runbooks, proactive strate-
gies
Contents
1 Introduction 1
1.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Into the Cloud Revolution: technological transformation in
modern business . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Navigating cloud risks: confronting modern security challenges 3
1.1.3 Ransomware Menace in an Evolving Cloud Landscape . . . . . 5
1.2 Objectives and Limitations . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 Preparing for the Inevitable: The Necessity of Incident Re-
sponse Planning . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.2 Understanding the Constraints and Exclusions . . . . . . . . . 10
1.3 Bynder: Introduction to the Company . . . . . . . . . . . . . . . . . 11
1.4 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2 State of the Art 14
2.1 Cyber Security Incident Response Frameworks . . . . . . . . . . . . . 14
2.1.1 Sentinels of Security: Unveiling CSIRT Roles and Operational
Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.2 Information Sharing Obligations in Incident Response . . . . . 17
2.1.3 Incident Handling . . . . . . . . . . . . . . . . . . . . . . . . . 18
i
2.1.4 Elevating Incident Response: The Comprehensive Six-Phase
SANS Framework . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1.5 Incident Response Playbooks and Runbooks . . . . . . . . . . 26
2.2 AWS Incident Response: Best Practices . . . . . . . . . . . . . . . . . 28
2.2.1 AWS Incident Management . . . . . . . . . . . . . . . . . . . 29
3 Methodology 32
3.1 Project Planning and Design . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.2 Project Planning . . . . . . . . . . . . . . . . . . . . . . . . . 33
4 Incident Response Development 39
4.1 Studying the company’s infrastructure . . . . . . . . . . . . . . . . . 39
4.1.1 Risk Assessment: An In-Depth Exploration of Stages and Pre-
requisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.1.2 Risk Assessment: Critical Assets . . . . . . . . . . . . . . . . 42
4.1.3 Risk Assessment: Threats Analysis . . . . . . . . . . . . . . . 45
4.2 Incident Response Scenario . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2.1 Designing a Ransomware Incident Response Plan . . . . . . . 47
4.3 Available Tools for Incident Response . . . . . . . . . . . . . . . . . . 49
4.3.1 AWS Incident Response Tools . . . . . . . . . . . . . . . . . . 50
5 Incident Response Implementation 57
5.1 Ransomware in AWS Incident Response Plan . . . . . . . . . . . . . 57
5.1.1 Ransomware in AWS - Security Playbook . . . . . . . . . . . . 59
5.1.2 Ransom Response for AWS S3 - Security Runbook . . . . . . 70
5.1.3 Ransom Response for AWS RDS - Security Runbook . . . . . 85
6 Incident Response Testing 98
ii
6.1 The Role of Simulation in Incident Response . . . . . . . . . . . . . . 98
6.2 Incident Response Simulation with AWS CloudSaga . . . . . . . . . . 99
7 Conclusion 102
7.1 Evaluation and Results . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.2 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.3 Future Trends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
References 106
iii
1 Introduction
The thesis focused on developing an effective incident response plan to safeguard
the company’s infrastructure against specific threats targeting cloud technologies. It
results from a project realized during my information security internship at Bynder ,
an IT SaaS company global leader in the Digital Asset Management field and based
in the Netherlands. The introduction clarifies the background and motivation of the
thesis, focusing on the initial problem statement and its importance. Additionally,
I will talk about Bynder and the InfoSec team, providing general details about the
organization’s purpose and the security group’s goals.
1.1 Background and Motivation
The evolving digital landscape sees organizations promoting innovative technologies
that allow them to develop efficiency, scalability, reliability, and security. Innovation
enables companies to streamline operations and adapt to market demands. However,
as they embrace these new opportunities, they expose themselves to a wide range
of cybersecurity threats that can have devastating consequences. Ransomware is
among these threats, one of the most insidious and disruptive forms of attack. The
background and motivation section starts by introducing the reasons for corporations
to move towards cloud technologies, and after that, it analyzes cyber risks, narrowing
down to ransomware attacks.
1.1 BACKGROUND AND MOTIVATION 2
1.1.1 Into the Cloud Revolution: technological transforma-
tion in modern business
The launch of cloud technologies has revolutionized how businesses operate and de-
liver services. Cloud providers offer many solutions to empower their customers’
features to scale rapidly, reduce expenses, and innovate rapidly. In 2023, 94% of
companies worldwide adopted Cloud solutions to boost their services and produc-
tivity [1]. Amazon Web Services, Microsoft Azure, and Google Cloud Platform are
the most diffused solutions to dominate the market. Implementing a cloud-based
infrastructure can bring various benefits to the organization, as OCloud Solutions
explains [2]:
1. Convenience of use : The cloud can store large datasets and provide an easy
way to access them. Global presence, agility, and flexibility are the key features
that motivate companies to move their environments toward the cloud.
2. Cost reduction : Need-basis services access is one of the main reasons for im-
plementing cloud-based products. The business does not invest in on-premise
solutions because it can exploit a subscription method that minimizes any
other expense.
3. Reliability : products deployed over the Cloud can be accessed anywhere, and
most providers can ensure 99.9% uptime. Unavailability of the services is
usually caused by the company using the cloud’s misconfigurations rather than
the provider. Hence, as explained by the AWS Shared Responsibility Model,
Amazon is in charge of the Security of the Cloud, and the customer deals with
the Security in the Cloud [3].
4. Security and Privacy : Cloud systems enhance security and privacy thanks to
their features, such as encryption, access control, authentication policies, and
1.1 BACKGROUND AND MOTIVATION 3
DDoS protection. Moreover, they offer many tools to monitor the applications
and Cloud-based assets, allowing data security and real-time threat detection.
5. Collaboration : The opportunity to upload resources in the Cloud allows users
multiple accesses in real-time, making collaboration and organization on a
higher level. People can modify the documents easily, keeping the changes
under control thanks to versioning and assisting access to large files.
Nevertheless, with the growth of cloud infrastructures, a dynamic and complex
threat landscape, where cyberattacks have increased in frequency and severity, tar-
gets modern businesses. Critical operations in interconnected systems have been
digitalized and become fertile ground for threat actors to exploit vulnerabilities.
1.1.2 Navigating cloud risks: confronting modern security
challenges
While the cloud brings several benefits, it is not risk-free. Cloud providers, such
as Amazon Web Services, developed a shared responsibility model that, combined
with a dynamic cloud environment that requires expertise for its setup, may lead to
misconfiguration, vulnerabilities, and unauthorized access points.
1.1 BACKGROUND AND MOTIVATION 4
Figure 1.1: AWS Shared Responsibility Model pinpoints the customers’ responsibility
to protect their data, platforms, and applications when implementing the infrastruc-
ture. AWS is in charge of the Security of the cloud ; clients address the Security in
the cloud [3].
Cloud-based industries must be concerned about data breaches, loss of sensitive
information, and potential disruption to business operations. Multi-cloud environ-
ments are becoming increasingly diffused, and their security is challenging and re-
quires advanced skill sets and elaborated security tools. That’s why cloud service
providers are working hard to provide reliable native controls that their customers
can purchase to reduce cyber risks. Cyber security threats targeting cloud technolo-
gies are evolving, as the Checkpoint 2022 Cloud Security Report confirms [4]. Among
the top eight adversities for the cloud environment, we can find misconfiguration of
the platforms, data exfiltration, insecure interfaces, unauthorized access, account
hijacking, and, finally, ransomware. It represents a formidable adversary with a
devastating impact.
1.1 BACKGROUND AND MOTIVATION 5
1.1.3 Ransomware Menace in an Evolving Cloud Landscape
According to Anthony Today, all the big, medium, or small-size organizations are
concerned with ransomware, which targets their operations, finances, and reputa-
tions [5]. Ransomware’s first infection may come from multiple attack vectors, such
as phishing communications, download of malicious programs, or even unauthorized
access to the network. When the system is compromised, the malware can spread
across the infrastructure and encrypt sensitive information. The data remains inac-
cessible for the business until a ransom is paid. Besides the catastrophic financial
damage for the company, the brand is hit on its reputation, and customers may
lose trust in it. Sometimes, a Ransomware attack may mean the end of the business
because the provoked disruption is irreparable. These kinds of malware are classified
into Crypto-Ransomware, which encrypts data using solid algorithms until the
ransom is settled, and Locker-Ransomware, which locks the victims out of their
devices by displaying a full-screen message that cannot be removed without paying
the ransom.
Nivedita James explored the most destructive Ransomware attacks and researched
the overall financial losses of the affected corporations [6]. NotPetya evolved from
Petya Ransomware in 2017 and extorted around 10 billion dollars using the Eter-
nalBlue exploit. It was part of the Sandworm operation and is recorded as the
most impactful malware in history. WannaCry is probably the most famous ran-
somware that exploited Microsoft Windows vulnerabilities across 150 countries. It
was launched in 2017, and the loss is more than 4 billion dollars. Sodinokibi was
released in 2019 by the REvil operation and oppressed transportation and financial
sectors. The approximate losses for the affected industries are around 200 million
dollars.
1.2 OBJECTIVES AND LIMITATIONS 6
Ransomware attacks are still dangerous, and they have increased in the last year in
line with Zscaler 2023 Ransomware Report [7], mainly in the form of RaaS (Ran-
somware as a Service): tools and information for conducting the attack are sold
through the Dark Web, and affiliates can make profits by using the obtained instru-
ments. Furthermore, new predictions for 2024 foresee the combination of Artificial
Intelligence to build AI-powered ransomware attacks. Machine learning algorithms
integrated into malware can make detection and prevention more sophisticated while
allowing less expert threat actors to write malicious code.
Ransomware can, therefore, target cloud-deployed services, even with the security
measures proposed to protect the infrastructure. All companies embracing the cloud
evolution must assess the risks they may encounter and implement solutions to pre-
vent but also identify and contain ransomware attacks. Here, it’s clear the necessity
of a well-structured and effective incident response plan to help organizations against
ransomware that, although security measures are in place, can lead to catastrophic
consequences.
1.2 Objectives and Limitations
A systematic incident response plan is a crucial component of the security posture
of a cloud-based company. The publication "Cloud incident handling and forensic-
by-design" affirms the necessity of incident handling strategies for the cloud, which
is virtualized and geographically distributed. Existing models may not be adequate
to limit the impact of cloud incidents, and every company must dedicate resources
to planning [8].
1.2 OBJECTIVES AND LIMITATIONS 7
1.2.1 Preparing for the Inevitable: The Necessity of Incident
Response Planning
As reported by the NIST "Computer Security Incident Handling Guide" [9], IT
programs require a computer security incident response program since cyber attacks
have become more frequent and disruptive. Risk assessments and basic preventive
measures are no longer enough to avoid an incident, and detection and response
capabilities are thus necessary to minimize the damage, mitigate the weaknesses,
and recover productivity, returning to business. With a well-defined plan, organi-
zations may find it easier to manage an incident and lower the destruction, keeping
an excellent firm reputation. The incident response needs continuous infrastructure
monitoring, an effective technique to define whether the events are critical, and a
methodology to establish efficient and quick communication among different depart-
ments. An innovative approach to address the development of an incident response
process is by introducing scenario-based plans, which can be deployed considering
three main stages:
1. Mapping out external and internal attack vectors to identify straightforward
incident scenarios. In this stage, a risk assessment helps detect relevant threats
targeting the main resources of the company.
2. Converting the created scenarios into Playbooks and Runbooks. As I will
explain in Chapter 2, these step-by-step procedures are essential to making
technical decisions on future actions.
3. Adding new controls and pre-written scripts that responders might use. Inci-
dent detection and response may be improved thanks to new security measures
that can automate part of the process.
The implementation focused on deploying the playbooks and runbooks for Bynder’s
incident response plan, improving the ability to detect, analyze, contain, and re-
1.2 OBJECTIVES AND LIMITATIONS 8
cover from ransomware targeting cloud services. As explained by Abi Tunggal, an
UpGuard CyberSecurity expert [10], companies may not detect attacks or know how
to recover from them without a formal incident response plan. Incident response
planning based on NIST Framework for Improving Critical Infrastructure
Cybersecurity industry standards [11] brings several benefits to the organization:
• Improved Response and Recovery capabilities: The CSIRT (Cyber
Security Incident Response Team) boosts its capacity to handle incidents, and
the response and recovery actions are quicker and more effective.
• Minimized impact: The risk of compromise, data breaches, system unavail-
ability, and reputational damage is dramatically reduced with a plan in place.
The CSIRT can follow exact guidelines depending on the scenario and prevent
the incident from spreading.
• Enhanced incident analysis and containment: The incident response
process also suggests new tools to help the team with the logs analysis and the
impact containment. The effects of the events are mitigated more effectively,
and all the steps are documented to enhance the CSIRT efficiency further.
• Consistency: All the incident response stakeholders follow the same plan.
Consistency is critical to drive the response towards only one direction and
lower the impact on business operations, ensuring business continuity and
reducing financial losses.
• Enhanced Communication: The clear and structured process of the plan
leads to reduced downtime due to the rapid communication between different
teams. The communication strategies are defined for both internal and ex-
ternal stakeholders, and, in particular, the plan includes recommendations of
potential tools that can be helpful, such as Slack, Google Meet, or Microsoft
1.2 OBJECTIVES AND LIMITATIONS 9
Teams. Clear communication helps manage the situation transparently, main-
taining trust among customers, partners, and employees.
• Legal and security compliance: Companies are subject to strict regulations
regarding data breaches and incident reporting. After the incident happens,
potential data loss must be communicated within a specific time frame accord-
ing to the directives. Some security standards, such as ISO27001 or SOC2,
also expect an incident response plan.
• Clear roles and responsibilities: The CSIRT components must be defined
before the incident happens. Depending on the scenario, individuals from
different departments will be considered and follow the instructions of the IR
manager.
• Learning and Testing: Once the incident has been resolved, the CSIRT
must review all the IR stages to verify how the process can be improved.
The debriefing procedure allows the organization to identify weaknesses and
prevent similar events in the future. Furthermore, testing is fundamental
before a potential incident, and the security Runbooks provide various ways
to evaluate the plan’s effectiveness.
• Preparation for different scenarios: An exhaustive Incident Response
plan considers multiple use cases and attack vectors to cover myriad incidents.
Depending on the business classification, some resources and assets must be
prioritized.
• Risk Management: Businesses can better assess and manage the risk as-
sociated with different incidents with a well-structured plan. Therefore, this
proactive approach may minimize the likelihood and impact of the attacks.
1.2 OBJECTIVES AND LIMITATIONS 10
• Protection of reputation: The reputation is undoubtedly safeguarded by
an incident response plan. The enterprise can demonstrate its commitment to
Cybersecurity by quickly and responsively handling an incident.
Incident response planning is critical to creating a roadmap for the CSIRT to effec-
tively navigate the challenging landscape of security incidents, maintaining business
continuity, and protecting sensitive information.
1.2.2 Understanding the Constraints and Exclusions
Besides the several advantages already described, a few limitations are related to my
project at Bynder. The complexity of cloud technologies, and in particular, Ama-
zon Web Services, and the rapid evolution of cybersecurity threats posed certain
constraints on the scope of this study. First, AWS is a multifaceted cloud platform
that offers a vast selection of services and configurations. Hence, this thesis cannot
encompass every Amazon resource, scenario, or potential threat. Instead, it focused
on a subset of significant use cases to illustrate the principles of incident response.
Furthermore, cyber threats continuously evolve, with new attack vectors, malware
variants, and vulnerabilities emerging regularly. New ransomware may arise during
the lifecycle of this thesis, and it might trick the exposed security solutions.
Finally, my limited period spent within Bynder posed relevant limitations to this
project, starting from the methodology. An assessment is critical to evaluate the en-
vironmental risks and formulate security solutions. However, a complete evaluation
was not part of the scope, and it focused on specific technologies. Another required
phase of incident response is testing, which was part of the initiative. Unfortu-
nately, simulations were too challenging due to the complex nature of the project
and the effort needed to prepare the environment and mimic the attacks. Never-
theless, they were planned before I left the organization, and the incident response
1.3 BYNDER: INTRODUCTION TO THE COMPANY 11
scenarios will be tested, as clarified in Chapter 6. In conclusion, incident response
planning is a never-ending process. Every time changes are applied to a specific
product, new risks and threats may be considered, and new tools or different kinds
of analysis may be integrated to improve security.
1.3 Bynder: Introduction to the Company
Even though it is not critical for understanding the thesis, I would like to introduce
the company of my internship: Bynder. This section can help the comprehension of
the background motivation and implementation choices.
Bynder is a leader in Digital Asset Management based in the Netherlands with a
platform that provides the fastest features to manage creative files. It is a SaaS
(Software as a Service) company that allows other business teams to collaborate
with easy file sharing and storage and a central hub for all digital assets. Bynder’s
products use Cloud technologies, and, in particular, Amazon Web Services had
a critical role in developing the business and speeding innovation. As explained by
AWS, one of Bynder’s top priorities was international expansion, which was made
achievable through the utilization of Amazon Simple Storage Service (S3), Ama-
zon Relational Database Service (RDS), or Amazon Rekognition [12]. AWS enabled
Bynder to become one of the most competitive Digital Asset Management solutions
on the market.
Hence, Bynder is an innovative company growing internationally and fast, and the
presence and support of an Information Security team supervising security and
compliance topics is essential. My internship of six months occurred within this
critical unit to help deliver a secure and trustworthy Bynder brand by maintaining
and further enhancing a resilient information security program. The ambition to
1.4 THESIS OVERVIEW 12
improve the incident response plan was born from a joint idea between the Infor-
mation Security department and the AWS Customer Support team. Implementing
new incident response plans is vital to further maturing Bynder’s overall security
posture, ensuring business continuity, and decreasing the impact of cyber threats.
1.4 Thesis Overview
The thesis is divided into eight chapters describing incident response State of the
Art at first and then focusing on design and development. The final paragraphs
explain the results of the plan and introduce future opportunities.
Apart from the introduction, there are the remaining sections:
2 State of the Art: This initial chapter will provide an overview of the inci-
dent response state-of-the-art, mainly concerning Amazon Web Services Cloud
Computing since it is at the base of Bynder’s products. I will analyze the Cy-
ber Security Incident Response transition, starting from the NIST publication
800-61 Rev 2 (Computer Security Incident Handling Guide).
3 Methodology: This section will explore the arrangement of the project. It
will outline its design, including charts and tables used to prepare the tasks
and define the deadlines for each stage.
4 Incident Response Development: Incident response development includes
the preliminary activities before implementing security Playbooks and Run-
books. I started with a risk assessment to identify the critical assets for Bynder
and the most dangerous threats that may target a cloud-oriented enterprise.
I also studied the available tools that may be used for incident response to
understand the defensive capabilities of an organization to limit the impact in
case of an incident.
1.4 THESIS OVERVIEW 13
5 Incident Response Implementation: Implementation is the core of this
document since it contains the actual security Playbooks and Runbooks cre-
ated during my internship. The procedures include Preparation, Identification,
Containment, Eradication, Recovery, and Lessons Learned, even though each
one has been reviewed to clear out the sensitive information of Bynder’s final
plan. These blow-by-blow documents resolve the problem by presenting how
to face ransomware attacks.
6 Incident Response Testing: This chapter is dedicated to the simulations
that may be carried out to improve the IR processes further. The attacks’
replicas can be performed using multiple tools, such as AWS CloudSaga, and
they can reproduce one or more analyzed scenarios. I will describe how to
conduct infrastructure tests even if there was no time to conclude the testing
during my last period at Bynder.
7 Conclusion: The conclusion includes evaluation and results, final remarks,
and future trends for incident response. During the internship, I also created
an IR template that can be revised for new scenarios to cover more use cases
than the ones considered in this thesis. Considering the fast evolution of
cybersecurity threats, the framework can be helpful for the organization.
2 State of the Art
State of the art refers to the highest level of advancement in a particular field, and
this chapter analyses the current progress of the technologies surrounding the inci-
dent response area. According to the paper "On Incident Handling and Response:
a state-of-the-art approach" by the researchers Mitropoulos, Patsos, and Douligeris
[13], responding to an incident involves many management, legal, technical, and
social aspects. I will compare the initial proposals elaborated by NIST (National
Institute of Standards and Technology ) with other solutions, and I will introduce
playbooks and runbooks, talking about how they can be helpful during the reme-
diation actions. Eventually, I will explain the best practices while developing an
incident response plan, discussing how AWS suggestions may mitigate threats tar-
geting cloud-driven companies.
2.1 Cyber Security Incident Response Frameworks
Incident Response has always been a hot topic in Information Security. All the
devices are currently connected to the internet, from servers to personal laptops
and IoT, and all organizations must now develop an effective incident response
plan to avoid infrastructure compromise and ensure business continuity. In 2012,
NIST published the first crucial paper dealing with Incident Response and provid-
ing insights and framework companies should follow to handle security incidents:
Computer Security Incident Handling Guide (NIST 800-61). Even though the
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 15
publication is dated and was written when some technologies were mere imagina-
tion, it is still valid and used to mitigate the risks by implementing efficient response
actions.
A Computer Security Incident is defined as a "violation or an imminent threat
violation of computer security policies, acceptable use policies, or standard security
practices" by Paul Cichonski (NIST ) [9]. It involves different parties that interact
with each other:
1. An attacker : the malicious party that tries to trick the user(s) or exploits the
vulnerabilities of the infrastructure.
2. The infrastructure, which includes servers or other devices, is compromised by
the attack.
3. Optionally, one or more users that the malicious entity may deceive.
When security breaches occur, the business must be ready to respond rapidly and
competently. The company’s approach should be well-defined and organized, and
the incident response plan must be a roadmap on which the incident response capa-
bility is built. Therefore, the initial program designed by NIST included the goals of
the procedure, the incident response strategy with internal and external communi-
cation approaches, some metrics to evaluate the current process that is in place, and
guidelines for maturing the response capability. Incident response requires strategic
planning that must be discussed among different departments because they may
have an alternative perspective on the same argument. Furthermore, the tactics
should be reviewed annually to protect the business against innovative threats and
sustain its efficacy.
Incident response also needs several resources to implement security mechanisms
and measures, which may increase the organization’s expenses. As a result, the cor-
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 16
poration should evaluate which actions are essential for the operations and balance
the costs and the integration of new defensive techniques. An excess of implemented
technologies may result in huge prices, although the company is well-protected.
Hence, it’s imperative to design opportune procedures and protocols.
2.1.1 Sentinels of Security: Unveiling CSIRT Roles and Op-
erational Tools
The plan’s first step implies establishing a Computer Security Incident Response
Team, or CSIRT. This group will be organized with a team manager overseeing the
operations. This individual is vital to control the execution of the plan and ensure
that the team has the required personnel and knowledge. On the other hand, the
rest of the CSIRT must have the appropriate skills to manage the criticality of the
incidents, such as system and network administration, investigation, malware anal-
ysis, and problem-solving. Moreover, the CSIRT should have the proper technology
to detect, analyze, and mitigate the incident. As documented by the NIST publi-
cation 800-83, "Guide to Malware Incident Prevention and Handling for Desktops
and Laptops" [14], several security tools can be used for vulnerability and threat
mitigation:
• EDR and XDR: Endpoint Detection and Response (EDR) focused on mon-
itoring and responding to security threats at the endpoint level, providing
real-time visibility into endpoint activities, including processes, files, network
connections, and user behavior. Extended Detection and Response (XDR) ex-
pands the scope beyond endpoints to cover various security layers throughout
the infrastructure. XDR includes information from multiple sources, such as
Cloud and application services, to recognize more complex or multi-stage at-
tacks. Additionally, these technologies can directly act on the devices, such as
network isolation or file deletion, helping the overall responding process.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 17
• IPS: Intrusion and Prevention Systems can be categorized by Host-based IPS
and Network-based IPS. The first product stops suspicious activities by mon-
itoring a single host, while the second analyzes the organization’s internal
and external network traffic. Network-based IPS may detect many malicious
events, and the administrator can customize its rules with new malware sig-
natures.
• Firewall: This prevalent device filters the incoming and outgoing traffic of the
organization. They provide numerous rules to be modified, and, as NIST rec-
ommends, firewalls must be configured with deny-by-default rulesets. WAF
(Web Application Firewall) is now diffused across corporations and is designed
to protect web applications from assorted online threats.
• SIEM: Security Information and Event Management It is a comprehensive
solution that collects, correlates, and analyzes data from multiple security
sources. Logs and event data are aggregated in real time to identify patterns
and anomalies that can be correlated to potential security threats.
• Defensive Architecture: The infrastructure’s architecture can dramatically
reduce the impact of incidents. For instance, Sandboxing runs the applications
within a controlled environment, and Browser Separation allows the users to
access corporate applications using only a single brand of browser.
Although personnel and technologies are integral components of the response frame-
work, NIST introduces another topic that should not be underestimated.
2.1.2 Information Sharing Obligations in Incident Response
Another sensitive matter the NIST 800-61 publication explores is information shar-
ing with outside parties. As reported by the General Data Protection Regula-
tion art. 33, the data controller must notify the supervisory authority within 72
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 18
hours in the case of a data breach [15]. The communication should include the data
subjects and the number of records involved in the violation and describe the possi-
ble consequences of the loss. The CSIRT shall incorporate one or more individuals
from the legal department to establish policies regarding information sharing. Once
the team compositions and the tools are ready, the Incident Response plan can be
prepared, starting from the NIST Framework.
2.1.3 Incident Handling
NIST’s early Computer Security Incident Handling Guide has set the groundwork
for all the current Incident Response plans. Four main phases were part of the initial
structure.
Figure 2.1: The NIST Incident Response Life Cycle is divided into four stages:
Preparation, Detection and Analysis, Containment, Eradication, Recovery, and Post-
Incident Activity.
The first important step is Preparation, which establishes and trains the team
and sets up the required resources. Usually, the security department performs risk
assessments to select the most critical assets and, lately, implements preventive con-
trols to avoid incidents. Since the residual risk is inevitable, the Detection and
Analysis will continuously seek threat traces, allowing for a prompt response. The
security team receives alerts from the monitoring tools, and it can contain the im-
pact of the incident.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 19
Containment, Eradication, and Recovery limit the damage, remove the Indica-
tors of Compromise (IoCs), and let the corporation return to business by restoring
the unaffected version of the environment. The CSIRT should provide some di-
rectives to circumvent similar attacks in the future, and new controls should be
added to the current infrastructure. The Post-Incident Activity is helpful to
verify that all IoCs have correctly been removed and how future malicious events
can be avoided. The CSIRT prepares a report that focuses on the causes of the
violation and analyzes the team’s behavior. However, although the NIST procedure
is straightforward and apparently without gaps, it can be improved by expanding
these four stages and offering more detailed instructions.
2.1.4 Elevating Incident Response: The Comprehensive Six-
Phase SANS Framework
Expanding the NIST strategy leads to a second comprehensive incident response
framework, commonly known as SANS. The SANS Institute Incident Handler’s
Handbook by Patrick Kral [16] is recalled by many cyber security experts. One of
them is the Incident Response leader at Cynet Security, Asaf Perlman [17], who
explains the SANS’s six steps for a complete modern IR approach: Preparation,
Identification, Containment, Eradication, Recovery, and Lessons Learned.
Incidents are a matter of when, not if, a violation will happen.
Figure 2.2: SANS Incident Response consists of more steps that CSIRTs should
follow in sequence, as each is built on top of the other.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 20
Preparation
Preparation and Identification are the steady phases since the organization is always
preparing for a security incident. The team must be ready to handle an incident that
can vary from anything, like a blackout, to something extreme, such as ransomware.
SANS Incident Handler’s Handbook lists some key elements that must be applied
to the infrastructure to mitigate the risks:
1. A well-defined security policy is the business’s primary goal because it com-
prises a set of rules and principles that define how to operate within the or-
ganization. Without a clear policy, employees are free to behave according to
their inclinations, potentially making the company vulnerable to the outcomes
of their conduct.
2. Access control systems should be in place to ensure the Need-to-know secu-
rity principle. Microsoft expert Andreas Wolter mentions that users should
access only the information they need for their job functions [18]. There are
multiple ways to implement this rule, and the business must guarantee that
the CSIRT has the proper permission to manage potential incidents.
3. Monitoring tools are critical resources that must be available for CSIRT. All
the helpful technologies must be accessible before, during, and after the ma-
licious event. They should be able to analyze the data sources because they
are crucial for detection and investigation.
Figure 2.3: The ease of investigation is directly proportional to the number of sources
at hand, and saving data from logging will add costs to the incident response process.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 21
4. Since human error continues to be a contributing factor for security breaches
and the average annual cost of incidents caused by human error is $3.36 mil-
lion, as reported by the study of Katherine Amoresano and Benjamin Yankson
[19], periodic training is fundamental to reducing the likelihood of an inci-
dent. Higher educational institutions must invest enough resources and time
in educating their employees. The incident response team should be quali-
fied to manage the alerts and potential infrastructure violations. At the same
time, the rest of the organization must be informed of new security trends and
attackers’ techniques, such as phishing.
5. Escalation and the point of contact in other departments should be predefined.
A clearly stated communication plan leads to a faster response, and the CSIRT
is unlikely to contact the wrong people who may not know how to solve the
issues.
6. Last but not least, all the systems must be backed up to facilitate restoring
the production environment after an incident happens. The company risks a
massive loss without a backup strategy since it should rebuild the infrastruc-
ture from scratch. Indeed, security incident response that is not supported by
proper backups may imply the end of the business in the worst-case scenario.
This thesis explores AWS, which provides multiple tools that can help companies
with the preparation phase.
Identification
In this stage, monitoring tools and the security team analyze unexpected and anoma-
lous behavior to detect potential incidents within the infrastructure. Identification
then depends on the services that should be protected and the available tools have
been deployed during the preparation phase.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 22
Logs are collected from multiple sources, such as IPS, firewalls, cloud technologies,
and so on, to determine possible deviations from the expected operations of the
users or the services. Monitoring tools must report events recognized as incidents
as soon as possible using suitable notification systems, and the communications
must include all the required information to make the investigation easier. The re-
sponding team should be able to reconstruct the malicious actions that have been
performed considering the questions: Who, What, When, Where, Why, and How.
Suitable identification mechanisms may detect internal and external irregular events
by minimizing alert fatigue. Indeed, frequent alerts about threats can desensitize
cybersecurity experts to accurate warnings, as written by E. Segal [20].
Figure 2.4: The organization should find the perfect balance between detecting too
much or too little, considering all the significant events, and avoiding alert fatigue.
The Identification step can provide indispensable insights into the impacted re-
sources and the bad actor’s initial attack vector. Once the incident has been de-
tected, the CSIRT must create an adequate communication channel and immediately
proceed with the Containment. The team shouldn’t focus on the investigation at
this stage to avoid the spread of the incident that may affect other business assets.
Containment
The goal of the Containment is to minimize the damage of the incident. Different
stakeholders can approach this step with conflicting ideas: for instance, the secu-
rity perspective suggests isolating all the compromised machines immediately, while
the business point of view prefers to keep the device reachable and running, some-
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 23
times without fixing the vulnerabilities because the services of the company cannot
be paused or terminated. All the infected devices must not be turned off
because they may contain vital information for the investigation and response activ-
ities. Furthermore, CSIRT must prioritize the critical assets, figuring out whether
they have already been affected or are at risk.
The SANS Incident Handlers Handbook differentiates Short-term Containment, which
is not a permanent solution to the problem but the first operation to limit it, and
Long-term Containment, which allows fixing momentarily the system to be used in
production. I didn’t consider this separation while developing the plans since com-
panies would need to uproot the problem as soon as possible because recent malware
or infections can spread rapidly and without any control. Potential compromised
accounts or backdoors should be removed instantly to block any further action of
the attackers. Once the Containment is concluded, CSIRT should have restricted
all bad actors’ activities, and the business can move on with the Eradication.
Eradication
Eradication includes the investigation to understand deeply the incident and the
removal of all the IoCs from the environment. Inspecting the events related to
the bad actor’s activities may determine the initial attack vector and any possible
attempt the attacker made to maintain access to the infrastructure. During this
process, CSIRT must ensure that all the proper actions were taken to delete all
the malicious contents from the affected systems, and they have to figure out the
overall impact on the business. Investigation can only start if the incident has been
contained, and suitable tools for this step should be in place, such as agents on the
endpoints that collect a lot of artifacts without the team needing forensic copies
of the devices. Obviously, the logs and information examined may vary depending
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 24
on the compromised technologies. The digital forensic incident response makes the
process faster because responders are not forced to go personally to the location of
the incident.
The final result of Eradication’s first part should be documented to catalog all the
Indicators of Compromise and how it was possible to find them. CSIRT should
restore any system that may contain malware, preventing reinfection. However,
the team can’t directly remove all data from the machines in light of a possible
violation of the organization’s policies. Incident documentation may help to fix
any vulnerability by adding new security measures to improve the defenses, while
network and appliance scans help to detect remaining anomalies. Before proceeding
with the Recovery, it’s essential to guarantee that the incident’s root cause has been
eradicated and no IoCs are still present within the infrastructure.
Recovery
The goal of Recovery is to return to business by bringing all the target apparatus
back to the production environment. Nevertheless, all the stakeholders should pay
attention because, without the required precautions, a too-early recovery may lead
to another incident. The systems must be tested, monitored, and validated, and
when possible, the affected software must be destroyed and substituted with a secure
version from the backup. As mentioned in the Preparation section, backups must
be performed before the incident happens because, without them, the worst-case
scenario may lead to the end of the business. CSIRT should agree on the methods
and the time to restore the operations and new monitoring tools that can detect
unexpected behavior of the recovered appliances. The primary purpose is to avoid
another security incident from happening due to the same or similar vulnerabilities.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 25
Lessons Learned
The last incident response phase is Lessons Learned, a review of all the response
procedures. CSIRT should have documented all the steps to provide a complete
overview of the events and the activities and to answer the usual questions Who,
What, When, Where, Why, and How. Each step of the responding process will be
analyzed to determine the initial vulnerabilities that led to the incident and what
can be improved to react faster and more effectively, reducing the overall damage.
Lessons learned should focus on evaluating the team’s performance and drafting
clear documentation that will also be useful in the future as training material. The
review is carried out within the team by organizing multiple post-mortem meetings
that will finalize the response process. For example, CSIRT can examine some of
the following questions:
1. "How long was the incident detected after the initial compromise? Do we need
better tools for Identification? ". The team will assess the monitoring and
detection mechanisms to understand if and how they can be improved.
2. "How long did Containment take and why? " Containment is vital to minimize
the loss of the business, and it should be as quick as possible.
3. "After Eradication, did sign of malware or compromise still appear? ". CSIRT
can study the effectiveness of the Investigation and IoCs removal and get the
hang of new techniques to enhance the capabilities.
4. "Were the affected resources correctly restored? ". The team will review the
Recovery phase by considering better backup solutions if necessary.
Sharing ideas is highly beneficial for the security department and the whole com-
pany, and Lessons Learned can really help against future incidents.
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 26
In conclusion, incident response is a complex set of operations requiring management
and as many data sources as possible to make the tasks and the investigation easier.
Indeed, modern approaches expect the integration of playbooks and runbooks, which
allow CSIRTs to follow specific guidelines depending on the scenarios and the role
of the individuals dealing with the incidents.
2.1.5 Incident Response Playbooks and Runbooks
As reported by S. Gatlan [21], playbooks and runbooks can standardize the response
process to lower the impact of an incident in both the private and public sectors.
Security playbooks and runbooks are so important for the organizations that they
were also included in The White House Executive Order to improve the US Cyber-
security (section 6) signed by President Biden [22]. They can guarantee a common
understanding of security incidents and define the instructions to identify, remedi-
ate, and recover from threats and exploitations targeting the systems.
In general, playbooks and runbooks are comprehensive documents that outline pro-
cedures with the same goal of responding to cybersecurity incidents within an or-
ganization. They can be deployed facing multiple use cases impacting the business,
making it easier to test out realistic attack simulations to verify whether the team’s
approach may work. The below sections explain their main differences, pointed out
by the AWS senior security strategists Nathan Case and Paul Hawkins during the
2019 Amazon Web Services re:invent conference [23].
Security Playbook
The Incident Response Playbook provides an overview of situational responses and
planning for the future. They’re critical for the business because Playbooks include
non-technical people by reporting C-level or UP-level information and can be seen
2.1 CYBER SECURITY INCIDENT RESPONSE FRAMEWORKS 27
as RACI (Responsible, Accountable, Consulted, and Informed to set the roles and
responsibilities during the IR procedures. A Playbook can align all stakeholders
on the same process, and they can generally understand how the IR team is treat-
ing the incident and which countermeasures it’s been applying. Avi Shaked, Yulia
Cherdantseva, Pete Burnap, and Peter Maynard point out the importance of cyber
security playbooks because they can establish an effective response capability within
the organization. They explain that the main issue of playbooks can be addressed
as the lack of communication ability [24]. Hence, the implementation section also
aims to cover all the problems highlighted by the journal.
Security Runbook
Similarly to playbook, Incident Response Runbook provides a tactical review of a
situation but includes technical information to correct and investigate an incident.
Runbooks contain strategic planning for the future, and they may be helpful for the
individuals of the CSIRT: they follow specific steps for security scenarios to enact
desired outcomes. Usually, Runbooks enclose a contact list for every situation and
can also be part of testing when a new architecture is developed. For instance, ev-
ery change to the infrastructure may lead to further vulnerabilities that should be
examined while writing an incident response plan.
To summarize, new incident response designs are required to incorporate playbooks
for the high-ranking executives of the organization and runbooks for the CSIRT
that is in charge of managing the actual response activities. After introducing the
standard incident response frameworks, I will focus on the cloud infrastructure,
considering the AWS best practices that can be put in place to defend against the
most severe threats.
2.2 AWS INCIDENT RESPONSE: BEST PRACTICES 28
2.2 AWS Incident Response: Best Practices
Deploying an operating and reliable cloud architecture to safeguard business data
and appliances is crucial. AWS offers frameworks and services to improve the secu-
rity posture and reduce the risks of customers’ platforms. In particular, the AWS
Well-Architected Framework helps organizations to create efficient and cost-
effective multi-account environments that consider all the vulnerabilities exposed
to cloud threats [25]. To be prepared against security events, the AWS framework
security pillar suggests multiple architecture design principles while launching the
cloud environment:
• POLP : The principle of Least Privilege restricts the interactions between users
and resources according to their needs. Separation of duties allows to manage
the authorizations to specific AWS assets.
• Infrastructure Monitoring : Controls and alerts qualify customers to trace real-
time environmental changes.
• All Layers Security : Every AWS instance (Virtual Private Cloud, load bal-
ancers, network) is monitored without exception. AWS native tools provide
360 degrees of monitoring of the infrastructure.
• Security Automation : Security countermeasures automation is critical for scal-
ing rapidly and cost-effectively. Customers with automated techniques in place
can also respond quickly to security events that usually require a manual in-
vestigation.
• Data Encryption : Encryption in transit and at rest enables data protection
for confidentiality and integrity.
• Restricted Data Direct Access : AWS tools enable reducing manual information
access or processing, reducing the risk of human errors.
2.2 AWS INCIDENT RESPONSE: BEST PRACTICES 29
• Incident Management : Preparation for security incidents is one of the AWS
best practices since it is strongly recommended to have incident management
and investigation policies in place. Detection, analysis, and recovery proce-
dures can improve by simulating security events and trying to stop the attacks.
Moreover, customers must get the full picture of the already mentioned AWS Shared
Responsibility Model, which shows Amazon’s limitation in protecting the infrastruc-
ture. Although users may assume that their services are 100% protected once they
subscribe to the cloud, they must deploy some solutions to safeguard the information.
AWS is only responsible for the infrastructure that permits services to run, while
customers have to deal with the management and configurations of the instances
and secure the platforms and applications. Understanding the design principles
helps to identify the primary defensive objectives for customers deploying services
in the cloud. Clients can prepare the infrastructure for any inconvenience only with
a precise idea of these best practices.
2.2.1 AWS Incident Management
Amazon Web Services’ best practices involve an incident response plan, even though
the organization purchased all the most innovative and expensive preventive controls.
AWS model promotes a few goals that the incident response plan should design.
Education must be considered a priority. The CSIRT should be able to supervise
a wide range of incident scenarios. They must have developed programming and
investigation skills to lead the response process and recover from the compromise.
Each specialist should have advanced craftworks in AWS cloud technologies and
a complete knowledge of the services and applications running in the cloud. For
instance, they may quickly identify the incident’s root cause, whether they know
which service released the logs, which is their meaning, and which is the traffic flow
of the applications.
2.2 AWS INCIDENT RESPONSE: BEST PRACTICES 30
The team can acquire the required skills by running simulations of the incidents.
Simulate is a critical part of the strategy since it spawns unexpected security events
to test the preparation of the personnel. These practices may vary from Blue Team
to Red Team exercises with a squad defending and the other attacking, and de-
mand specific infrastructures to be deployed before the tests because the business
can’t risk compromising the production environment. Last but not least, one of
the key AWS stages is preparation : CSIRT must have access to the appropriate
tools to manage the incident and restore the previous unaltered versions of the sys-
tems. While preparing the architecture, the security department should assess the
assets and identify the key personnel and resources that may be helpful during the
response. Amazon Incident Response team suggests implementing playbooks and
runbooks that bring several benefits by establishing transparent workflows to answer
promptly.
The Playbook Framework underlined in the next section includes the guidelines for
selecting the CSIRT individuals, roles, and responsibilities and explaining how to
communicate during an incident without disclosing sensitive information. It taught
the team how to classify the severity of incidents and how to prioritize the assets.
AWS Playbook Framework
Every customer utilizing AWS Cloud services can take advantage of the Playbook
Development Guide that addresses multiple threat scenarios [26]. The framework
is available in Github’s repository, and security teams can create and integrate
Playbooks for diverse attacks. Every playbook’s title must encapsulate the scope of
the document, emphasizing which use case it is covering and including an author
and an approver assigned to the project. The playbook should be categorized with
the date and the version number to monitor all the changes.
2.2 AWS INCIDENT RESPONSE: BEST PRACTICES 31
After an Executive Summary that summarizes its goals, AWS suggests building a
clear-cut structure by analyzing the following points:
• Threat and Response Steps : The author should focus on the threat’s descrip-
tion and the potential impact on the current infrastructure. Afterward, the
writer can define step-by-step guidelines that will be used to respond chrono-
logically to the security events.
• Incident Classification and Tools : Similar events will be classified with a crit-
icality for the business depending on the compromised assets and the exposed
sensitive information. Each playbook must list all the tools to detect, contain,
investigate, and recover from the incident. AWS provides several native cloud
technologies for this purpose.
• Incident Handling : This stage can drastically change depending on the ana-
lyzed threat and the selected framework. Incident handling starts from the
Preparation and concludes with the Lessons Learned during the post-mortem
activity. The structure of the playbook must facilitate CSIRT’s arrangement
and planning.
Security playbooks will be used only for general scenarios, enabling communication
between C-level stakeholders of the organization. I redesigned the structure of the
security Playbooks to create security runbooks to drive the technical responders
against significant threats. Combining the information collected from the SANS
and AWS frameworks, it is now possible to comprehend the successive chapters and
the reasons for the implementation choices. Incident response is a broad field; every
company must have plans to prepare and defend against future criticalities.
"If anything can go wrong, it will" (Murphy’s Law, 1949).
3 Methodology
This chapter is dedicated to project design, focusing on the scope, milestones, and
deadlines. It displays how to properly organize the tasks and the necessary stages
to design an incident response strategy for specific scenarios. Bynder had already
created an incident response plan based on the NIST publications before the begin-
ning of my internship. On the other hand, playbooks and runbooks were necessary
to cover all the critical scenarios of potential threats targeting the organization’s
cloud services.
3.1 Project Planning and Design
To reach the desired outcomes, I adhered to the established Information Security
team’s methodology, named Agile. This approach emphasizes flexibility and collab-
oration by breaking the project into smaller iterations called sprints, which usually
last only two weeks. Agile was perfect for collecting feedback from the team and
adjusting the plan accordingly. The described methodology allowed me to divide
the project into several activities outlined by the Milestones of this section. The
quantitative analysis of Agile, carried out by Pedro Serrador and Jeffrey K. Pinto,
demonstrated that this methodology is one of the best because it usually brings
success [27].
3.1 PROJECT PLANNING AND DESIGN 33
3.1.1 Scope
The project’s scope has been discussed with the information security team before
the plan of action.
Must Have
1) Playbooks provide a general overview of the incident
response procedures and ensure all the stakeholders are
aligned on the overall strategy.
2) Runbooks provide detailed guidelines to respond
to specific incidents and ensure CSIRT preparation for
threats.
Nice to Have Simulations may identify potential issues with the IR
procedures and help the CSIRT improve its incident-
handling skills.
Not in Scope 1) A complete Risk Assessment wasn’t required to
identify the IR use cases.
2) Monitoring tools were already in place, and it
wasn’t necessary to analyze new technologies.
Table 3.1: The table highlights the "must have" that clearly describes the require-
ments, the "nice to have" that denotes functional tasks to improve the quality, and
"not in scope", which includes the activities that are not needed to complete the
program.
3.1.2 Project Planning
Project planning included the definition of some milestones that must have been
fulfilled before the imposed deadlines. A project owner was responsible for super-
vising each phase, and the stakeholders must have been informed about the progress.
3.1 PROJECT PLANNING AND DESIGN 34
The milestones were matched with strict deadlines and a status that could be Not
Started, In Progress, and Done. The plan should have been manually updated to
report all the relevant exercises, and eventually, a Gantt chart assisted with the
project planning and management, providing an overview and scheduling of all the
tasks.
The initial project scheme was designed considering the Problem Statement and
the SANS Incident Response framework composed of the six steps for a complete
IR procedure. NIST recommends that information technology programs within
organizations should be first generally prepared for any incident before moving on
and reflecting on typical attack vectors. Since Bynder’s general plan was already in
place, I could focus directly on the use cases.
Milestone Owner Deadline Status
IR Project Definition @InfoSec 1 March 2023 Done
Risk Assessment @Samuele Gandini 15 March 2023 Done
IR Scenarios Definition @Samuele Gandini 1 April 2023 Done
Playbooks @Samuele Gandini 1 June 2023 Done
Runbooks @Samuele Gandini 15 June 2023 Done
IR Simulations @Devops 1 July 2023 In progress
IR Finalization @InfoSec 15 July 2023 Done
IR Review @InfoSec 1 August 2023 In progress
Table 3.2: The table incorporates all the steps from the infrastructure analysis to
the finalization of the project.
3.1 PROJECT PLANNING AND DESIGN 35
Each stage included multiple sub-tasks to satisfy the prerequisites of the Agile
methodology. The assignments were expressed as follows:
1. Incident Response Project Definition
• Define the scope and the steps : The first part of the project includes only
the definition of its scope and steps to match the Agile methodology.The
owner was the Information Security team because we scheduled multiple
meetings to align on different topics.
2. Risk Assessment
• Identify the critical assets : Risk assessments always start with the in-
frastructure analysis to determine which assets are more at risk. The
evaluation focused on the cloud inventory and the likelihood of its viola-
tion. The result pointed out the most important systems for the business
and enabled mapping out the attack vectors.
• Gathering information about the available tools for incident response :
While studying the company, I had to figure out the known mechanisms
that may be useful to respond to any potential attacks. I collected in-
formation through Bynder’s internal documentation and platforms, espe-
cially the AWS management console.
3. Incident Response Scenario Definition
• Identifying the principal threats : Once the critical assets have been de-
tected, I researched the main threats that may jeopardize cloud technolo-
gies.
• Mapping out the attack vectors : This essential stage converted the mul-
tiple external and internal attack vectors into use case scenarios covered
in the procedures.
3.1 PROJECT PLANNING AND DESIGN 36
4. Playbooks
• Creating Playbooks : These procedures provide a general overview of the
scenarios by selecting a category of attack vectors. Thanks to the Play-
books, all the stakeholders are aligned on the overall strategies.
• Writing Playbooks’ steps : Playbooks should be uploaded among the com-
pany’s internal procedures, such as the internal wiki, and they are struc-
tured following the six stages of the SANS framework (Preparation, Iden-
tification, Containment, Eradication, Recovery, and Lesson Learned).
Every playbook includes an Overview justifying why the threat class was
chosen.
5. Runbooks
• Runbooks creation : The use cases are converted into runbooks, the tech-
nical guidelines for the Computer Security Incident Response Team. Its
members use their skills and the available tools to contain the events and
restore the previous situation.
• Writing Runbooks’ steps : As for the Playbooks, Runbooks provide a
whole procedure involving Preparation, Identification, Containment, Erad-
ication, Recovery, and Lessons Learned. The post-mortem activities are
similar for most scenarios, although they were repeated in each runbook
to build comprehensive directives for the team. The initial summary
briefly clarifies the potential impacted resources and other information
(insider threat, endpoint, credentials breach, malware, etc.).
6. Incident Response Simulations
• Researching for testing strategies and tools : The security division must
test the methods’ effectiveness by running simulations. The assessment
3.1 PROJECT PLANNING AND DESIGN 37
strategy must not condition the production branch; it may consist of
virtual environments, for example. Furthermore, they must look for sim-
ulation tools to speed up the process.
• Run incident simulations : The Information Security unit can launch spe-
cific simulations to analyze CSIRT behavior actively. Hence, InfoSec can
collect data about their activities and the skills that may be valuable
during an attack.
• Evaluate and upgrade the runbooks : Data collected from the previous
step can be used to verify the processes’ effectiveness and efficiency, and
InfoSec can highlight any issue or gap to improve their quality.
7. Incident Response Finalization
• Assessing the outcomes : The outcomes are shown during an incident re-
sponse review meeting with Information Security and other individuals
from IT, DevOps, Legal, and AWS Customer Support groups. The pre-
sentation includes final remarks and some opportunities for growth by
promoting new detection mechanisms and simulation tools.
8. Incident Response Review
• Review the current scenario: Information Security must periodically ver-
ify that the created scenarios are valid, considering the business infras-
tructure and potential new threats.
• Evaluate new playbooks and runbooks : InfoSec should create new Incident
Response use cases to cover as many security events as possible. Tem-
plates for playbooks and runbooks may be prepared for future purposes.
3.1 PROJECT PLANNING AND DESIGN 38
Figure 3.1: The Gantt chart helped visualize the project’s various periods, start-
ing from the beginning of February 2023 and ending six months later. Playbooks,
runbooks, and incident response simulations had overlaps because the process was
iterative, and one phase depended on the other.
Methodology gave input for the planning technique for defining incident response
plan scenarios. The subsequent chapters focus on the incident scenario "Ransomware
in AWS", showing how to conceive the stages from risk assessment to incident
response finalization.
4 Incident Response Development
This chapter shows the preparatory steps required to write the playbooks and run-
books for the incident response plan. At the basis of the incident response strategy,
we find the study of the current infrastructure, the risk assessment to recognize the
critical assets and the most dangerous threats, and the selection of the tools to in-
tercept and remediate incidents. This section describes the weaknesses of the target
systems, leading to an evaluation that was required to craft the use case scenarios.
Even though Bynder’s cloud architecture was the target of this reflection, sensi-
tive information will not be included, maintaining a general approach to incident
response planning.
4.1 Studying the company’s infrastructure
Exploring the architecture of a new company can be challenging. It would be best to
familiarize yourself with the department, operations, and technologies before eval-
uating the vulnerabilities and threats for specific systems. Hence, risk assessment
is a complex process that demands a comprehensive understanding of the business,
research, and strategic arrangement.
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 40
4.1.1 Risk Assessment: An In-Depth Exploration of Stages
and Prerequisites
According to the NIST Guide for Conducting Risk Assessment (Publication 800-
30), this methodology informs decision-makers about identified threats to the orga-
nization, internal and external vulnerabilities, the impact of threats exploiting the
detected exposures, and the likelihood of the harm occurring [28]. Crossing these
elements, we can obtain a crucial risk value (impact X likelihood ) that helps deter-
mine which security measures we can apply to lower either or both the impact and
the probability. The process is classified into three main steps that produce a final
document containing the outcomes:
1. Assessment Preparation : The first stage focuses on identifying the scope of
the assessment and choosing a proper risk model. Some related activities may
be the definition of ways the risk can be accepted or rejected, how to en-
able information sharing, or highlighting any question relevant to the purpose
of the evaluation. Before moving on, the security team must have declared
assumptions and constraints for the assessment.
2. Conducting the Assessment : This phase assigns a risk level to the organiza-
tion’s assets. It starts by enumerating the company’s critical assets of the
selected landscape. Considering the scope of the incident response project, I
prioritized the cloud services because they could lead to the most severe eco-
nomic damage if violated.
Once we get a complete inventory, it is possible to proceed as follows:
• Threat sources evaluation : Each asset identified during the previous step
may be targeted by many threats. The output of this stage includes
a list of menaces that must be prioritized depending on their impact,
intentions, and capabilities.
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 41
• Vulnerabilities identification : The vulnerability assessment allows the or-
ganization to understand whether there are weaknesses in the infrastruc-
ture that attackers can exploit. A single threat event can target multi-
ple vulnerabilities, constantly increasing in growing corporations, such as
Bynder, due to more extensive information systems and processes.
• Determine Likelihood and Impact : We can determine the likelihood and
impact of an event from the threat attributes and the company conditions.
The overall probability is calculated by combining the occurrence of the
event and the possibility of adverse effects caused to the systems. An
impact value should also be estimated by evaluating the potential harm
to assets, operations, and individuals.
• Risk value : The last step resolves the risk to the organization by taking
into account the impact and likelihood values. The risk may help iden-
tify which security measures can be applied to the prioritized assets to
minimize either the impact or the likelihood of the malicious events.
Figure 4.1: The risk matrix visualizes the final value from likelihood and impact.
The 5x5 matrix provides more insights into levels of severity by assuming a more
granular approach [29].
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 42
3. Sharing the results : Last but not least, the security department must design a
communication plan for concluding the risk assessment. The sharing method
may vary depending on the designated stakeholders and the organizational
policies.
The risk assessment NIST approach was applied to Bynder, as highlighted in the
following subsections. They generally describe the main assets and threats for the
cloud ecosystem, justifying the decision of the project’s use cases.
4.1.2 Risk Assessment: Critical Assets
For a business-to-business corporation providing cloud-based services through AWS,
the priority is for the cloud assets during an extensive environmental analysis. AWS
offers plenty of services for its customers to develop new applications. Creating a
web app is easy and fast, from making it reachable to the public to ensuring security
and reliability. As reported by Richter, Amazon Web Services is maintaining the
lead with 32% of the Cloud market [30] by providing a pay-as-you-go service that
allows customers to be billed according to the resources they are utilizing and the
duration they are using them.
Bynder relies on AWS and uses some of these services. Even though I cannot
be too specific about their usage, I can list some of the most diffused Amazon
resources. The assessment of Amazon Web Services technologies has been performed
by following the approach of the paper "Automated Security Assessments of Amazon
Web Services Environments", which suggests a complex solution to automate the
discovery of security gaps. Although this thesis does not explore its method, the
research mentions some of the most relevant Amazon solutions, giving insights for
identifying assets and threats [31].
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 43
• Amazon EC2: Elastic Cloud Compute is one of the oldest tools that provides
secure computing for applications. Customers can launch virtual computer
instances and configure the operating system, CPU, memory, storage, and
network.
• Amazon ECS: Elastic Container Service helps to build and manage apps with
container orchestration rapidly. Once the Docker image is ready, customers
can upload it to an ECS repository, setting the number of compute nodes.
• Amazon S3: Simple Storage Service can store content for the business,
maintaining its availability and offering security and scalability. The objects
are held into S3 Buckets, globally accessible and managed through Amazon
CloudFront, a content delivery network that helps deliver static content at
high transfer speed.
• Amazon Route 53: It serves as DNS, connecting the registered domain
names with the servers’ IP addresses. This service lets users request AWS
resources, such as S3 Buckets content.
• Amazon RDS: Relational Database Service is an easy-to-use tool that en-
ables setting up relational databases in the cloud. AWS provides all the ad-
ministration support needed by databases, automating some tasks, such as
patching and backup.
• Amazon Lambda: Serverless service that permits running code in the cloud
without managing machines. Users can write and upload the code to a Lambda
instance, creating a function.
• Application Load Balancer: AWS’s crucial service that permits distribut-
ing the traffic across instances or containers. It can decide to add a machine
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 44
whether the traffic is overwhelming the resources, monitoring the availability
of the applications.
• Amazon VPC: Virtual Private Cloud is one of the vital security services
that contains the running AWS resources and controls how external appli-
cations access and retrieve information. VPC enables data protection with
firewalls surrounding the infrastructure and maintaining the separation from
other people’s assets.
• Amazon IAM: Identity and Access Management helps to define roles and
permissions, assigning them to AWS users and groups and monitoring access
to AWS services. It is one of the central security tools since it manages who
can access what.
• Amazon CloudWatch: One of the cores monitoring AWS services that con-
trol the status of the applications. ClowdWatch collects logs and data events
and studies how resource utilization can be optimized, improving the perfor-
mance of the services.
• Amazon CloudTrail: It allows monitoring users, roles, and AWS services
actions to enable risk auditing, governance, and compliance. This tool can
scan who is doing what on the applications.
Within the B2B cloud landscape, S3 and RDS jointly compose the business’s core
infrastructure, offering scalable storage for data and digital assets and resilient
databases vital for cloud-based applications. Other AWS tools may be relevant
depending on the scope of the enterprise. Some will be introduced in the following
sections, talking about incident response, while others will not be mentioned since
they are not significant.
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 45
4.1.3 Risk Assessment: Threats Analysis
After identifying the main assets, I did some research to outline the main threats
and the vulnerabilities within the cloud, exploited to compromise the confidential-
ity, integrity, or availability of services and data. Serdal Kepil proves that the main
security issues of AWS technologies derive from a lack of configurations or creden-
tial theft and identifies multiple groups of dangers that should be evaluated while
guarding the environment [32]:
• S3 buckets misconfiguration: S3 Buckets should be adequately configured
when initiated in AWS. Attackers can exploit this vulnerability and compro-
mise the content if unrestricted access is permitted to all users. The buckets
can be targeted by ransomware when an attacker has read and write per-
missions; all the files can be encrypted, with a document left for extortion.
S3 buckets may also cause severe data leaks because malicious actors might
send undetected requests to access the objects within the storage service. If S3
access logs are not enabled, the attempts may remain hidden. Chickowski re-
ports Amazon S3’s worst breaches, showing that 7% of the servers are publicly
accessible, while 35% result unencrypted [33]. It is relatively easy to access a
public S3 Bucket (some websites allow researching for them, such as Grayhat
Warfare), so it is essential to care about their configuration.
• Malevolent AWS API request: Malicious code can be injected into one of
the numerous AWS APIs available, leading to Distributed Denial of Service
or SQL injections. APIs must be supported with adequate encryption options
and monitoring of Amazon CloudTrail, which can notify any unexpected API
call.
• Unfiltered traffic from untrusted sources: Network protection is a crit-
ical component of the AWS infrastructure. If the deployed instances can be
4.1 STUDYING THE COMPANY’S INFRASTRUCTURE 46
accessed under any condition and there are no rules to deny traffic, DDoS and
other attacks can be more easily performed against the AWS resources.
• IAM improper permission and privileges: Amazon IAM wrongly set
up permissions may allow unauthorized users to access sensitive information.
All users’ privileges must be reviewed periodically to avoid any weaknesses in
security and compliance.
• Credential theft: Several violations of cloud resources are possible thanks to
credential theft attacks through phishing or other techniques. Some companies
decide not to follow AWS best practices and do not enable multi-factor authen-
tication or do not implement monitoring solutions, leaving the environment
unprotected.
• RDS misconfigurations: Other critical resources that can be misconfigured
and subject to attacks are Amazon RDS. Despite databases containing sen-
sitive data, AWS is not responsible for securing the information, and there
are multiple ways for attackers to exploit vulnerabilities in their settings. Ac-
cording to the Product Management at Eureka Security, exposed resources,
weak encryption, and improper authentication may be fatal for the business,
especially when combined with an inadequate retention period [34]. In the
potential scenario of ransomware encrypting RDS, it becomes complicated
to recover data without a proper backup policy.
In conclusion, considering these top menaces for AWS cloud resources, it is evident
that many risks surround Amazon S3 and Amazon RDS, and they must be pri-
oritized while designing the incident response strategy. Therefore, the first threat
scenarios must be linked to these services, focusing on the most disruptive attack in
circulation: ransomware.
4.2 INCIDENT RESPONSE SCENARIO 47
4.2 Incident Response Scenario
Thanks to the risk evaluation, it was easy to prioritize some use cases rather than
others. Referring to Bynder, the decision involved the entire Information Security
branch, which received further recommendations from the Amazon Customer Sup-
port team. In the end, Ransomware in AWS was selected as the main scenario.
The program converted the urgency of ransomware targeting Amazon Web Services
resources into response actions. Consequently, it was possible to elaborate a play-
book, including general information to explain how ransomware acts and how to
deal with it, and runbooks, examining different incident response options depending
on impacted technologies.
4.2.1 Designing a Ransomware Incident Response Plan
As we know from Chapter 1, ransomware critically impacts business. It may
target AWS architectures, causing a disruptive effect on the operations and com-
promising sensitive information. Proper security tools and procedures should be
ready for a strategic and adaptive response. Some AWS services can store data
and assets, and Crypto-Ransomware may exploit weaknesses to act and block the
systems. Each ransomware use case demands a nuanced response, and while the
playbook provides the strategic framework, the runbooks translate these strategies
into actionable tasks. I finally came up with one playbook and two diverse runbooks
addressing the Amazon Web service use cases for Amazon S3 and RDS, which store
information and let the company run its cloud-based services:
• Ransomware in AWS Security Playbook: The playbook provides an
overview of the Incident Response plan. All the stakeholders can understand
what would happen in case of a ransomware attack compromise and which
high-level process is carried out by the operatives.
4.2 INCIDENT RESPONSE SCENARIO 48
• Ransomware Response for AWS S3 (Simple Storage Service) Run-
book: S3 buckets are exposed to ransomware, especially when not correctly
configured and monitored by Amazon native tools. In the third episode of AWS
The Safe Room centered on incident response, the Amazon Web Services Se-
curity Consultant, Jason Hurst, presents how easy it is to delete all the data
of an S3 bucket configured with improper permissions, leaving a simple text
file containing a request for the payment [35]. This runbook is critical to limit
the impact of an S3 bucket violation and includes the tech-level information
to identify, contain, investigate, and recover malicious activities.
• Ransomware Response for AWS RDS (Relational Database Service)
Runbook: As well as S3, RDS services can be vulnerable to ransomware.
Database information can be encrypted or removed by potential attackers,
and only a well-structured incident response plan can address the situation by
solving the issue. The runbook contains the best practices for Amazon RDS
to prevent Ransomware attacks and the technical guidelines for returning to
business.
The described runbooks were preferred to the ransom response to Amazon EC2.
Even though it is one of the primary services delivered by AWS, the overall risk
remains with medium severity, leaving exclusive precedence for S3 and RDS. Ac-
cordingly, this thesis will not cover the EC2 strategy. The implementation of the
step-by-step procedures will be discussed in Chapter 5, after a brief presentation of
the tools that assist incident response.
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 49
4.3 Available Tools for Incident Response
A fundamental preliminary step of the project was to understand which tools were
available. Each new technology can only be activated and configured with a proper
evaluation of its capabilities, effectiveness, and a consideration of the expenses. This
paragraph introduces vital solutions for ransomware opposition, illustrating what
cannot be missing within the incident response arsenal of the organization.
Extended Detection and Response
XDR is already mentioned in Chapter 2 as an advanced version of EDR, and it can
recognize malicious behaviors with the correlation of logs from different sources. The
choice of using an XDR solution for an IT company is related to its capabilities of
giving in-depth looks at the data in networks, endpoints, clouds, and applications, as
reported by Shaji George, A. S. Hovan George, T. Baskar, and Digvijay Pandey [36].
Security agents are installed on the company’s appliances, and XDR can monitor the
activities, from the network communications to the event logs. For instance, every
laptop employees use is furnished with an application that monitors the traffic and
regularly scans for viruses. All the information the agents collect is agglomerated in a
central platform accessible by the security team. The latter can respond to malicious
events and take countermeasures directly on the devices by isolating them from the
network or launching scripts. XDR is currently a must-have within an IT company
because it is a practical solution to administer the security of the infrastructure.
Network-based Intrusion Detection System
NIDS is a powerful ally for incident detection. This technology analyzes internal
and external network communications to identify any unexpected behavior, and,
eventually, it may alert the security specialist whether any malicious attempt is dis-
covered. The main issue for IDS was attributable to the false positive rate since
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 50
fake alerts were reported too often, overwhelming the security analysts with inac-
curate notifications. Now, with new NIDS trends, Deep Learning integration can
dramatically lower the false rate, allowing faster detection and clear correlation of
the events. The article of Zeeshan Ahmad depicts how Deep Learning can be used
for implementing more effective Intrusion and Detection Systems and how future
efficient NIDS framework may also detect zero-day attacks [37].
4.3.1 AWS Incident Response Tools
Although XDR and NIDS are great solutions to safeguard the environment, I want
to focus now on AWS native security tools that are more relevant considering the
scenarios. The senior security consultant at Amazon, Vesselin Tzvetkov, wrote an
AWS blog article underlining how to automate incident response in a cloud envi-
ronment. He proposed architectures and tools that should be in place to avoid
catastrophic consequences [38]. This section is dedicated to the native tools cited in
the runbooks.
Amazon GuardDuty
As the principal AWS security tool for incident response, GuardDuty enables mon-
itoring threats of instances, containers, users, databases, and storage. It offers a
complete set of security functionalities and takes advantage of machine learning to
create behavioral patterns of the environment to detect any anomaly. Thanks to
GuardDuty, it is possible to automate the response by creating automatic activities
triggered by specific actions. It can improve the visibility of the operations by show-
ing unexpected data access and logins, analyzing files for malware, and identifying
suspicious behavior in container workloads.
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 51
GuardDuty can be activated with multiple modules integrating with other AWS
resources.
• S3 Buckets: CloudTrail S3 data events are continuously examined, monitor-
ing accesses and activities of the S3 Buckets.
• EKS Audit logs and runtime monitoring: Amazon EKS (Elastic Ku-
bernetes Service) is used to run Kubernetes in the AWS cloud. This feature
allows a continuous security analysis of the audit logs and runtime monitor-
ing of Amazon EKS instances, such as file access, network connections, and
process execution.
• Malware detection: GuardDuty triggers an alert when an EBS volume is
compromised by malware. Amazon EBS (Elastic Block Store) is a block stor-
age service designed for EC2.
• RDS: GuardDuty looks for potential threats accessing RDS databases. It
analyzes the logins and creates a standard behavior for the organization.
• Lambda: Lambda functions execution generates network activity logs con-
stantly monitored for threat detection. GuardDuty may discover cryptocur-
rency mining or compromised functions communicating with known malicious
IP addresses.
Security teams should evaluate the benefits of these features by considering the
expenses for their activation. In a typical multi-account AWS architecture that sep-
arates the service account(s) from the security one, Amazon GuardDuty forwards its
findings to CloudWatch, which sends the events to the central security infrastruc-
ture. Each event is mapped with a Lambda Function response action immediately
initiated by calling the AWS System Manager. As a result, the automation is ex-
ecuted in the service account, and for example, a compromised EC2 instance can
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 52
be isolated when a certain event happens. It is crucial to test every operation in a
non-production environment.
Figure 4.2: The AWS infrastructure can benefit by implementing this architecture
with Amazon GuardDuty. Incident response automation may improve the whole
process and instantly block several types of attacks.
Amazon Security Hub
Amazon Security Hub is a critical tool for security and compliance. It collects data
from AWS accounts and active services to discover security issues [39]. Any devi-
ation from the AWS Foundational Security Best Practices, which includes a list of
controls, is reported in a central platform of the security account.
There are several advantages brought by Amazon Security Hub that also benefit the
preparation phase of the incident response plan. First, with findings prioritization,
each piece of information is collected from the accounts to a central place where
the security team can analyze the issues. The alert is assigned to a criticality level
(high, medium, low) that indicates which problem should be solved immediately
to preserve the protection of the cloud. Security Hub continuously runs automatic
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 53
security checks based on AWS best practices, and it is possible to automate the
updates and remediation of the findings by setting up custom actions on defined
criteria. This is a crucial tool to monitor the status of the resources, and, for
example, it controls if the encryption of RDS Databases is enabled or if any S3
bucket allows public access.
AWS Config Rules
Many controls are not applicable by using only Amazon Security Hub, and AWS
Config Rules are the best solution to keep track of all the configuration changes.
Indeed, AWS Config can monitor the active services’ settings of the AWS account,
and the rules permit the evaluation of the compliance information. For S3 Buckets,
we can enable s3-bucket-versioning-enabled to check whether anyone turns off S3
versioning or s3-bucket-public-read-prohibited to verify if we are avoiding public read
access. Moreover, AWS Config Rules for Amazon RDS, such as rds-in-backup-plan
monitor if RDS Databases are included in the AWS Backup plan. AWS created a
complete list of the Config Rules for its customers.
AWS IAM Access Analyzer
IAM Access Analyzer keeps track of resources and accounts shared with third par-
ties. It can detect unauthorized access to AWS resources, giving relevant insights
for identifying threat actors’ activities. The product manager, Mathangi Ramesh,
stated that the IAM Access Analyzer is also fundamental for generating fine-grained
policies based on the access activities of users. Therefore, this powerful mechanism
helps implement the least privilege principle, providing only the required workload
permissions [40].
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 54
AWS Self-Service Security Assessment
This is an important service mainly for organizations new to AWS that have yet
to implement other security measures. It is an easy-to-deploy, cheap solution for
quick security assessments across the cloud infrastructure. This utility has been de-
veloped starting from Prowler, another popular open-source project that performs
evaluations based on the best practices of AWS. It is relevant for incident response
plans because, as stated by the AWS Public Sector Blog Team, it proposes a func-
tionality to identify the missing controls to protect against Ransomware [41]. More
than 256 checks are launched over the cloud instances, and a final report suggests
which security measures should be added to the environment.
Amazon Athena Bootstrap
As John Haggerty and Thomas Hughes-Roberts noted, the complexity of incident-
related logs may lead to temporal constraints due to the massive amount of informa-
tion that can be recorded from operating systems and applications. The intricacy
increases when cloud technologies are considered, and the two researchers advise
being equipped with solutions that can rapidly classify and visualize the captured
data [42].
Amazon Athena Bootstrap is the AWS critical service for log analysis during the
eradication. It is a core serverless technology that allows inspecting several AWS
data sources, such as information contained within S3, and using SQL or Python.
The importance of Athena Bootstrap is clearly illustrated by the fifth episode of
AWS The Safe Room, in which the Engineering Lead of the AWS Customer Incident
Response team, Ryan W Smith, shows how to perform a log analysis [43]. In the
presentation, Athena takes the logs from the S3 buckets of the log-archive account,
which aggregates all the data received from all the organization’s resources.
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 55
Figure 4.3: Athena uses a specific user’s permissions (usually the analyst’s) to con-
duct the operations; the query results are stored in a third S3 bucket of the same
account, and the users examine it whenever needed.
The queries are submitted in the Athena editor, and we can simply use SQL to
search all the data recursively from the S3 Buckets. This tool is beneficial during
urgent investigations, such as the ones required in a security incident.
AWS Assisted Log Enabler
Another tool that can assist the eradication stage is Amazon Assisted Log Enabler,
showcased by the AWS Security Consultant Joshua McKiddy [44]. Its features focus
on turning on the logs, which are critical elements during an investigation. Assisted
Log Enabler initially creates the buckets to store the information and then checks
for all the running services within the cloud infrastructure: all the disabled logging
is promptly turned on. Thus, customers can study future and possibly ongoing
security incidents.
4.3 AVAILABLE TOOLS FOR INCIDENT RESPONSE 56
Amazon Detective
Detective is the Amazon Swiss Army knife for investigation. Even though many
organizations prefer to rely just on GuardDuty, Detective provides visual represen-
tations of the security issues, boosting the analysis and the events’ correlation. It
shows the users’ interactions with the resources and uses machine learning to pro-
duce activity diagrams.
Amazon S3 Versioning
Versioning is a remediation method that allows keeping multiple versions of the
objects in S3 buckets. By enabling this technology, AWS customers can maintain
the previous versions of the assets after they are modified. When an object is
deleted, it is not removed permanently, but it is signed with a delete marker (only
the bucket owner can permanently delete a version under certain circumstances, and
MFA is required). S3 Versioning is a great functionality that can be used during
the Recovery phase whenever a bucket is compromised.
Amazon Backup
AWS Backup is the leading service for the Recovery stage to restore all the im-
pacted resources. Data protection can be centralized across the cloud, from creating
a backup plan to setting the frequency and retention policy. These steps must be
carried out during the first part of the incident response plan to be prepared in case
of a potential violation.
Other AWS security tools might come in handy while responding to an incident.
However, they might be too expensive for the enterprises that usually discard them
from their security toolkit. Given that you have acquired background knowledge, I
can finally present the developed playbook and runbooks.
5 Incident Response Implementation
Implementation is the heart of the thesis since it describes how playbooks and run-
books for the incident response plan were deployed. Ransomware in AWS proposes
one playbook and two runbooks, describing the processes for the indispensable re-
sources inside the AWS cloud infrastructure, Amazon S3 and Amazon RDS. All the
documents differ from the original ones currently part of Bynder’s security posture
because they cannot disclose private information about the business, such as inter-
nal technologies or financial choices. Overall, they generally reflect on a potential
deployment for an incident response plan, also delivering helpful insights on specific
architectures.
5.1 Ransomware in AWS Incident Response Plan
The scenario recalls the most dangerous threat in the cloud: ransomware. Ran-
somware is one of the biggest challenges for IT businesses since it can encrypt all
data and significantly impact the services. As pointed out by the AWS report
"Securing your AWS Cloud environment from ransomware", these attacks are still
effective because awareness among personnel is low and critical vulnerabilities are
not patched within a restricted timeframe. Firms must define a consistent incident
response plan because they cannot pay the ransom. Sending money to criminals
would just raise their funds to carry out further attacks, and in addition, there is
no guarantee that they will share the decryption key even after the payment [45].
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 58
The security playbook provides a general overview of the process, and it was created
starting from the idea of the AWS security specialist solutions architect Brad Dis-
pensa, with his Amazon blog’s post "Ransomware mitigation: Top 5 protections and
recovery preparation actions" [46]. Although he highlights some preparation steps
for protecting against ransomware, there are no clues on the other incident response
stages gained from research and environmental studies.
The first runbook extracts, instead, information from the relevant publication of
Megan O’Neil, Kyle Dickinson, and Karthik Ram, underlining detection and re-
sponse for the Amazon S3 use case [47]. The collected insights were reprocessed
to adapt the operations according to Bynder’s cloud architecture. Two last useful
articles for Amazon RDS were the following:
• The AWS solutions architectsMarshall Jones and Deric Martinez released the
one who detailed GuardDuty’s role in RDS databases threat detection [48].
Even in this case, this writing is not enough to produce a complete incident
response runbook, and it was just an input for identification and eradication.
• The blog post "Investigate VPC flow with Amazon Detective" by Ross Warren
and Jim Miller illustrates how to determine the scope of security issues. VPC
Flow logs collect data about IP traffic from and to a cloud private network
and examine which addresses access RDS services [49].
Runbooks enclose flowcharts that characterize each step with basic instructions and
queries to investigate the attacks. In conclusion, these reworked versions follow the
SANS incident response framework. They present an overview of the use case and
the six incident response phases, showing how to deal with security violations.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 59
5.1.1 Ransomware in AWS - Security Playbook
Playbook Name Ransomware Response in AWS
Playbook Description Incident Response Plan for Ransomware at-
tack scenarios targeting the AWS infrastruc-
ture
Playbook Manager Information Security Senior Director
Informed Executives, Information Security, IT, De-
vOps, Legal Department
Impacted Assets AWS cloud services and resources
Version 1.0
Version Date 1 Apr 2023
Table 5.1: The table illustrates the general information about the playbook.
Overview
This Playbook informs the interested teams about the response procedure for ran-
somware attacks targeting AWS services. It ensures that all the stakeholders are
aligned on the overall strategy for this scenario. Ransomware may cause the business
billions of dollars of damage, interrupting operations. It tries to infect any acces-
sible network and needs to gain access to an organization’s infrastructure, encrypt
files, and demand ransom to be successful. Ransom attacks are effective for multiple
reasons:
• Low awareness among individuals;
• No backups available;
• Little effort and massive revenue for attackers;
• Slow vulnerability (CVE) patching;
• No planning strategy to address security issues;
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 60
• Multiple attack vectors;
• No customer reaction if the bad actors are exfiltrating data without encrypting
or deleting information.
Affected organizations should not pay the ransom: there is no warranty that
encrypted data and locked systems will be restored after the payment, which will
only motivate malicious behaviors. Therefore, Ransomware is a severe threat, and
responders must be adequately prepared to defend against it.
The following related runbooks include a step-by-step technical procedure for han-
dling an incident aiming at specific AWS technologies:
1. Ransom Response for AWS S3 (Simple Storage Service) - Runbook
2. Ransom Response for AWS RDS (Relational Database Service) -
Runbook
The incident response activities rely on security measures and AWS native services
within the cloud infrastructure.
Preparation
The preparation steps must be completed before an incident.
Initially, the preparation phase involves establishing a communication plan and re-
viewing the current security controls and policies.
1. Determine the CSIRT memebers
• CSIRT will be guided by an incident response manager responsible for
designing roles and responsibilities for the other members.
• CSIRT must include individuals from Information Security and DevOps
managing the AWS environment.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 61
• CSIRT must include one representative from the legal department to
address any concerns related to sensitive information or third-party or-
ganizations.
• CSIRT’s size must be limited to simplify communication and support
efficiency and confidentiality.
2. Establish a reliable communication channel
• Ensures Bynder’s instant messaging program and video communication
service are working correctly and are available during the incident re-
sponse.
• Create a dedicated channel with the messaging solution, adding all the
CSIRT members.
3. Conduct security awareness campaigns
• Security awareness campaigns increase employees’ knowledge and atten-
tion to internal security, significantly reducing the likelihood of an inci-
dent.
4. Review the current security measures and policies
• Set up the ability to recover apps and data : Bynder must ensure that it
can recover its data, regardless of how the system was made inaccessible.
AWS Backup can restore the data (simple snapshots of EC2 instances
are insufficient) and supports the cross-account capability, placing the
backups into separate and dedicated accounts. The likelihood that a
threat actor may temper with the backup is reduced.
• Encrypt your data : Encryption must have been enabled so that only au-
thorized users and services can access the resources. Encryption prevents
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 62
double extortion schemes, in which threat actors exfiltrate data, encrypt
it, and threaten its release.
• Apply critical patches : Misconfiguration and vulnerabilities make attack-
ers’ lives easier. Security-related patches with the slightest delay possible
avoid bad actors exploiting existing CVEs. Monitoring applications must
scan images and Kubernetes clusters.
• Follow a security standard : AWS Security Hub must be active to auto-
mate controls across AWS services and third-party tools, identifying gaps
in security according to AWS Foundational Security Best Practices.
Furthermore, the least privilege principle must enforce policies to limit
actors’ capabilities. AWS IAM Access Analyzer facilitates the generation
of permissions; short-lived IAM Access Keys must be applied to restrict
access duration to compromised resources.
• Strong Authentication : AWS suggests applying strong authentication
policies, combining Single Sign-On with Multi-Factor Authentication.
• Make sure to monitor and automate responses : Intense infrastructure
monitoring can help protect against Ransomware. Integrating AWS na-
tive and third-party tools with an AWS multi-account architecture leads
to an advanced cloud environment where the business can count on au-
tomation and prompt response.
Identification
The identification phase depends on the services that need protection and the tools
available for detection. Ransomware detection may occur in many different ways,
and the CSIRT must review one of the linked security runbooks for more specific
use cases.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 63
Overall, as mentioned by AWS, the attack can be notified in many different forms,
depending on the architecture configuration:
• An alert is generated on Amazon GuardDuty, AWS Security Hub, or alternative
monitoring systems;
• An EC2 instance cannot be accessed anymore, although it appears appropri-
ately configured and running;
• Amazon CloudWatch reports alarms about network reachability issues;
• The bad actor demands ransom via email or alternative communication chan-
nels.
Identification furnishes the first insights about the impacted AWS resources (S3,
RDS, etc.) and the initial attack vector the threat actor utilizes. Once the ran-
somware has been detected, CSIRT must create a dedicated communication channel
and a ticket that documents the response progress before proceeding with the con-
tainment.
Containment
Containment tries to mitigate the incident by limiting the damage to the business.
It might vary depending on the compromised AWS resource, and stakeholders may
have dissimilar opinions on containment (some would focus on returning to business
rather than fixing the vulnerabilities). However, early detection is essential to reduce
the impact of the incident. CSIRT must always prioritize critical information, and
they must follow the below steps, depending on the affected asset(s) (refer to the
incident response runbooks for more specific use cases):
1. Do not pay the ransom: The payment does not guarantee that the at-
tacker(s) will restore your data. If data is encrypted and no backups are
available, a representative from the financial team should help the CSIRT by
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 64
performing a cost-benefit analysis to weigh the value of the data/reputational
compromise against the payment to the attacker.
2. Determine the type of ransomware (if possible): ransomware can en-
crypt objects or files (Crypto-Ransomware), lock out access to the AWS in-
frastructure or resources (Locker-Ransomware), or others.
3. Limit access to network and/or resources: Modify the Security Groups to
isolate the network and limit access. Change S3 Bucket(s) policies and delete
IAM users, policies, and roles to minimize the opportunity for the attacker(s)
to spread the infection.
4. Check AWS tools logs: AWS logs (CloudTrail, S3 logs, etc.) may provide
insights about unauthorized activity, such as creating IAM users, policies,
roles, or temporary security credentials and deleting/revoking them.
5. Check for unpatched software: If the attacker(s) exploited unpatched
software, out-of-date OSs, or AV tools to spread the malware into the infras-
tructure, EC2 instances must be updated with all software packages, virus
signatures, and definition files. This may prevent the bad actor(s) from ex-
ploiting the same vulnerabilities (if the vulnerabilities cannot be fixed, CSIRT
must ensure that the security measures are blocking similar attacks).
6. Check for resources at risk: Remove any resources targeted by the same
attack vector, depending on the previous steps’ considerations.
Once the Containment stage is complete, CSIRT should have restricted all bad
actors’ activities, and the environment should be safe.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 65
Eradication
The eradication phase helps determine the attack vector and investigate any possi-
ble attempt the attacker(s) made to maintain access to the resources. During the
investigation, the CSIRT should look for every vulnerability in the AWS environ-
ment and add security measures to avoid any possible future damage. CSIRT has
to understand if the impact from the incident has been correctly contained and if
there is any ability to restore the ransomed data from backups/snapshots.
The investigation determines the impact and the amount of resources affected, with
the criticality of the disclosed data. CSIRT can assess the attack vector and retrace
the bad actors’ activities with the following steps:
1. Enable tools for logs analysis: AWS native services may help CSIRT ex-
amine the logs. Depending on the affected application(s), different tools can
be activated and used (supposing that Amazon GuardDuty and Security Hub
are already working).
• Amazon Athena uses SQL to query the records;
• AWS Assisted Log Enabler activates critical logs to identify the bad ac-
tor’s persistence;
• AWS Config Rules may detect any change in the infrastructure’s config-
uration;
• Amazon CloudWatch provides information on abnormal data transfer
spikes;
• AWS Detective helps visualize the correlations of the threat actor’s ac-
tivities;
• Amazon CloudTrail can detect unexpected user activity and API usage.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 66
2. Analyzing the logs and retracing the bad actor’s activity
• CSIRT must determine when the infection occurred and how through logs
analysis (CloudWatch can help you review logs such as application logs,
operating system logs, database logs, etc.)
• Reviewing the findings from GuardDuty and Security Hub may help re-
duce the additional effort required to search application-level logs.
• CSIRT must identify all the IoCs (Indicators of Compromise). Any mal-
ware identified during the analysis must be removed.
• Determine if any third-party decryption software is available when the
ransomware has been identified (No More Ransom Project may provide
a decryptor for the data).
• (Optional) If any unexpected activities or unauthorized IAM users, roles,
or policies are detected, the CSIRT should return to containment to iso-
late the environment and limit the threat actor’s capabilities.
3. Review the Security Controls
• CSIRT should assess the current security measures and evaluate whether
they are enough to perform quick and effective incident response actions.
• New Amazon GuardDuty features or AWS Config Rules may be intro-
duced, and they must ensure all the logs are available (AWS Assisted Log
Enabler may help with this task).
• AWS Self-Service Security Assessment may provide critical insights, iden-
tifying security measures that could improve the protection against ran-
somware.
4. Determine the business impact
• This step can be carried out ONLY after the incident has been contained
and the impact has been limited.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 67
• CSIRT must prove all applications impacted during the incident and any
data the attacker may have disclosed.
The bad actor may have compromised Confidentiality, Integrity, and
Availability, and the team must determine the incident’s overall impact
(low, medium, high, or critical).
• Any impact/experience of the issue must have been documented in the
incident ticket. This process will be helpful during the Lessons Learned
stage.
The end of the eradication guarantees the cloud operations security, and the firm
can now recover from the incident and return to business.
Recovery
Recovery is the incident response last phase, and CSIRT must be sure that all the
IoCs have been removed from Bynder’s infrastructure. This stage of the process
may vary depending on the impacted AWS services. Overall, CSIRT should go over
the below steps to complete the procedure:
1. Identify the restore point for any restore operation: Diverse Amazon
resources have diverse restore points (for instance, S3 buckets can be recovered
using S3 Versioning or AWS Backup).
2. Restore the data from the backup: Use the selected application (S3 Ver-
sioning, AWS Backup, etc.) to recover your data. Before restoring it, CSIRT
must confirm that it is clean (there is no guarantee that every detected IoC
has been removed from the AWS infrastructure).
3. Review the backup strategy: CSIRT must check if all the compromised
data can be recovered (this step will depend on the backup policies of the AWS
resources).
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 68
4. (Alternative) Decrypt your data: If any backup is available, use an open-
source decryptor to decrypt the data and, before restoring it, perform any
required analysis to ensure that it is clean. CSIRT can recharge the resources
by creating new instances and moving the decrypted data into them.
5. (Alternative) New environment: If neither backups nor decryptors are
valid, consider starting a new AWS environment.
The business would have enough resources to restore data from its backups if it
followed the preparation process. The team can now proceed with the Lessons
Learned.
Lessons Learned
Lessons learned are critical for the organization since they can provide essential
insights into the Incident Response procedure and how the incident affected the
services. CSIRT documented all the actions performed to contain the impact and
recover the data, reporting any information in the ticket. Lessons Learned focus on
improving incident response, starting from detecting the attack. All the stakeholders
should participate in the post-mortem activities to provide their perspectives on
managing the incident.
Reviewing Questions IR Phases Hints
Was the IR team prop-
erly organized?
Preparation Review the team formation and
prepare each member for possible
future incidents.
Were the employees
properly trained?
Preparation Create a new awareness campaign
to fix the gaps.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 69
How long after the initial
compromise was the inci-
dent discovered?
Identification Review Bynder’s detection solu-
tions (Amazon GuardDuty fea-
tures, AWS logs, AWS Config
Rules, etc.) and set up more as-
sertive notifications.
How long did the con-
tainment take? Was it ef-
fective?
Containment Review the containment and con-
sider automation solutions.
How can you mitigate the
risk of re-occurrence of
the incident?
Containment Review the mitigation and define
how it can be improved, making
the procedure faster, consistent,
and reliable.
After eradication, did
signs of the compromise
still show up?
Eradication Review the eradication and think
about potential improvements
(Amazon Detective, Amazon
Athena, etc.).
Were the impacted AWS
resources correctly re-
stored?
Recovery Review the recovery and evalu-
ate new solutions to accelerate it
(AWS Backup, etc.).
Can any step of the IR
process be automated?
All Review each incident response
step, determining whether ac-
tions may be automated.
Table 5.2: This is an opportunity to strengthen the overall security posture, and it
may consider the questions in the table.
A final technical report that would be helpful in case of future incidents and the
playbook’s workflow must be updated considering the outcomes of Lessons Learned.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 70
5.1.2 Ransom Response for AWS S3 - Security Runbook
Runbook Name Ransomware Response for AWS S3 (Simple
Storage Service)
Runbook Description Incident Response Runbook for Ransom at-
tacks targeting Amazon S3 Buckets
Runbook Manager Information Security Senior Director
Informed Information Security, IT, DevOps, Legal De-
partment
Impacted Assets Amazon S3 buckets and objects
Version 1.0
Version Date 1 May 2023
Table 5.3: The table illustrates the general information about the runbook.
Overview
This Runbook is a step-by-step procedure to respond to a ransom attack involv-
ing Amazon S3 buckets. Amazon S3 is a storage service offering security and data
availability. Organizations store their contents in S3 buckets that must be protected
from unauthorized access.
Ransomware can target Amazon S3 buckets’ data, stealing and possibly modify-
ing/deleting it. Threat actors may exploit a wide range of vulnerabilities to gain
unauthorized access to the target’s system, taking advantage of unpatched software
flaws, weak credential misuse, or social engineering. This record delivers the essential
guidelines for responding to a ransomware event in Amazon S3.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 71
Preparation
The preparation steps must be completed before an incident.
Figure 5.1: The flowchart illustrates how to approach the preparation for ransom
attacks involving Amazon S3 buckets.
1. Determine the CSIRT members
• The incident response manager will monitor the whole process, supervis-
ing the team and assigning tasks to each member.
• The operative members will receive indications from the manager to con-
tain, investigate, and recover the incident. They will examine the poten-
tial attack vectors and look for IoCs.
• The IT representative will be central in dealing with internal user requests
when needed. The participation of IT keeps their department informed
about the progress.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 72
• The legal representative will address any concerns related to sensitive
data or external organizations.
2. Establish a reliable communication channel
• A dedicated communication channel for CSIRT must be used for sharing
progress, reporting encountered issues or planning meetings. It must be
immediately created.
3. Conduct security awareness campaigns
• Bynder must periodically inform employees about authentication policies
(password complexity, MFA, etc.) and security best practices. Present-
ing phishing attempts and the best ways to protect the account reduces
the likelihood of incidents.
Developers must be trained with crafted campaigns on Amazon S3 secu-
rity.
4. Review AWS security measures
• Evaluate using AWS Self-Service Security Assessment ’s Ransomware mod-
ule that generates security assessment reports indicating the susceptibility
to ransomware damage.
• Control if Amazon GuardDuty is properly running: the S3 features must
be activated to protect data stored in S3 buckets.
• Apply these AWS Config rules to monitor the configuration changes af-
fecting Amazon S3 services.
• AWS IAM Access Analyzer must monitor access to Bynder’s resources
and data. Ensure unintended read or write actions are notified and per-
missions are correctly applied (s3:* is not permitted).
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 73
5. Check the protection policies
• All S3 objects must have been encrypted using AWS KMS (Key Manage-
ment Service) for generating encryption keys (s3-default-encryption-kms
Config rule checks if S3 Buckets are correctly encrypted).
• S3 Versioning must be enabled to restore modified or deleted objects (s3-
bucket-versioning-enabled Config rule checks if S3 Versioning is correctly
enabled).
• Amazon S3 MFA Delete must be enabled to double-check the activity
before deleting any S3 object.
• Passwords must follow the complexity policies, and MFA must be en-
forced to access AWS.
• Amazon S3 Buckets Replication must be activated to maintain object
copies under different ownership and store them over multiple AWS re-
gions.
6. Check logging
• AWS CloudTrail event logging for S3 buckets and objects informs about
Amazon S3 activities. CloudTrail Server Level Logging for S3 Buckets
also records the requests made to a bucket (by default, Amazon S3 does
not collect server access logs).
• Control if Amazon S3 logging works (use the AWS Config rule s3-bucket-
logging-enabled ).
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 74
Identification
A ransomware attack targeting AWS S3 Buckets can be performed in many different
ways, and in this regard, organizations require multiple controls, such as Amazon
GuardDuty.
Figure 5.2: The flowchart incorporates as many valuable detection scenarios as pos-
sible. The effectiveness of the identification phase depends on the security controls
in place.
1. Ransomware incident detected
• AWS native security services (GuardDuty, CloudTrail, AWS Config, etc.)
may report unauthorized activities performed within the environment.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 75
• Compromised S3 buckets can be generally identified by one of the follow-
ing:
 Unauthorized IAM user creation;
 Unauthorized IAM policies creation;
 Unauthorized IAM roles creation;
 Unauthorized IAM temporary security credentials creation;
 Unauthorized AWS usage (EC2 instances, Lambda functions, etc.);
 Unexpected bills from the Billing console;
 Ransom note provided within the S3 bucket or via email;
 S3 objects are deleted unexpectedly;
 Entire S3 buckets are deleted unexpectedly;
 Data exfiltration evidence;
 S3 Objects are inaccessible because they are encrypted with an ex-
ternal key.
• Other unexpected activities can turn out to be ransom attacks and must
not be underestimated.
2. Categorize the incident
• The incident severity can vary depending on the assets that are impacted.
Usually, ransom attacks targeting Amazon S3 buckets are classified as
Critical.
• Depending on the specific S3 bucket, the severity can change.
3. Create a ticket to document the response progress
• CSIRT will update the ticket by documenting the response actions and
the constraints that have been found. The ticket must include:
 Incident description;
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 76
 Date and time of the identification;
 Impacted instance(s) description;
 Type of affected data (e.g., personal data);
 Root cause analysis (RCA);
 Containment and eradication summary;
 Motivation of CSIRT’s activities;
 Other relevant evidence/information.
When an incident is discovered, CSIRT must start the containment immediately to
limit the attack’s impact (no investigation must be carried out for now).
Containment
Containment is a critical step to minimize the incident’s impact, blocking the at-
tackers’ actions and the financial damage to the organization.
1. Do not pay the ransom
• AWS recommends not paying the ransom regardless of the type of ran-
somware (Crypto-Ransomware, Locker-Ransomware, or others) since the
business cannot know whether the criminal will honor the transaction
after receiving the payment.
• If data is encrypted and no backups are available, a representative from
the financial team should help the CSIRT by performing a cost-benefit
analysis to weigh the value of the data/reputational compromise against
the payment to the attacker.
• No More Ransom Project helps find an available decryptor for the ran-
somware that encrypts your data.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 77
Figure 5.3: This flowchart provides multiple steps CSIRT must follow to address
the incident.
2. Rotate or delete the IAM user keys and the Root user keys
• The first step to contain the attack is to deal with the IAM User Keys
and the Root User Keys: the CSIRT must identify which Keys have been
exposed to proceed with their rotation or deletion.
• The team can then go on with the rotation or the deletion of the IAM
User Keys and the Root User Keys that have been exposed. All the
Keys must be rotated or deleted whether the specific key(s) cannot be
identified.
3. Delete unauthorized resources
• The attacker(s) may create unauthorized resources inside the infrastruc-
ture during its period of action. The CSIRT must ensure that all these
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 78
unexpected users, policies, or roles are deleted during the containment.
• The CSIRT will continue monitoring for every unexpected change.
4. Revoke temporary credentials
• Temporary security credentials cannot be revoked and remain valid until
expiration. CSIRT can remove all permissions from the temporary cre-
dentials denying access to them.
There are specific use cases that may be taken into account [50]:
 Denying access to the creator of the temporary security credentials ;
 Denying access to temporary security credentials by name ;
 Denying access to temporary security credentials issued before a spe-
cific time .
• Alternatively, the team may delete the IAM User to revoke the creden-
tials, although doing so may impact production workloads.
Containment is a critical phase and must be conducted carefully. Once the ransom
attack targeting Amazon S3 has been contained, CSIRT can proceed with eradica-
tion and recovery.
Eradication
Eradication helps determine the attack vector and investigate any possible attempt
the attacker(s) made to maintain access to the resources. The investigation is the
first stage of eradication, and CSIRT has to look at and mitigate every vulnerability
in the AWS environment, avoiding any possible future damage. The CSIRT evalu-
ates how the attacker(s) gained access to the infrastructure, whether they created
any further unauthorized resources, and whether any other services or third-party
companies were engaged in the violation.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 79
Figure 5.4: This eradication flowchart illustrates how CSIRT must conduct the
investigation.
1. Enable AWS logs analysis tools
• The first step of the investigation involves the logs analysis; in this regard,
many AWS tools may be helpful for the CSIRT.
• AWS Athena makes it easy to query the logs already stored in Amazon
S3 using the IAM permissions of the analyst. It can be activated directly
during the eradication (or before when necessary).
• AWS Assisted Log Enabler checks whether the business is missing any
logs. It facilitates the investigation of the ongoing security incident.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 80
2. Retrace the attacker’s activities
• The team has to track all the attacker’s activity from the initial compro-
mise.
• Amazon S3 Server Logs capture detailed information about the opera-
tions performed on the S3 buckets at the server level. Logs may provide
clues such as the time of the requests, IP addresses, request types, and
more.
• Amazon S3 Access Logs provide information about the requests made to
objects within the S3 bucket. They include the time of the requests, IP
addresses, requested object’s key, and more.
• AWS CloudTrail dashboard and event history can provide CSIRT with
IAM-related activities, and the team must search for logs from compro-
mised IAM users or roles. CloudTrail Event Logs detect potential persis-
tent access that a malicious actor may have created.
• CSIRT may query CloudTrail logs (using Amazon Athena) to identify
the last-used access key, user creation time, and last-used passwords.
Otherwise, the team may consult the IAM Credential Report from the
IAM console (updated every 4 hours).
Figure 5.5: The code snippet queries CloudTrail event logs, selecting an IAM user’s
ARN to show its activities in a specific time frame (with source IP and user agent).
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 81
• Amazon CloudWatch gives information on abnormal transfer spikes. CSIRT
can track the activities until the initial attack vector. Alternatively, the
region-DataTransfer-Out-Bytes metric (enabled by default) shows the
amount of data transferred from S3 to the internet.
• Configuration changes trigger AWS Config rules and can be related to
the attacker’s activity.
• Once the analysis is complete, CSIRT must have discovered the attack
vector, all the impacted AWS resources, and any unauthorized IAM users,
policies, or roles.
3. Repeat Containment (when necessary)
• (optional) If CSIRT discovers the attacker’s persistent access through any
further unauthorized IAM user, policy, or role, it must revisit containment
to limit the impact on the business.
4. Review the security controls
• CSIRT revisits the controls of the preparation phase (sections 4, 5, and
6). In particular, Amazon GuardDuty ’s features and AWS Config rules
must be enabled, and all the logs must be available (refer to AWS Assisted
Log Enabler ).
• CSIRT may introduce new AWS native security tools or features.
5. Assess the impact of the incident
• This step can be carried out ONLY after the incident has been contained
and the impact has been limited.
Not only confidentiality but integrity and availability may have been af-
fected: probably, the incident may have a critical impact. CSIRT must
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 82
have identified all the AWS S3 that have been impacted, and it is respon-
sible for determining whether Bynder or customers’ information has been
leaked.
• If either customers’ information or personal data are involved in the inci-
dent, the legal representative of the CSIRT is in charge of analyzing the
impact, evaluating the investigation outcomes, and eventually informing
the interested corporations. The assessment should be documented by
reporting the following:
 Which measures against unauthorized processing of personal data
were in place, and how these protections were likely compromised;
 Potential consequences for data subjects;
 Regulatory and contractual obligations regarding notifying authori-
ties, data subjects, and other third parties;
 Other applicable regulatory and contractual obligations.
Once eradication is concluded, the attack vector and all the impacted resources
should have been detected, and the CSIRT can move on to the Recovery phase.
Recovery
Recovery is the last incident response step, and CSIRT must be sure that all the
IoCs have been removed from Bynder’s infrastructure.
1. Address any weaknesses found during the investigation
• CSIRT should be able to address any weakness of the cloud environment.
• CSIRT can propose additional measures for protection, such as S3 Buck-
ets Replication, AWS Detective, or external solutions (e.g., SIEM ).
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 83
Figure 5.6: CSIRT relies on the recovery flowchart to restore the affected Amazon
S3 bucket(s).
2. Recover by using S3 Versioning or AWS Backup
• Use S3 Versioning to restore the previous version of the objects that
have been compromised during the attack. Versioning is keeping multiple
versions of an object in the same bucket and gives the ability to restore
a particular version during the recovery. The team should select the
previous version of the affected bucket(s) and return to business.
• If S3 Versioning cannot be used, rely on AWS Backup and recover the
modified/deleted data from the AWS backup account (the architecture is
divided into service(s) account(s) and backup account).
3. Continue monitoring the activity
• Once the recovery is complete, CSIRT must continue monitoring the sit-
uation by checking for alerts from AWS security tools. Discovering unex-
pected IAM users, policies, or roles is still possible, and the team should
be ready to respond promptly.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 84
The incident response process is ended, and CSIRT can start the post-mortem,
wrapping up outcomes and constraints.
Lessons Learned
Lessons Learned go over the entire incident response process again, documenting
the main constraints and what can be improved. This stage can provide concrete
insights to understand better how the architecture was breached and where the
business was vulnerable. Lessons Learned may focus on where visibility was lacking
and how automation can improve the response procedure. This is an opportunity
to strengthen the security posture, addressing S3 and the AWS infrastructure. The
technical incident response members must plan a meeting with all the stakeholders
and reply to the questions in the playbook’s Lessons Learned section.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 85
5.1.3 Ransom Response for AWS RDS - Security Runbook
Runbook Name Ransomware Response for AWS RDS (Rela-
tional Database Service)
Runbook Description Incident Response Runbook for Ransom at-
tacks targeting Amazon RDS
Runbook Manager Information Security Senior Director
Informed Information Security, IT, DevOps, Legal De-
partment
Impacted Assets Amazon RDS databases
Version 1.0
Version Date 20 May 2023
Table 5.4: The table illustrates the general information about the runbook.
Overview
This Runbook is a step-by-step procedure to respond to a ransom attack targeting
Amazon RDS services. Amazon RDS (Relational Database Service) is a collection
of services to set up, operate, and scale databases in the cloud. It has several prop-
erties to guarantee data security, such as at-rest and in-transit encryption, access
control, network isolation, firewall, and others.
However, ransomware can target cloud databases by gaining access to systems, en-
crypting data, and denying legitimate users’ requests. RDS is required to be pro-
tected from stealing and deletion. Threat actors can exploit a wide range of tactics
to gain unauthorized access to the target’s system: taking advantage of unpatched
software flaws, weak credential misuse, or social engineering.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 86
This document provides essential steps for responding to a ransomware attack in
Amazon RDS. Some stages of this runbook are similar to the ones of ransom at-
tacks targeting Amazon S3 instructions. Therefore, those will not be covered again.
Nevertheless, the actual procedure is completed with all descriptions to avoid mis-
understanding between team members.
Preparation
The preparation steps must be completed before an incident.
Figure 5.7: The flowchart illustrates how to approach the preparation for ransom
attacks involving Amazon RDS.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 87
1. Determine the CSIRT members
2. Establish a reliable communication channel
3. Conduct security awareness campaigns
4. Review AWS security measures
• The AWS Self-Service Security Assessment ’s Ransomware module fo-
cuses on weaknesses that can lead to this violation, while the Amazon
GuardDuty RDS feature analyzes database access.
• Apply AWS Config rules for RDS that assess the configuration of Rela-
tional Database Services.
• Review AWS IAM policies to determine who can read, modify, and delete
database instances.
• Evaluate deploying a third-party host-based intrusion detection system
(HIDS) for database violation detection.
5. Check the protection policies
• The database instances must run in an AWS Virtual Private Cloud, en-
abling control over the virtual networking environment. VPC restricts IP
address range, creates subnets, and configures routing and ACLs.
• Control Amazon RDS encryption that secures data at rest, including
backups, replicas, and snapshots. Industry-standard AES-256 algorithm
encrypts data of the DB instances stored on the servers.
• TLS (Transport Layer Security) must encrypt the databases (MySQL
and PostgreSQL) communications.
• Use database security features to prevent unauthorized access to the DB
instances. Control and Monitor who can log in to the databases.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 88
• Use and periodically review Security Groups to control the IP addresses
and EC2 instances that can connect to the databases on a DB instance.
• Check that AWS Backup technology is appropriately configured to in-
clude all the RDS instances.
6. Check logging
• Verify AWS CloudWatch logs collection. Its metrics monitor threat ac-
tor(s) activity, discovering data exfiltration.
• VPC Flow Logs must be activated to identify unauthorized access to DB
instances from external IP addresses.
• Verify Amazon RDS logging (turn on the AWS Config rule rds-logging-
enabled).
Identification
Bad actors may violate Amazon RDS in multiple ways, and identification technolo-
gies must be deployed. The effectiveness of the identification phase depends on the
security controls in place (their operation must be reviewed occasionally).
1. Ransomware incident detected
• The following anomalies may generally identify AWS RDS ransom at-
tacks:
 Suspicious behavior reported by AWS GuardDuty, such as data ex-
filtration attempts;
 EC2 instances reveal unexpected logins, installation of unknown soft-
ware, or the presence of unrecognized files;
 AWS Config rules report suspicious changes to the configuration of
the resources;
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 89
 Use AWS CloudWatch metrics to search for exfiltration spikes (open
the AWS console >select "All Metrics" >click on "NetworkPacket-
sOut");
 VPCFlowLogs may notify unknown external IP addresses accessing
databases;
 A bad actor destroys data and leaves a ransom note;
 A Bynder’s customer communicates evidence of a violation.
• Other unexpected activities can turn out to be Ransom attacks and must
not be underestimated.
Figure 5.8: The flowchart tries to include as many different detection scenarios as
possible for Amazon RDS.
2. Categorize the incident
3. Create a ticket to document the response progress
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 90
Containment
Containment is a critical step to minimize the incident’s impact, blocking the at-
tackers’ actions and the financial damage to the organization.
Figure 5.9: CSIRT must follow this chart to limit the incident’s damage.
1. Do not pay the ransom
2. Rotate or delete the IAM User Keys and the Root User keys
3. Delete unauthorized resources
• The CSIRT must ensure that all the unexpected resources created by the
attacker(s) are deleted. Besides unauthorized users, policies, and roles,
the team must search for any public snapshot or database.
• Compromised RDS databases must be deleted to limit attacker(s) capa-
bility. Be sure that the backup solutions are correctly working before
removing the data.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 91
4. Revoke temporary credentials
5. Identify EC2 instances with too permissive access
• Bad actor(s) may exploit EC2 permissive access to the database(s) to
exfiltrate information. CSIRT must identify any EC2 instances that had
permissive access and block them, limiting any further malicious activity.
Once the RDS service has been isolated, CSIRT can proceed with eradication and
recovery.
Eradication
Eradication helps discover all the IoCs that allow the attacker(s) to maintain access
to the RDS services. CSIRT must investigate how the threat actor(s) obtained per-
missions to read, modify, or delete RDS databases. Every additional unauthorized
resource or service must be removed, and the legal department must inform every
external organization involved in the security breach.
1. Enable AWS logs analysis tools
• CSIRT improves investigation abilities by activating AWS Detective. Since
most of the actions related to RDS services are visible through Amazon
VPCFlowLogs, Detective can show visual summaries about the network
flows without impacting the existing flow log collection. This technology
is critical to trace the root cause instantly.
• AWS Assisted Log Enabler enables all the missing logs. CSIRT needs as
much information as possible about the ongoing security incident.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 92
Figure 5.10: Eradication is more complicated than the previous stages, and the
incident response manager must assign tasks to make it faster and more effective.
2. Retrace the attacker’s activities
• With new tools available for the investigation, CSIRT must start studying
threat actor(s) behavior and how RDS databases were infected. Amazon
Detective elaborates data to present observed behaviors and guidance
for interpretation. The overall VPC flow volume panel illustrates
inbound or outbound traffic spikes that may be revealed as malicious
communications.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 93
Figure 5.11: Selecting specific ranges, CSIRT can figure out which IP addresses
exchange information, the data transmission ports were used, and which communi-
cations were granted or rejected [49].
• Amazon Athena can ease the investigation of CSIRT through the environ-
ment set up by AWS Amazon Analytics Bootstrap, which aggregates the
service logs of AWS CloudTrail, Amazon VPC FlowLogs, and Amazon
Route 53 DNS resolver query logs. AWS CloudTrail assesses persistent
access thanks to the records of IAM users or roles. Even with Amazon
Athena, CSIRT can get a summary of the number of bytes sent or received
by an IP address.
Figure 5.12: The query returns the bytes exchanged by the IP 192.0.2.1 between
the 20th and the 28th of August via port 443. Other interesting queries can be
found on the GitHub page of Amazon Security Analytics Bootstrap .
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 94
• AWS Config must work non-stop to observe configuration changes across
RDS resources.
• This step ends when the incident response team realizes the initial attack
vector and pinpoints all the affected resources.
3. Repeat Containment (when necessary)
• (optional) If the infrastructure turns out to be still affected by the ran-
somware, CSIRT must replicate the containment phase, limiting further
damage.
4. Review the security controls
• CSIRT reviews and evaluates using new native features for detection,
containment, and investigation (Amazon GuardDuty RDS features, AWS
Config rules, etc.) or third-party solutions.
5. Assess the impact of the incident
• This step can be carried out ONLY after the incident has been contained
and the impact is limited. The overall severity is critical, considering
that Amazon RDS is a vital service for the business.
• If either Bynder or customers’ information has been leaked, the legal
member of the team is responsible for contacting the interested entities.
The assessment should report:
 Which measures against unauthorized processing of personal data
were in place, and how these protections were likely compromised;
 Potential consequences for data subjects;
 Regulatory and contractual obligations regarding notifying authori-
ties, data subjects, and other third parties;
 Other applicable regulatory and contractual obligations.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 95
Eradication must have isolated the RDS instances, snapshots, and backups from any
risks, and CSIRT can start restoring the data.
Recovery
The last phase of the procedure aims to reinstate data into RDS databases, quali-
fying the return to business.
Figure 5.13: Besides addressing the cloud weaknesses, CSIRT must recover the RDS
instances following the flowchart.
1. Address any weaknesses found during the investigation
• CSIRT must fix the vulnerabilities of the preparation, identification, and
containment before restoring the information. The attacker(s) may infect
the system again. The technical members can activate new AWS native
security tools or rely on third-party applications or devices.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 96
2. Recover RDS instances
• If AWS Backup or the custom backup solution includes all the breached
instances, the CSIRT can remove all the compromised or unauthorized
databases or snapshots. This step ensures that the attacker(s) will not
have access to the system in the future.
• The DevOps team members can create new Amazon RDS instances and
transfer all the data using the chosen backup technology.
Figure 5.14: Navigating to the backup section of Amazon RDS on the AWS man-
agement console, CSIRT can select Restore, initiating the recovery of the instances
[51].
3. Continue monitoring the activity
• Restoring the Amazon RDS instances does not mean the risk of new
attacks is mitigated. CSIRT must continue monitoring the situation
through GuardDury and other security tools. The team must always
be ready for a rapid response.
After restoring the production environment, CSIRT can advance to the post-mortem
activities, focusing on final reporting and discussion with executives.
5.1 RANSOMWARE IN AWS INCIDENT RESPONSE PLAN 97
Lessons Learned
The incident response post-mortem tasks for ransom attacks targeting Amazon RDS
are parallel to the ones of Amazon S3. CSIRT must have recorded all the activities
in a ticket or a document that must be shared with Bynder’s top-level executives.
Together with them, the incident response manager and some operative members
must exchange opinions about how they dealt with the incident, referring to the
review questions of the "Ransomware in AWS" playbook. This opportunity can
improve the procedure’s effectiveness and strengthen the overall security posture.
After implementing the Ransomware in AWS security runbooks, testing their appli-
cation is imperative. AWS offers an excellent simulation solution: AWS Cloud-Saga.
Testing concludes the incident response plan for a complete action.
6 Incident Response Testing
The ability to effectively respond to incidents is essential to minimize the damages.
However, organizations can deploy thousands of playbooks and runbooks with de-
tailed instructions for the technical incident response team, but they would only
know if they are adequately working once the business is attacked by ransomware
or similar malware. At that moment, the company would discover whether the inci-
dent response manager can adeptly guide the department through the various stages
of the procedures. This approach is hazardous and not adequate for the security
standards that have been examined in this thesis.
For this reason, the business needs a pivotal practice for preparing to navigate
multiple cyber threats before the breach happens: incident response testing. This
chapter explores the importance of incident response simulations, particularly within
AWS, which proposes a service designed to test CSIRT preparation.
6.1 The Role of Simulation in Incident Response
Incident response is an intricate practice that combines human expertise, security
tools, and rapid decision-making. Giddeon N. Angafor, Iryna Yevseyeva, and Ying
He’s study confirmed that tabletop exercises can be crucial in CSIR training because
they encourage collaboration and communication, which are essential within such a
team. Cyber incident simulations are usually made up of executives and interactive
6.2 INCIDENT RESPONSE SIMULATION WITH AWS CLOUDSAGA 99
exercises to test staff with diverse backgrounds and skills to collaborate and share
information [52]. Therefore, simulating data breaches, network compromise, and
other attacks is necessary to anticipate the unpredictable. Responders can train to
deal with real-world scenarios, familiarizing themselves with the process and under-
standing the significance of the communication. Moreover, since the team works
in a high-pressure situation, each individual can learn how to manage the stress
while remaining focused on the tasks. Last but not least, the incident response plan
may appear complete and functional, but on the other hand, testing helps identify
weaknesses and gaps in the procedure.
Defining a simulation plan and practicing under controlled conditions is imperative
for learning how to work together effectively and ensuring a unified response in a
crisis. Amazon recognized that testing is as necessary as the other incident response
steps and created AWS CloudSaga, a tool for simulating attack use cases in safe
cloud environments [53].
6.2 Incident Response Simulation with AWS Cloud-
Saga
AWS CloudSaga is a powerful technology that allows orchestrating and executing
incident response simulations within the AWS domain. The first rule for testing is
to avoid the production environment since we may provoke delays or malfunctions
to the company’s services or, in the most unfortunate cases, compromise the avail-
ability of the resources. Indeed, AWS sets up a protected environment for Cloud-
Saga, permitting it to work freely and without worries about production. Amazon
CloudSaga can validate the response strategies, determine areas of refinement, and
reinforce playbooks and runbooks.
6.2 INCIDENT RESPONSE SIMULATION WITH AWS CLOUDSAGA 100
Users may explore its main features after entering a straightforward command to
install it on the AWS management console.
Figure 6.1: AWS CloudSaga includes several use case scenarios, such as Bitcoin min-
ing, compromised IAM credentials, and exposed resources. They can be customized
to meet specific AWS configurations and architectures.
AWS CloudSaga use cases simulate prominent attack vectors that can lead to a
ransomware violation. For instance, "network changes" create a customized AWS
VPC and modify the Security Groups. Thus, this exposure may be exploited by
a potential attacker, infecting the system with ransomware (Security Groups’ mis-
configuration accepts connections from known malicious IP addresses). However,
the most valuable scenario considering the runbooks deployed in this thesis is, with-
out any doubt, "public resources". CloudSaga creates public S3 buckets and RDS
databases within the AWS account, and their data may be modified or deleted before
leaving a ransom note.
6.2 INCIDENT RESPONSE SIMULATION WITH AWS CLOUDSAGA 101
The security team must monitor AWS security tools to spot the simulated issues
and kick off the incident response protocol. CSIRT can now follow the runbooks,
developing the team coordination and modeling the joint decision-making actions.
At the end of the simulation, Amazon CloudSaga deletes all the unexpected re-
sources and generates detailed reports to highlight strengths, weaknesses, and areas
of improvement, depending on how the group reacted.
In conclusion, the significance of incident response simulations in AWS becomes
clear. By proactively preparing for the worst, organizations can achieve the best
outcomes, continuously adapting to a dynamic cloud ecosystem. Regrettably, as
previously mentioned, I did not have time to try out AWS CloudSaga within Bynder,
even though it would have been a perfect occasion to point out the strengths and
flaws of the incident response plans. Nonetheless, every now and then, InfoSec and
DevOps departments schedule meetings to test the performance of the manuals, and
they will use this powerful resource.
7 Conclusion
7.1 Evaluation and Results
The journey undertaken in this thesis culminated in creating accurate incident re-
sponse playbooks and runbooks for Amazon Web Services ransomware scenarios, fo-
cusing on Amazon S3 and Amazon RDS. These blow-by-blow guides offer a roadmap
that not only Bynder but other organizations can follow when facing the challenge of
a ransomware incident. They may readjust these complex playbooks and runbooks,
considering their cloud architecture.
An AWS Customer Support Specialist was involved in the process, providing valu-
able insights and validating the efficacy of the developed playbooks and runbooks.
He suggested maintaining documentation to register each step during the prepara-
tion phase, guaranteeing a more robust and transparent incident response process.
Additionally, the AWS support team must be involved in the strategy since their
Technical Account Manager (TAM) has more experience dealing with cloud threats
and can help the incident response manager administrate the activities. TAM’s con-
tact must be included in the runbooks to communicate the violation immediately.
In conclusion, the objective was to bolster the security posture of Bynder with this
comprehensive solution covering preparation, detection, containment, eradication,
7.2 FINAL REMARKS 103
recovery, and lessons learned, ensuring a holistic approach to incident response.
The result was accomplished because this study provided tailored incident response
documentation to enhance the security landscape of the AWS domain. Amazon
S3 and RDS are critical services for all the companies based on AWS as a cloud
provider. Completing playbooks and runbooks for the ransomware use case equips
them with a structured approach to counteract threats effectively and with faster
decision-making.
7.2 Final Remarks
As we conclude, it is essential to recognize the impact of this thesis on the companies
relying on AWS cloud infrastructure. Businesses with a general document describing
their incident response approach without practical advice on carrying out the pro-
cedure would be in trouble while facing ransomware targeting the AWS resources.
Lack of communication, poor automation, and no decision-making experience dur-
ing high-pressure situations would have a critical impact on the business. The initial
problem of ransomware targeting AWS cloud services has been met with a resolute
response, a suite of procedures guided by Amazon Web Services, NIST, and SANS
best practices. Overall, the infrastructure vulnerable to many evolving cloud threats
has been transformed into a fortress with a step-by-step, well-structured incident
response plan.
7.3 FUTURE TRENDS 104
7.3 Future Trends
The cloud and cyber threat landscape is constantly changing. This study will not
always be able to face new threat actors targeting Amazon Web Service resources.
Hence, the future trends section suggests how evolving menaces may be handled,
protecting confidentiality, integrity, and availability stored in the cloud.
Extending Playbooks and Runbooks
During the project, I developed a blueprint that can be used for both ransomware
scenarios targeting other AWS resources or even other threats that can impact vir-
tual or tangible assets. The template includes all the incident response stages with
a brief description highlighting its requirements. It can be seen as a framework
for evaluating and creating incident response plans for various security breaches,
strengthening the business across multiple fronts.
Integrating into Threat Modeling
As suggested by AWS Playbook Framework, the evolution of the incident response
need not conclude with creating playbooks and runbooks [54]. Integrating these
resources into the threat modeling process may offer a proactive approach. Threat
modeling identifies and assesses potential security risks in systems, applications, and
operations. Integrating it with security playbooks and runbooks means not only de-
tecting weaknesses but also planning how to respond to potential incidents that
could arise due to those vulnerabilities. This approach ensures an efficient incident
response since playbooks and runbooks are directly aligned with the specific threats
explored. Furthermore, teams are more coordinated and trained to understand the
situation and act quickly to solve the issue. The integration benefits the minimized
impact because security events can be immediately discovered and mitigated, reduc-
ing downtime, data loss, and reputational damage.
7.3 FUTURE TRENDS 105
Combining Cyber Threat Intelligence and Incident Response
As presented by the research article "Healthcare Security Incident Response Strategy
- A Proactive Incident Response (IR) Procedure", Cyber Threat Intelligence (CTI)
collects information about threat actors, such as attack vectors, victims, and courses
of action, and can dramatically upgrade the incident response procedure [55]. In-
deed, CTI can receive real-time information about menaces and directly trigger the
incident response tools, which can detect anomalies promptly and execute auto-
mated actions when necessary. The authors focused on healthcare organizations
but also punctuated the possibility of integrating their CTI-IR process with other
security products or operations, such as SIEM, SOAR, or SOC. Once the strategy
has been established and the team is well-equipped to handle emergencies, enter-
prises can update their playbooks and runbooks to include CTI technologies.
In closing, this thesis illustrated the collaborative efforts required to safeguard dig-
ital assets in an evolving environment targeted by more and more cyber threats.
By innovating the incident response protocols, adapting them to the dynamic cloud
environment, and providing the business with strengthened resources, we are con-
tributing to a more secure digital future.
References
[1] V. Sumina, 26 Cloud Computing Statistics, Facts & Trends for 2023, en, Jul.
2021. [Online]. Available: https://www.cloudwards.net/cloud-computing-
statistics/.
[2] OCloud Solutions, Impacts of Cloud Computing on Businesses | LinkedIn, May
2022. [Online]. Available: https://www.linkedin.com/pulse/impacts-
cloud-computing-businesses-ocloud-solutions/.
[3] Amazon Web Services, Shared Responsibility Model - Amazon Web Services
(AWS), en-US, 2023.
[4] “2023 Cloud Security Report”, en, Checkpoint, Tech. Rep., Feb. 2023.
[5] A. Today, Impact of Ransomware Attacks on Businesses and Individuals, en,
Feb. 2023. [Online]. Available: https://infosecwriteups.com/impact-of-
ransomware-attacks-on-businesses-and-individuals-cc6b35620887.
[6] N. James, 10 of the Biggest Ransomware Attacks in History, en-US, Section:
Security Audit, Feb. 2023. [Online]. Available: https://www.getastra.com/
blog/security-audit/biggest-ransomware-attacks/.
[7] “Zscaler ThreatLabz 2023 Ransomware Report”, en, ThreatLabz Zscaler, Tech.
Rep., 2023.
[8] N. H. Ab Rahman, N. D. W. Cahyani, and K.-K. R. Choo, “Cloud incident
handling and forensic-by-design: Cloud storage as a case study”, eng, Concur-
REFERENCES 107
rency and computation, vol. 29, no. 14, 2017, Place: Hoboken Publisher: Wiley
Subscription Services, Inc, issn: 1532-0626.
[9] P. Cichonski, T. Millar, T. Grance, and K. Scarfone, “Computer Security In-
cident Handling Guide : Recommendations of the National Institute of Stan-
dards and Technology”, en, National Institute of Standards and Technology,
Tech. Rep. NIST SP 800-61r2, Aug. 2012, NIST SP 800–61r2. doi: 10.6028/
NIST . SP . 800 - 61r2. [Online]. Available: https : / / nvlpubs . nist . gov /
nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf.
[10] A. T. Tunggal, What is an Incident Response Plan?, en, Apr. 2023. [Online].
Available: https://www.upguard.com/blog/incident-response-plan.
[11] National Institute of Standards and Technology, “Framework for Improving
Critical Infrastructure Cybersecurity, Version 1.1”, en, National Institute of
Standards and Technology, Gaithersburg, MD, Tech. Rep. NIST CSWP 041620
18, Apr. 2018. doi: 10.6028/NIST.CSWP.04162018. [Online]. Available: http:
//nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.04162018.pdf.
[12] Amazon Web Services, Bynder Case Study, en-US, 2023. [Online]. Available:
https://aws.amazon.com/solutions/case-studies/bynder/.
[13] S. Mitropoulos, D. Patsos, and C. Douligeris, “On Incident Handling and Re-
sponse: A state-of-the-art approach”, en, Computers & Security, vol. 25, no. 5,
pp. 351–370, Jul. 2006, issn: 01674048. doi: 10.1016/j.cose.2005.09.006.
[Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/
S0167404805001574.
[14] M. Souppaya and K. Scarfone, “Guide to Malware Incident Prevention and
Handling for Desktops and Laptops”, en, National Institute of Standards and
Technology, Tech. Rep. NIST SP 800-83r1, Jul. 2013, NIST SP 800–83r1. doi:
REFERENCES 108
10.6028/NIST.SP.800-83r1. [Online]. Available: https://nvlpubs.nist.
gov/nistpubs/SpecialPublications/NIST.SP.800-83r1.pdf.
[15] Art. 33 GDPR – Notification of a personal data breach to the supervisory
authority, en-US, Apr. 2016. [Online]. Available: https://gdpr-info.eu/
art-33-gdpr/.
[16] P. Kral, Incident Handler’s Handbook, en, Dec. 2011.
[17] J. Azar,Master the Six Phases of Incident Response, en, Webinar, Online, Mar.
2023. [Online]. Available: https://thehacker.news/incident-response-
phases?source=upcoming.
[18] A. Wolter, The Need-to-know principle, en, Feb. 2021. [Online]. Available:
https://techcommunity.microsoft.com/t5/azure-sql-blog/security-
the-need-to-know-principle/ba-p/2112393.
[19] K. Amoresano and B. Yankson, “Human Error - A Critical Contributing Factor
to the Rise in Data Breaches: A Case Study of Higher Education”, Holistica
– Journal of Business and Public Administration, vol. 14, no. 1, pp. 110–132,
2023. doi: doi:10.2478/hjbpa- 2023- 0007. [Online]. Available: https:
//doi.org/10.2478/hjbpa-2023-0007.
[20] E. Segal, ‘alert fatigue’ can lead to missed cyber threats and staff retention/re-
cruitment issues: Study, en, Forbes, Nov. 2021.
[21] S. Gatlan, CISA releases cybersecurity response plans for federal agencies, en-
us, Nov. 2021. [Online]. Available: https://www.bleepingcomputer.com/
news/security/cisa-releases-cybersecurity-response-plans-for-
federal-agencies/.
[22] J. R. Biden Jr., Executive Order on Improving the Nation’s Cybersecurity,
en-US, May 2021. [Online]. Available: https : / / www . whitehouse . gov /
REFERENCES 109
briefing-room/presidential-actions/2021/05/12/executive-order-
on-improving-the-nations-cybersecurity/.
[23] N. Case and P. Hawkins, AWS re:Invent 2019: Prepare for & respond to se-
curity incidents in your AWS environment (SEC356), en, Dec. 2019. [Online].
Available: https://youtu.be/8uiO0Z5meCs?t=2097&si=6tu6DC9dKsvdW69B.
[24] A. Shaked, Y. Cherdantseva, P. Burnap, and P. Maynard, “Operations-informed
Incident Response Playbooks”, Computers & Security, p. 103 454, Aug. 2023,
issn: 0167-4048. doi: https://doi.org/10.1016/j.cose.2023.103454.
[Online]. Available: https://www.sciencedirect.com/science/article/
pii/S0167404823003644.
[25] Amazon Web Services, AWS Well-Architected Framework, en, 2023. [Online].
Available: https://docs.aws.amazon.com/wellarchitected/latest/
framework/welcome.html.
[26] Amazon Web Services, AWS Customer Playbook Framework, Sep. 2023. [On-
line]. Available: https : / / github . com / aws - samples / aws - customer -
playbook-framework.
[27] P. Serrador and J. K. Pinto, “Does Agile work? — A quantitative analysis
of agile project success”, en, International Journal of Project Management,
vol. 33, no. 5, pp. 1040–1051, Jul. 2015, issn: 02637863. doi: 10 . 1016 /
j . ijproman . 2015 . 01 . 006. [Online]. Available: https : / / linkinghub .
elsevier.com/retrieve/pii/S0263786315000071.
[28] Joint Task Force Transformation Initiative, “Guide for conducting risk assess-
ments”, en, National Institute of Standards and Technology, Gaithersburg,
MD, Tech. Rep. NIST SP 800-30r1, Sep. 2012, Edition: 0, NIST SP 800–30r1.
doi: 10.6028/NIST.SP.800-30r1. [Online]. Available: https://nvlpubs.
nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-30r1.pdf.
REFERENCES 110
[29] V. Vicente, The Risk Assessment Matrix: What Is It and Why Is It Important?,
en-US, May 2023. [Online]. Available: https://www.auditboard.com/blog/
what-is-a-risk-assessment-matrix/.
[30] F. Richter, Amazon Maintains Lead in the Cloud Market, en, Aug. 2023. [On-
line]. Available: https://www.statista.com/chart/18819/worldwide-
market-share-of-leading-cloud-infrastructure-service-providers.
[31] V. Engström, P. Johnson, R. Lagerström, E. Ringdahl, and M. Wällstedt,
“Automated Security Assessments of Amazon Web Services Environments”,
ACM Trans. Priv. Secur., vol. 26, no. 2, Mar. 2023, Place: New York, NY,
USA Publisher: Association for Computing Machinery, issn: 2471-2566. doi:
10.1145/3570903. [Online]. Available: https://doi.org/10.1145/3570903.
[32] S. Kepil, The Top 8 AWS Security Risks: What You Need to Know, en, Oct.
2022. [Online]. Available: https://medium.com/@serdalkepil/the-top-8-
aws-cloud-security-risks-what-you-need-to-know-786b3144fee8.
[33] E. Chickowski, Leaky Buckets: 10 Worst Amazon S3 Breaches, en, Bitdefender
Blog, Jan. 2018.
[34] G. Rosenthal, Common RDS Misconfigurations That Can Damage Your Cloud
Data Security Posture, en, 2023. [Online]. Available: https://www.eureka.
security / post / common - rds - misconfigurations - that - can - damage -
your-cloud-data-security-posture.
[35] A. Meridian, Incident response planning, en-US, Dec. 2021. [Online]. Available:
https://www.twitch.tv/videos/1235087932.
[36] A. S. George, A. S. H. George, T. Baskar, and D. Pandey, “XDR: The Evolu-
tion of Endpoint Security Solutions - Superior Extensibility and Analytics to
Satisfy the Organizational Needs of the Future”, en, International Journal of
REFERENCES 111
Advanced Research in Science, Communication and Technology, pp. 493–501,
Aug. 2021, issn: 2581-9429. doi: 10.48175/IJARSCT-1888.
[37] Z. Ahmad, A. Shahid Khan, C. Wai Shiang, J. Abdullah, and F. Ahmad,
“Network intrusion detection system: A systematic study of machine learning
and deep learning approaches”, en, Transactions on Emerging Telecommuni-
cations Technologies, vol. 32, no. 1, e4150, Sep. 2020, issn: 2161-3915. doi:
10.1002/ett.4150.
[38] V. Tzvetkov, How to perform automated incident response in a multi-account
environment, en-US, Section: Advanced (300), Jun. 2020. [Online]. Available:
https://aws.amazon.com/blogs/security/how-to-perform-automated-
incident-response-multi-account-environment/.
[39] Amazon Web Services, What is AWS Security Hub?, en, 2023. [Online]. Avail-
able: https://docs.aws.amazon.com/securityhub/latest/userguide/
what-is-securityhub.html.
[40] M. Ramesh, Use IAM Access Analyzer to generate IAM policies based on access
activity found in your organization trail | AWS Security Blog, en-US, Section:
Amazon Simple Storage Service (S3), Aug. 2021. [Online]. Available: https:
//aws.amazon.com/blogs/security/use- iam- access- analyzer- to-
generate-iam-policies-based-on-access-activity-found-in-your-
organization-trail/.
[41] AWS Public Sector Blog Team, Assess your security posture to identify and
remediate security gaps susceptible to ransomware, en-US, Section: Announce-
ments, Aug. 2020. [Online]. Available: https://aws.amazon.com/blogs/
publicsector/assess-your-security-posture-identify-remediate-
security-gaps-ransomware/.
REFERENCES 112
[42] J. Haggerty and T. Hughes-Roberts, “Visualization of System Log Files for
Post-incident Analysis and Response”, in Human Aspects of Information Secu-
rity, Privacy, and Trust, T. Tryfonas and I. Askoxylakis, Eds., Cham: Springer
International Publishing, 2014, pp. 23–32, isbn: 978-3-319-07620-1.
[43] A. Meridian, Log analysis with Athena Bootstrap, en-US, Dec. 2021. [Online].
Available: https://www.twitch.tv/videos/1235087932.
[44] J. McKiddy, Introducing Assisted Log Enabler for AWS, en-US, Section: Ama-
zon Simple Storage Service (S3), May 2021. [Online]. Available: https://aws.
amazon.com/blogs/opensource/introducing-assisted-log-enabler-
for-aws/.
[45] Amazon Web Services, “Securing your AWS Cloud environment from ran-
somware”, en-US, Amazon Web Services, Tech. Rep., Apr. 2020. [Online].
Available: https://d1.awsstatic.com/WWPS/pdf/AWSPS_ransomware_
ebook_Apr-2020.pdf.
[46] B. Dispensa, Ransomware mitigation: Top 5 protections and recovery prepara-
tion actions, en-US, Section: Intermediate (200), Sep. 2021. [Online]. Available:
https://aws.amazon.com/blogs/security/ransomware-mitigation-top-
5-protections-and-recovery-preparation-actions/.
[47] M. O’Neil, K. Dickinson, and K. Ram, The anatomy of ransomware event
targeting data residing in Amazon S3, en-US, Section: Intermediate (200),
Feb. 2023. [Online]. Available: https://aws.amazon.com/blogs/security/
anatomy-of-a-ransomware-event-targeting-data-in-amazon-s3/.
[48] M. Jones and D. Martinez, Detect threats to your data stored in RDS databases
by using GuardDuty, en-US, Section: Advanced (300), May 2023. [Online].
Available: https://aws.amazon.com/blogs/security/detect-threats-
to-your-data-stored-in-rds-databases-by-using-guardduty/.
REFERENCES 113
[49] R. Warren and J. Miller, Investigate VPC flow with Amazon Detective, en-
US, Section: Advanced (300), Nov. 2020. [Online]. Available: https://aws.
amazon.com/blogs/security/investigate- vpc- flow- with- amazon-
detective/.
[50] Amazon Web Services, Controlling permissions for temporary security cre-
dentials - AWS Identity and Access Management, 2023. [Online]. Available:
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_
temp_control-access.html.
[51] Amazon Web Services, Amazon RDS Backup & Restore using AWS Backup,
en-US, 2023. [Online]. Available: https : / / aws . amazon . com / getting -
started/hands-on/amazon-rds-backup-restore-using-aws-backup/.
[52] G. N. Angafor, I. Yevseyeva, and Y. He, “Game-based learning: A review of
tabletop exercises for cybersecurity incident response training”, en, SECU-
RITY AND PRIVACY, vol. 3, no. 6, e126, Jul. 2020, issn: 2475-6725, 2475-
6725. doi: 10.1002/spy2.126.
[53] Amazon Web Services, AWS CloudSaga - Simulate security events in AWS,
Jul. 2022. [Online]. Available: https://github.com/awslabs/aws-cloudsaga.
[54] Amazon Web Services, Responding to Ransom Attacks within AWS, en, Feb.
2022. [Online]. Available: https://github.com/aws-samples/aws-customer-
playbook-framework/blob/main/docs/Responding_to_Ransom_in_AWS.
md.
[55] Y. He, L. Maglaras, A. Aliyu, and C. Luo, “Healthcare Security Incident Re-
sponse Strategy - A Proactive Incident Response (IR) Procedure”, English,
Security and Communication Networks, vol. 2022, Feb. 2022, Place: London,
United Kingdom Publisher: Hindawi Limited, issn: 19390114. doi: 10.1155/
2022/2775249.