Information security has become a fixture in the daily headlines, which range from the latest high-profile data breach, to exotic hacks of USB drives, ICS devices, and IoT systems, to new zero-day exploits and attack techniques. While these stories are interesting and help us understand the vulnerabilities and risks that make up the threat landscape, they reflect a frequent industry bias toward the “cool” exploit and detection side of cyber-defense over the more operational response and mitigation side. One result of this focus, as reported in a recent SANS study, is that for over 90% of incidents, the time from incident discovery to remediation was 1 hour or longer.
This appears to be changing, however: new reports shine a spotlight on incident response as both welcome and essential, and courts are now reinforcing that sentiment. While this blog frequently examines examples and trends from the threat landscape, this week we’ll consider the other side of the equation and look at incident response. A comprehensive view of threat management brings people, processes, and tools together in the process outlined below.
1. Investigate
The investigation phase of incident response takes us a step beyond detection. Starting with security alerts from advanced malware detection platforms, SIEM systems, or manual discovery, security analysts can begin to fight back. Customers tell us that in order to respond, they need well-defined processes that add context to security alerts, so they can understand the who, what, and where of an incident, including whether the targeted systems were successfully infected.
The critical steps for investigation include weeding out false positives, which can be costly, and determining the extent and severity of the breach. A common limitation, customers report, is the varying skill levels and manual processes used during this phase. Without strict standards for collecting and analyzing threats, teams can face inconsistent analysis, irregular conclusions, and human errors from manual entry or copying and pasting. Much of the investigation time is spent not on analysis but on finding, querying, collecting, and attempting to organize threat information. The lack of automation in this phase is a major hurdle for organizations that are attempting to manage thousands of events per month or even per week.
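To make the idea of consistent, automated context-gathering concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Alert` fields, the `ASSET_OWNERS` and `CONFIRMED_INFECTIONS` lookup tables, and the `enrich` and `triage` helpers are hypothetical stand-ins for queries against a real directory service, asset inventory, or endpoint agent.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory context sources. A real deployment would query a
# directory service, an asset inventory, and endpoint agents instead.
ASSET_OWNERS = {"10.0.0.12": "alice", "10.0.0.40": "hr-fileserver"}
CONFIRMED_INFECTIONS = {"10.0.0.40"}  # hosts where indicators of compromise were found


@dataclass
class Alert:
    source: str      # product that raised the alert
    target_ip: str   # system the alert refers to
    indicator: str   # e.g. a suspicious domain or file hash
    context: dict = field(default_factory=dict)


def enrich(alert: Alert) -> Alert:
    """Attach the same who/what/where context to every alert."""
    alert.context = {
        "owner": ASSET_OWNERS.get(alert.target_ip, "unknown"),
        "infection_confirmed": alert.target_ip in CONFIRMED_INFECTIONS,
    }
    return alert


def triage(alerts):
    """Split enriched alerts into confirmed infections and unconfirmed alerts."""
    enriched = [enrich(a) for a in alerts]
    confirmed = [a for a in enriched if a.context["infection_confirmed"]]
    unconfirmed = [a for a in enriched if not a.context["infection_confirmed"]]
    return confirmed, unconfirmed
```

Because every alert passes through the same `enrich` step, the result does not depend on which analyst handled it or on what was copied by hand.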
2. Prioritize
Effective investigation provides the context that enables prioritization, the next phase in the IR process:
“Prioritization of incidents should be based upon organizational impact,” says the SANS Institute Incident Handlers Handbook. “For example a single workstation being non-functional can be considered minor … and data being stolen directly from human resources that contain privileged information as high.”
Prioritization enables security teams with limited resources to leverage context and choose which incidents are addressed first, or at all. Customers report that without incident context, meaningful prioritization is out of reach, leaving their security teams to process hundreds of vague “critical alerts”. A common pain point is that each security vendor’s alerts use an independent severity rating system and are ignorant of the targeted systems and the potential impact.
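One way to picture impact-based prioritization is to map each vendor's severity ratings onto a common scale and weight the result by how much the targeted asset matters. The sketch below is only an illustration: the vendor names, the scale mappings, the asset weights, and the doubling for confirmed infections are all assumed values, not figures from any product or from the SANS handbook.

```python
# Hypothetical per-vendor severity scales, mapped onto a common 0.0-1.0 range.
VENDOR_SCALES = {
    "vendor_a": {"low": 0.2, "medium": 0.5, "high": 0.8, "critical": 1.0},
    "vendor_b": {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0},
}

# Hypothetical organizational-impact weights per asset class.
ASSET_IMPACT = {"workstation": 1, "server": 3, "hr_database": 5}


def priority(vendor, raw_severity, asset_class, infection_confirmed):
    """Combine normalized severity with asset impact into one score."""
    severity = VENDOR_SCALES[vendor][raw_severity]
    impact = ASSET_IMPACT.get(asset_class, 1)  # unknown assets get the lowest weight
    score = severity * impact
    # Assumed rule: a confirmed infection doubles the score.
    return score * 2 if infection_confirmed else score


incidents = [
    ("vendor_a", "critical", "workstation", False),
    ("vendor_b", 3, "hr_database", True),
    ("vendor_a", "medium", "server", False),
]

# Highest score first: the HR database incident outranks the "critical" workstation alert.
ranked = sorted(incidents, key=lambda i: priority(*i), reverse=True)
```

With these assumed weights, a vendor's "critical" alert on a single workstation ranks below a mid-severity alert on the HR database, which matches the handbook's example of a minor versus a high-impact incident.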
3. Quarantine and Contain
A common complaint from incident response teams centers on quarantine and containment. Our Threat Response customers tell us that their incident response tools and processes need to answer the question: “What do I do now?” Best practices at this stage often focus on putting compromised users in a “penalty box” and blocking the exfiltration of data, communication with command and control servers, and the dropping of additional malware from remote hosts. These steps can require considerable time and effort, and there are many potential stumbling blocks in the containment process, including managing groups and permissions through the desktop team, convincing the network team to update firewalls and proxies from multiple vendors in multiple geographic locations, and even documenting that the actions were taken.
More recently, customers are seeing increased threats from watering hole attacks and malvertising. Blocking sites that were serving malicious ads is not a one-time action. Security teams must now plan to block a site for a limited period of time, then unblock those specific IPs, domains, and URLs after the attack is cleaned up. Multiply that action across several vendors, multiple devices, and worldwide deployments, then multiply again for each attack, and the complexity of managing the containment process becomes evident.
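The bookkeeping behind time-limited blocks can be sketched as follows. This is a minimal illustration under stated assumptions: the `Block` and `BlockList` classes and the device names are hypothetical, and the code only tracks which blocks should exist and records an audit trail. Actually pushing rules to firewalls or proxies is vendor-specific and is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Block:
    indicator: str        # IP, domain, or URL being blocked
    device: str           # hypothetical name of the enforcing firewall or proxy
    expires_at: datetime  # when the block should be lifted


class BlockList:
    """Tracks temporary blocks per device and logs every action taken."""

    def __init__(self):
        self.blocks = []
        self.audit_log = []  # (timestamp, action, indicator, device)

    def add(self, indicator, devices, duration, now):
        """Record a block for the indicator on every listed device."""
        for device in devices:
            self.blocks.append(Block(indicator, device, now + duration))
            self.audit_log.append((now, "block", indicator, device))

    def expire(self, now):
        """Lift blocks whose time is up and log each unblock."""
        still_active = []
        for block in self.blocks:
            if block.expires_at <= now:
                self.audit_log.append((now, "unblock", block.indicator, block.device))
            else:
                still_active.append(block)
        self.blocks = still_active
```

Calling `add` once per attack and running `expire` on a schedule keeps the block, the unblock, and the documentation of both in one place, instead of spreading them across several teams and vendor consoles.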
4. Remediate
One customer framed their remediation challenges as a joke: “What’s worse than re-imaging ten systems? Re-imaging ten systems when only one was actually infected.” Incident response teams are grateful for any way to reduce false positives, because doing so drastically cuts the work of re-imaging systems, performing piecemeal clean-up, rebuilding systems from scratch, or reconstructing systems with data from known good backups. Customers report that as they improve their security practices and reduce false positives, they have fewer systems to remediate.
The headlines continue to include breaches where automated incident response could have reduced data losses and legal exposure for organizations by speeding investigation and reducing the time to containment. This does not have to be the case if security teams apply structured and automated steps to investigate, prioritize, contain, and remediate threats. Context and automation are essential to managing the flood of events and helping keep organizations out of the headlines.