This post was coauthored by Diana Kelley, Cybersecurity Field CTO, and Siân John, EMEA Chief Security Advisor, Cybersecurity Solutions Group.
You’ve got a big dinner planned and your dishwasher goes on the fritz. You call the repair company and are lucky enough to get an appointment for that afternoon. The repairperson shows up and says, “Yes, it’s broken, but to figure out why I will need to run some tests.” They start to unplug your dishwasher and pull it away from the wall. “What are you doing?” you ask. “I’m taking it back to our repair shop for analysis and then repair,” they reply. At this point, you’re annoyed. Your dinner is in three hours, and hauling the dishwasher all the way back to the shop for analysis means someone will be washing dishes by hand afterward. Why not test it right here and right now so it can be fixed on the spot?
Now, imagine the dishwasher is critical business data located throughout your organization. Sending all that data to a centralized location for analysis will give you insights, eventually, but not when you really need them, which is now. In cases where the data is extremely large, you may not be able to move it at all. Instead, it makes more sense to bring services and applications to your data. This is at the heart of a concept called “data gravity,” described by Dave McCrory back in 2010. Much like a planet, your data has mass, and the bigger that mass, the greater its gravitational pull, or gravity well, and the more likely that apps and services are drawn to it. Gravitational movement accelerates when bandwidth and latency are at a premium, because the closer you are to something, the faster you can process and act on it. This is the big driver of the intelligent cloud/intelligent edge: we bring analytics and compute to connected devices to make use of all the data they collect in near real-time.
But what might not be so obvious is what, if anything, data gravity has to do with cybersecurity and the security operations center (SOC) of tomorrow. To have that discussion, let’s step back and look at the traditional SOC, built on security information and event management (SIEM) solutions developed at the turn of the century. The very first SIEM solutions were predominantly focused on log aggregation: log information from core security tools like firewalls, intrusion detection systems, and anti-virus/malware tools was collected from all over a company and moved to a single repository for processing.
That may not sound super exciting from our current vantage point of 2018, but back in 2000 it was groundbreaking. Admins were struggling with an increasing number of security tools and the ever-expanding logs from those tools. Early SIEM solutions gave them a way to collect all that data and apply security intelligence and analytics to it. The hope was that if we could gather all relevant security log and reporting data into one place, we could apply rules, quickly surface insights about threats to our systems, and improve security situational awareness. In a way, this was anti–data gravity: the data moved to the applications and services rather than vice versa.
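To make that centralized, anti–data gravity flow concrete, here is a minimal sketch in Python. The source names, log lines, and rule logic are purely illustrative assumptions, not the behavior or API of any real SIEM product: every raw record has to travel to a single repository before any analysis can run against it.

```python
from datetime import datetime

# Hypothetical log sources; a real deployment would pull these from
# firewalls, IDS sensors, and anti-malware agents over syslog or APIs.
LOG_SOURCES = {
    "firewall": ["2018-06-01T10:00:01 DENY 203.0.113.7 -> 10.0.0.5:445"],
    "ids": ["2018-06-01T10:00:03 ALERT port-scan from 203.0.113.7"],
    "antivirus": ["2018-06-01T10:05:00 CLEAN scheduled scan finished on HOST-42"],
}

central_repository = []  # the single store every record must travel to


def collect_all_logs():
    """Ship every raw log line to the central repository (data moves to the analytics)."""
    for source, lines in LOG_SOURCES.items():
        for line in lines:
            central_repository.append(
                {"source": source, "raw": line, "ingested_at": datetime.utcnow()}
            )


def run_rules():
    """Correlation rules can only run after ingestion, which is where latency creeps in."""
    return [r for r in central_repository if "DENY" in r["raw"] or "ALERT" in r["raw"]]


collect_all_logs()
for hit in run_rules():
    print(f"[{hit['source']}] {hit['raw']}")
```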
After the initial “hype” for SIEM solutions, SOC managers ran into a few of their limitations. Writing rules for security analytics proved to be quite hard: a minor error in a rule led to high false-positive rates that ate into analysts’ investigative time. Many companies were unable to get all the critical log data into the SIEM, leading to false negatives and expensive blind spots. And one of the biggest concerns with traditional SIEM was latency. SIEM solutions were marketed as “real-time” analytics, but once an action was written to a log, collected, sent to the SIEM, and then parsed by the SIEM analytics engine, quite a bit of latency had been introduced. When it comes to responding to fast-moving cyberthreats, latency is a distinct disadvantage.
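As a simple illustration of how brittle hand-written rules can be, the hypothetical snippet below (made-up accounts and thresholds, not a real detection rule) shows how a threshold set just a little too aggressively turns routine activity into a flood of false positives:

```python
# A simplified, hypothetical brute-force detection rule. The only difference
# between the two calls below is the threshold, yet the stricter one alerts
# on perfectly normal behavior.

failed_logins = {"alice": 3, "bob": 2, "svc-backup": 6, "attacker": 48}


def alerts(threshold):
    """Flag any account whose failed-login count exceeds the threshold."""
    return [user for user, count in failed_logins.items() if count > threshold]


print(alerts(threshold=25))  # ['attacker']           -- the intended behavior
print(alerts(threshold=2))   # three of four accounts -- a false-positive flood
```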
Now think about these challenges and add the explosive amounts of data generated today by the cloud and millions of connected devices. In this environment, it’s not uncommon for threat campaigns to go unnoticed by an overloaded SIEM analytics engine. And many of the signals that do get through are never investigated because the security analysts are overworked. Which brings us back to data gravity.
What was one of the forcing factors for data gravity? Low tolerance for latency. What was the other? Building applications by applying insights and machine learning to data. So how can we build the SOC of tomorrow? By respecting the law of data gravity. If we can perform security analytics close to where the data already is, we can increase the speed of response. This doesn’t mean the end of aggregation. Tomorrow’s SOC will employ a hybrid approach by performing analytics as close to the data mass as possible, and then rolling up insights, as needed, to a larger central SOC repository for additional analysis and insight across different gravity wells.
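One way to picture that hybrid model is the sketch below. The site names, event types, and functions are hypothetical and for illustration only: each gravity well runs lightweight analytics locally and forwards only compact insights, not raw logs, to the central SOC repository for correlation across sites.

```python
from collections import Counter

# Raw events stay in the gravity well where they were generated (hypothetical sites).
gravity_wells = {
    "factory-edge": ["login_fail", "login_fail", "login_fail", "login_ok"],
    "cloud-tenant-a": ["malware_blocked", "login_ok", "login_ok"],
}


def local_analytics(site, events):
    """Analyze next to the data; return only a small, summarized insight."""
    counts = Counter(events)
    suspicious = counts["login_fail"] >= 3 or counts["malware_blocked"] > 0
    return {"site": site, "suspicious": suspicious, "summary": dict(counts)}


def central_soc(insights):
    """The central repository correlates compact insights across gravity wells."""
    flagged = [i["site"] for i in insights if i["suspicious"]]
    return f"Sites needing investigation: {flagged}"


insights = [local_analytics(site, events) for site, events in gravity_wells.items()]
print(central_soc(insights))
```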
Does this sound like an intriguing idea? We think so. As practitioners, though, what we most appreciate is when great theories are turned into real-world implementations. Please stay tuned for part 2 of this blog series, where we put the concept of tomorrow’s SOC and data gravity into practice for today.