Know Thyself Through Data-Driven Security Q&A
over time, what’s the relationship of one system to another system?”
To answer that kind of contextual question, White and Collins say their team has had success leveraging NetFlow data coming off the organization's Cisco network infrastructure.
“That tells you in summary form who is talking to who and over what port,” White says. “We know, for example, one system that belongs to our ecommerce site, then based on that NetFlow data we can say, ‘OK, well, who does that system talk to? Well it talks to these two app servers and these two app servers talk to these systems and it looks like they’re talking this database language.’”
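The idea White describes — summarizing flow records into a map of which system talks to which, over what port, and walking outward from a known host — can be sketched in a few lines. The records and host names below are invented stand-ins for summarized NetFlow fields, not real Cisco export data:

```python
from collections import defaultdict

# Hypothetical, simplified flow records: (source, destination, dest_port).
# Stand-ins for summarized NetFlow export fields, not real appliance output.
flows = [
    ("web-ecom-01", "app-01", 8443),
    ("web-ecom-01", "app-02", 8443),
    ("app-01", "db-01", 1521),
    ("app-02", "db-01", 1521),
]

def peer_map(records):
    """Build a 'who talks to whom, over what port' summary."""
    peers = defaultdict(set)
    for src, dst, port in records:
        peers[src].add((dst, port))
    return peers

talks_to = peer_map(flows)
# Starting from the known ecommerce web server, walk one hop outward:
for dst, port in sorted(talks_to["web-ecom-01"]):
    print(f"web-ecom-01 -> {dst}:{port}")
```

Repeating the lookup on each discovered peer (here, the app servers) reveals the next tier, such as the database listening on a well-known database port.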
[Are you missing the downsides of big data security analysis? See 3 Inconvenient Truths About Big Data In Security Analysis.]
Putting it together for meaningful answers through metrics
So what does all that correlation and contextualization look like in the real world? According to Collins, it can mean the difference between handing a business unit a report that says it has x number of vulnerabilities across a laundry list of assets and handing it an enterprise threat readiness report.
“Since we’ve taken in more data, we’ve asked more complicated security questions, we’ve correlated that data and we’ve added this rich context, we’re able to say, here’s the different vulnerabilities broken down by insider threat, outsider threat, by regulation, by each individual threat and also going across the columns by the business unit,” he says.
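The breakdown Collins describes is essentially a pivot: each finding, once enriched with context, lands in a cell keyed by business unit and threat category. A minimal sketch, using invented records rather than the team's actual schema:

```python
from collections import Counter

# Hypothetical vulnerability findings already enriched with context:
# (business_unit, threat_category). Names are illustrative only.
findings = [
    ("ecommerce", "outsider"),
    ("ecommerce", "outsider"),
    ("ecommerce", "regulation"),
    ("finance", "insider"),
    ("finance", "regulation"),
]

def breakdown(records):
    """Tally findings per (business unit, threat category) cell."""
    return Counter(records)

cells = breakdown(findings)
for (unit, category), count in sorted(cells.items()):
    print(f"{unit:10s} {category:12s} {count}")
```

Printed row by row, the tally reads like the report Collins describes: threat categories down the rows, business units across the columns.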
As for security Q&A, the probing questions are based on what the organization needs to know, not on what data is offered ready-made by a security tool.
For example, they say their organization has asked ‘Which users have the worst security behavior?’ and, by correlating system configuration information, web proxy events and malware events, they learned that 90 percent of the problems come from 1 percent of the users.
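Once events from those sources are tagged with a user identifier, surfacing that kind of skew is a ranking problem. A hedged sketch with invented user IDs and event counts chosen to mirror the 90-percent figure:

```python
from collections import Counter

# Hypothetical per-user security events merged from several sources
# (system configuration findings, web proxy events, malware detections).
events = (
    ["uid42"] * 90    # one user generating most of the noise
    + ["uid07"] * 6
    + ["uid13"] * 4
)

def worst_offenders(user_events, top_n=3):
    """Rank users by event count and report each one's share of the total."""
    counts = Counter(user_events)
    total = len(user_events)
    return [(user, n, n / total) for user, n in counts.most_common(top_n)]

for user, n, share in worst_offenders(events):
    print(f"{user}: {n} events ({share:.0%})")
```

The top of the ranking is the natural target list for the follow-up awareness training White mentions.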
“Which really sets us up to do targeted follow-up security awareness training,” White says.
What’s more, they took that a step further and asked ‘Which users are the riskiest users?’ tying the answers from the previous question to the organization’s application risk catalog and user permissions to see how bad behavior looked across populations of users with access to the highest-priority applications.
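That second question is a join: intersect each user's entitlements with the set of high-priority applications, then rank what remains by the bad-behavior signal. A minimal sketch, with scores, user IDs, and application names all invented for illustration:

```python
# Hypothetical bad-behavior scores (e.g., event counts from the
# previous question) and per-user application entitlements.
behavior_score = {"uid42": 90, "uid07": 6, "uid13": 4}
app_access = {
    "uid42": {"payroll", "wiki"},
    "uid07": {"wiki"},
    "uid13": {"payroll"},
}
high_priority_apps = {"payroll"}  # drawn from an application risk catalog

def riskiest_users(scores, access, critical):
    """Users whose access touches critical apps, ranked by bad behavior."""
    return sorted(
        (user for user, apps in access.items() if apps & critical),
        key=lambda u: scores.get(u, 0),
        reverse=True,
    )

print(riskiest_users(behavior_score, app_access, high_priority_apps))
```

Here uid07 drops out despite misbehaving, because none of that user's access touches a high-priority application.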
Like building up muscle through regular exercise, regularly asking and answering difficult security questions hones thought processes about data collection and correlation that can yield creative answers to some of the toughest metrics problems. For example, one of the most ‘intractable’ problems faced by White and plenty of others in the industry is understanding where sensitive data resides in unstructured data stores and who has access to those repositories.
In his organization’s case, answering that question meant using a Google appliance: pointing it at the organization’s systems and configuring it to crawl and index unstructured data so that his team could run regular expressions against the indexed content.
“You get the uniform resource locator and the filename and type of content found and the number of those records,” he says, explaining that by combining that with Active Directory information for user permissions to fileshares or SharePoint, his team can pinpoint who has access to the sensitive information.
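The pattern White describes — run a regular expression over indexed content, then map each hit's location to the directory permissions pulled from Active Directory — can be sketched as follows. The index entries, share paths, and permission data here are hypothetical stand-ins, not appliance or AD output:

```python
import re

# Hypothetical index entries: (url, file_type, content snippet) --
# a stand-in for what a search appliance's crawl might surface.
indexed = [
    ("//fileshare/hr/salaries.xlsx", "xlsx", "SSN 123-45-6789"),
    ("//fileshare/eng/notes.txt", "txt", "meeting agenda"),
]
# Hypothetical share-level read permissions, as if pulled from AD.
share_readers = {
    "//fileshare/hr": {"uid42", "uid07"},
    "//fileshare/eng": {"uid13"},
}

# Example sensitive-data pattern: U.S. Social Security numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def sensitive_hits(entries, permissions):
    """Match the sensitive pattern, then map each hit to who can read it."""
    hits = []
    for url, ftype, text in entries:
        if SSN_RE.search(text):
            share = url.rsplit("/", 1)[0]  # parent share of the file
            hits.append((url, ftype, sorted(permissions.get(share, ()))))
    return hits

print(sensitive_hits(indexed, share_readers))
```

The output pairs each sensitive location with the users who can reach it, which is exactly the "who has access to what" answer the question demands.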
As other organizations seek to engage in data-driven Q&A like White and Collins’ organization did, Collins says a real key to the correlation and contextualization process is ensuring that there’s a common language for the data sets. It’s also important to understand who the owners are for every asset and every system.
“It’s great you collect this stuff,” he says, “but if you don’t have anyone you can communicate back to and have them act on it, it’s not really that valuable.”