Ulf Mattsson, Protegrity Corp.: A Realistic, Cost-Effective Approach for Securing Data Throughout Its Lifecycle
June 2009 by Ulf Mattsson, CTO, Protegrity Corporation
Data breaches at companies such as payment card processors Heartland Payment Systems and RBS WorldPay, as well as retailer Hannaford Bros., all of which had been certified compliant with the Payment Card Industry Data Security Standard (PCI DSS), have proven a sad truth: compliance is no guarantee that an enterprise won’t suffer a data breach.
The problem is that PCI, like other industry and government security standards, addresses only segments of an enterprise ecosystem. Moreover, PCI is really meant to be a starting point, a set of basic best practices, not a destination. Consequently, PCI has some notable gaps, the most critical being its current strong focus on encryption for data at rest. While most data is at rest much of the time, securing only stored data leaves it unprotected at the point of acquisition and in transit, and attacks on these points are occurring with increasing frequency.
As Rep. Yvette Clarke (D-N.Y.) said at a recent hearing on data security held in the U.S. House of Representatives, it’s time to “dispel the myth once and for all that PCI compliance is enough to keep a company secure." All enterprises need to move beyond basic regulatory compliance and develop their own customized plans to manage and protect data throughout its entire lifecycle. A risk-based classification process enables businesses to determine their most significant security exposures, target their budgets towards addressing the most critical issues and achieve the right balance between cost and security.
In this interview Protegrity’s CTO Ulf Mattsson discusses the risk-analysis processes that can help companies achieve cost-savings while measurably enhancing their overall data security profile with a holistic plan that protects data from acquisition to deletion.
Can you give me an overview of the risk-based analysis process specifically as it pertains to data security?
Logic tells us that very sensitive data needs to be protected from the moment of capture and throughout its lifecycle, while other types of data will need lesser levels of protection. But some enterprise data security plans center on a "more is better" concept: locking everything down with the strongest available protection, which results in unnecessary expense, frequent availability problems and system performance lags. Alternatively, IT will sometimes shape its data security efforts around the demands of compliance and best-practices guidance, and then find itself struggling with fractured security projects and the never-ending task of staying abreast of regulatory changes.
At Protegrity, we advocate an approach that centers on an analysis of an organization’s unique data risk factors and the use of a risk-adjusted methodology to determine the appropriate data-protection processes, policies and solutions for that organization. Classifying data precisely according to risk level, factoring in both the value of the data and the probability of its exposure, enables an enterprise to develop a sensible plan to invest budget and effort where they matter most.
What’s the first step in developing a risk-based data security plan?
You begin by determining the risk profile of all data collected and stored by the enterprise, and then classify that data according to its designated risk level. That may sound complicated, but it’s really just a matter of using common sense. Data that is resalable for a profit (typically financial, personally identifiable and confidential information) is high-risk data and requires the most rigorous protection; other data’s protection level should be determined according to its value to your organization and the anticipated cost of its exposure: would business processes be impacted? Would it be difficult to manage media coverage and public response to the breach?
One simple way to determine a risk profile is to assign a numeric value for each class of data; high risk = 5, low risk = 1. Use the same values to grade the odds of exposure. Then multiply the data value by the risk of exposure to determine the risk levels in your enterprise.
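That multiplication can be written down directly; the grades used below are purely illustrative:

```python
def risk_level(data_value, exposure_odds):
    """Both inputs are graded 1 (low) to 5 (high); the product is the relative risk."""
    return data_value * exposure_odds

# Resalable cardholder data stored where many people can reach it: protect first.
print(risk_level(5, 4))   # 20
# Low-value internal data with little exposure: lighter controls suffice.
print(risk_level(1, 2))   # 2
```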
What is the next step?
Mapping the data flow. You need to locate all of the places data resides, including applications, databases, files and data transfers across internal and external networks. The point is to determine where the higher-risk data resides and who has, or could gain, access to it, either internally or via external attacks.
High-risk data residing in places where many people can or could access it is obviously data that needs the strongest possible protection. Data flows within the application environment need to be analyzed to determine how the data is used and where it is truly needed. It is then possible to build a risk profile and begin to systematically address the highest-risk data flows first, such as databases and applications handling SSNs or payment card data. A risk-adjusted methodology will help determine the appropriate solution in each case.
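The mapping step can feed the numeric grading described earlier to produce a prioritized worklist. A minimal sketch, in which every location, data value and access count is hypothetical:

```python
# Hypothetical inventory produced by the data-flow mapping step.
stores = [
    {"location": "payments_db",  "data_value": 5, "access_paths": 40},
    {"location": "web_frontend", "data_value": 5, "access_paths": 200},
    {"location": "hr_files",     "data_value": 3, "access_paths": 10},
    {"location": "marketing_db", "data_value": 1, "access_paths": 60},
]

def exposure_grade(paths, max_paths=200):
    """Wider access means higher odds of exposure; scale access breadth to a 1-5 grade."""
    return max(1, round(5 * paths / max_paths))

for s in stores:
    s["risk"] = s["data_value"] * exposure_grade(s["access_paths"])

# Address the highest-risk flows first.
priorities = sorted(stores, key=lambda s: s["risk"], reverse=True)
for s in priorities:
    print(s["location"], s["risk"])
```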
How can an enterprise use a data flow analysis to choose the best, most cost effective security solutions?
By properly understanding the data flow, we can avoid less cost-effective point solutions and instead implement an enterprise protection strategy. A holistic, layered approach to security is far more powerful than the fragmented practices present at too many companies.
Think of your network as a municipal transit system – the system is not just about the station platforms; the tracks, trains, switches and passengers are equally critical components. But many companies approach security as if they are trying to protect the station platforms, and by focusing on this single detail they lose sight of the importance of securing the flow of information.
Implementing a series of point solutions at each protection point will introduce complexity that will ultimately cause significant rework. Protecting each system or data set as part of an overall strategy and system allows the security team to monitor and effectively administer the encryption environment including managing keys and key domains without creating a multi-headed monster that is difficult to control.
So now we’ve classified the data according to its risk level, and we’ve mapped the data flow. What’s the next step?
Conducting an end-to-end risk analysis on the environment. Best practices stress the need for persistent protection of data in and out of the enterprise, and we’ve seen cases where data has been exposed at points in the network where it’s in cleartext form or leaked from a third-party system.
Among other environmental issues to consider is whether data is being captured on a mobile device off the network. With the emergence of popular platforms like the iPhone, there is more and more interest in exploring mobile platforms to perform new business tasks. However, the issue of data privacy always holds organizations back from embracing these kinds of new business models. Fortunately, this is an area where it’s now possible to protect data on mobile platforms using combinations of technologies like Format-Controlling Encryption (FCE) and automated key management solutions. This technique can also be applied to Point of Sale (POS) systems to enable end-to-end protection from the moment the cardholder data is captured.
Other areas to look at include your outsourcing partnerships, as well as data that’s being used for nonproduction purposes such as third-party marketing analysis or in test and engineering environments. It’s not uncommon for organizations to invest in protecting production systems and data centers yet have live data sitting unprotected on the systems of application developers and other outsourced parties. If live production data is being used in a less controlled environment, attention has to be paid to regulatory compliance and security threats. Here, too, data de-identification technologies like Format-Controlling Encryption and tokenization can help.
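The tokenization idea mentioned above can be sketched in a few lines. This is a hypothetical, simplified vault for illustration only, not any vendor's implementation; a real vault would be an encrypted, access-controlled store:

```python
import secrets

class TokenVault:
    """Toy token vault: live values stay here; only tokens circulate elsewhere."""

    def __init__(self):
        self._vault = {}   # token -> live value (a real vault would be encrypted)

    def tokenize(self, pan):
        # Random digits of the same length, last four preserved so test and
        # billing code that expects card-shaped data keeps working.
        while True:
            token = "".join(secrets.choice("0123456789")
                            for _ in range(len(pan) - 4)) + pan[-4:]
            if token != pan and token not in self._vault:
                self._vault[token] = pan
                return token

    def detokenize(self, token):
        # Only callable inside the controlled production environment.
        return self._vault[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")   # hand `token` to the test environment
```

Because the token is random rather than derived from the card number, nothing recoverable leaves the vault; the outsourced or test environment holds data with no resale value.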
What other issues should an enterprise consider during a risk-based analysis?
You also want to factor in the currently favored attack vectors in order to identify the highest-risk areas in the enterprise ecosystem. Right now web services, databases and data-in-transit are at high risk. Hacking and malware are currently the attack methods of choice among cybercriminals, and they are targeting the application layer and data more than the operating system.
There are two countervailing trends in malware, both likely to continue. One trend is toward the use of highly automated malware that uses basic building blocks and can be easily adapted to identify and exploit new vulnerabilities. This is the malware that exploits unpatched servers, poorly defined firewall rules, the vulnerabilities listed on the OWASP Top Ten, etc. This malware is really aimed at the mass market – small and medium size businesses and consumers.
The other trend is the use of high-end malware which employs the “personal touch” – customization to specific companies’ systems, often combined with social engineering to ensure it is installed in all of the right places. Never underestimate the skills and determination of today’s data criminals. It’s important to ensure that the level of due diligence that you apply to data security equals or exceeds that expended by malicious hackers, who are more than willing to work really, really hard to access that data. And bear in mind that these vectors change; keep an eye on security news sites to stay abreast of how criminals are attempting to steal data, and adjust your risk analysis accordingly.
Here’s an example of what a risk-based attack-vector analysis might look like:
What are some of the most effective newer approaches for data security?
Technologies like Format-Controlling Encryption can make the process of retrofitting encryption into legacy application environments simpler, and they provide protection while the data fields are in use or in transit. This aspect is similar to the tokenization approach (where an alias of the original data points to the real data, or to a secondary database from which the real data can be derived): you’re moving the problem to another place where it may be more cost-effective to address.
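To illustrate the format-preserving idea behind FCE: a ciphertext keeps the all-digits shape of the original field, so legacy schemas and applications accept it unchanged. The sketch below is a toy Feistel construction for illustration only; it is not the NIST FF1/FF3 algorithms, not any vendor's product, and not secure for production use:

```python
import hashlib
import hmac

def _round_value(key, rnd, half_value, modulus):
    """Pseudo-random round function derived from HMAC-SHA256."""
    digest = hmac.new(key, f"{rnd}:{half_value}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus

def fpe_encrypt(key, digits, rounds=8):
    """Encrypt an even-length digit string into a digit string of the same length."""
    half = len(digits) // 2
    modulus = 10 ** half
    left, right = int(digits[:half]), int(digits[half:])
    for rnd in range(rounds):                     # balanced Feistel network
        left, right = right, (left + _round_value(key, rnd, right, modulus)) % modulus
    return f"{left:0{half}d}{right:0{half}d}"

def fpe_decrypt(key, digits, rounds=8):
    """Invert fpe_encrypt by running the Feistel rounds backwards."""
    half = len(digits) // 2
    modulus = 10 ** half
    left, right = int(digits[:half]), int(digits[half:])
    for rnd in reversed(range(rounds)):
        left, right = (right - _round_value(key, rnd, left, modulus)) % modulus, left
    return f"{left:0{half}d}{right:0{half}d}"

protected = fpe_encrypt(b"demo-key", "4111111111111111")
# `protected` is still 16 digits, so existing card-number columns can store it.
```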
Technologies like Transparent Database Field Encryption can also make the process of retrofitting encryption into legacy application environments a lot simpler, but they cannot provide fully transparent protection while the data fields are in use or in transit. The data field in transit will either be in the clear or encrypted into a non-transparent binary form.
Technologies like database activity monitoring, in combination with file-level encryption of databases, can provide appropriate solutions for lower-risk data. Here again, the point is to assess the data’s risk factors and use a risk-adjusted methodology to determine the appropriate solution.
These approaches can be very important in the data warehouse, for example, where aggregation on the cleartext values is frequent. Format-Controlling Encryption may require 10 times more CPU cycles than Transparent Database Field Encryption during decrypt and search operations. File-level encryption of the database can provide fast search operations, since data and index information is in the clear inside the database.
With tokenization or Format-Controlling Encryption, high-risk data is systematically protected from hackers and criminals over its lifecycle under an auditable and controllable process. At the same time, these techniques solve the challenge of separation of duties: employees and administrators who need to manage data don’t always need to see live values like SSNs. PCI is a good example of a regulation that specifically calls for this kind of separation of duties.
The key thing to remember here is that one-size-fits-all security solutions are never the best fit. For example, a company that manages dynamic information such as payment transactions or customer data used in billing systems will have data that is almost constantly in use; it is rare to find databases full of transaction data “offline.” Obviously this is an issue for retailers and companies that process payments, but in a world that is trending toward “always on” network-based services, with criminals who have discovered that data theft is a highly profitable, recession-proof business, any good risk-based data protection strategy has to provide end-to-end security for sensitive data.
Is complete end-to-end data security really a reasonable goal in today’s distributed business environment?
The strongest protection for high-risk data will invariably be end-to-end encryption (or tokenization) of individual data fields. But concerns about performance degradation, invasiveness, application support and the management of broad, heterogeneous database encryption implementations too often become hard barriers to adopting this important security measure. File-level database encryption removes many of these barriers: enterprise-grade solutions are now available for heterogeneous environments, with complete transparency to databases and applications, moderate separation of duties and centralized management. File-level database encryption has proven fairly straightforward to deploy, with minimal performance overhead and convenient key management.
But true end-to-end protection may be extremely complicated to achieve in a complex, multi-entity environment. In this scenario, you’d benefit greatly by using the risk-based approach to select the parts of the data flow that need stronger protection. The risk levels here will depend on the value of the data, data volumes, the servers, connectivity, physical security, HR aspects, geography, compensating controls and other issues. You’d then design and deploy an approach that utilizes risk-adjusted, appropriate layers of data security.
How can a company choose the most cost-effective solutions for managing data security?
Although it’s always admirable to get the most for less, it’s important to keep the return on data security investments in perspective. A recent report by the Ponemon Institute, a privacy and information management research firm, found that data breach incidents cost $202 per compromised record in 2008, with an average total per-incident cost of $6.65 million.
Security spend figures produced by government and private research firms indicate that enterprises can put strong security into place for about 10% of the average cost of a breach. You can find the right balance between cost and security by doing a risk analysis. For example, field-level encryption with good key management may lower the probability of card exposure, say from 2% to 1% in a given year. At a breach cost of $200 per card with 1 million cards at stake, that one-percentage-point reduction is worth $2 million in expected loss, so an appropriate investment in a protection solution with an integrated, sophisticated key management and protection system would be about $2 million.
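The arithmetic behind that example, using only the figures quoted above, can be sketched as:

```python
# Expected-loss calculation for the card-exposure example in the text.
cards_at_risk = 1_000_000
cost_per_card = 200      # dollars per compromised record (Ponemon-style estimate)
p_before      = 0.02     # annual probability of exposure without the control
p_after       = 0.01     # annual probability with field-level encryption in place

annual_risk_reduction = (p_before - p_after) * cards_at_risk * cost_per_card
print(annual_risk_reduction)   # roughly $2 million: a sensible ceiling for the investment
```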
When you’re crunching the numbers, it’s also important to remember that risk-based prioritization alone offers many cost savings that also result in an enhanced enterprise security profile. For example, a comprehensive understanding of where all the sensitive data resides usually leads to a project to reduce the number of places where critical data is stored, which shrinks the number of necessary protection points and results in both better security (less data scattered around the enterprise ecosystem) and lower data protection costs.
Risk-based data security plans also eliminate the all too common and costly triage security model which is ineffective whether you’re triaging based on compliance needs or the security threat of the moment. Replacing triage with a customized, thought-out logical plan that takes into account long range costs and benefits enables enterprises to target their budgets towards addressing the most critical issues.
And by switching your focus to a holistic view rather than the all too common security silo methodology, an enterprise will naturally move away from deploying a series of point solutions at each protection point, which results in redundant costs, invariably creates a multi-headed monster that is difficult to control and introduces complexity that will ultimately cause significant and costly rework.
Is there anything else you’d like to add to the above?
Risk-based analysis is a tried-and-true way to quantify the costs and benefits of an action, and it is often used by businesses, scientists and governments to understand issues at the macro and micro levels and to discover the most effective ways to manage potential threats in scenarios such as natural disasters, environmental conditions, medical procedures and product design. Using risk-based analysis principles to manage data security results in a balanced approach to protecting critical information, delivering enhanced security and reduced costs and labor with the least impact on business processes and the user community.