Ulf Mattsson, CTO Protegrity: Data Breaches, Compliance and New Approaches for Data Protection in the Cloud

February 2011 by Ulf Mattsson, CTO, Protegrity Corporation

Meeting Payment Card Industry Data Security Standard (PCI DSS) compliance is important, but it is also imperative to understand that compliance does not equal security, as PCI DSS was intended to be the floor, not the ceiling. The emergence of cloud computing has added a new wrinkle to the ongoing efforts to improve data security. This article will discuss the factors that must be considered when securing data in the cloud, and how next-generation tokenization protects data as it flows across systems while minimizing PCI compliance costs.

Data Breaches and the Cloud

The Verizon Business RISK team, in cooperation with the United States Secret Service (USSS), publishes an annual Data Breach Investigations Report that studies the common elements and characteristics found in data breaches. Over six years, the combined Verizon Business RISK team and USSS dataset has grown to span more than 900 breaches and over 900 million compromised records.

As in previous years, the 2010 report showed that nearly all data was breached from servers and online applications: 98% of breached data came from servers, with hacking and malware the most dominant attack methods. Financial services, hospitality, and retail comprised the “Big Three” industries, accounting for 33%, 23%, and 15% of all data breaches, respectively. The targeting of financial organizations is hardly shocking, as financial records represent the nearest approximation to actual cash for the criminal. An astounding 94% of all compromised records (note: records differ from breaches) in 2009 were attributed to financial services.

Financial firms hold large volumes of sensitive consumer data for long periods of time and consequently fall under very stringent regulatory requirements: they must submit remediation validation records if data is found to be vulnerable, as well as regular compliance reports proving that they adequately secure the data they hold. Despite these stringent standards, 79% of financial firms whose data had been breached had failed to meet PCI DSS compliance, the minimum security measure. Organizations have therefore been searching for a solution that protects the business from endpoint to endpoint while efficiently meeting compliance requirements.

In addition to the constantly evolving security threats that must be mitigated, enterprises are quickly adopting cloud computing practices that add a new element to the data security conundrum. According to Gartner forecasts, worldwide revenue from use of cloud services will increase nearly 17% this year to $68.3 billion and will approach $150 billion in 2014, a 20.5% compound annual growth rate over the next five years.

While its growing popularity is undeniable, the cloud also has serious data security issues. What makes the cloud attractive is that data moves at a faster pace and on-premise network bandwidth is freed up. Unfortunately, those performing the data breaches recognize the cloud’s vulnerabilities and are quickly capitalizing on them. At DEFCON 2010, one of the largest hacker conferences in the world, 100 attendees who had already hacked, or had tried to hack, the cloud participated in an in-depth survey; 96% of them believed that the cloud would open up more hacking opportunities for them. Given its rapid adoption rate, enterprises need a solution that will secure the cloud today and tomorrow.

Encryption, First Generation and Next Generation Tokenization

Recognizing the vulnerabilities that the cloud faces, we must establish a way to secure data that does not hinder the benefits of the cloud: remote data access from anywhere with an Internet connection, quick content delivery, easily shareable content, and better version control. Two options long used for on-premise data security, encryption and tokenization, are now at the center of the debate over how best to secure data in the cloud. While there is no silver bullet for the data security and compliance woes of large enterprise organizations, all eyes are on tokenization right now.

The difference between end-to-end encryption and tokenization

End-to-end encryption encrypts sensitive data throughout most of its lifecycle, from capture to disposal, providing strong protection of individual data fields. While it is a practical approach on the surface, encryption keys are still vulnerable to exposure, which can be very dangerous in the riskier cloud environment. Encryption also lacks versatility: applications and databases must be able to handle the specific data type and length of the encrypted value in order to work with the original data. If the database schema and the encrypted data’s type or length are incompatible, the field is rendered unreadable.
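
To make the type-and-length problem concrete, here is a minimal Python sketch; the library choice, the column size and the sample card number are assumptions made only for this illustration. Conventional field encryption turns a 16-digit number into a longer, non-numeric value that the original schema cannot hold.

    import base64, os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # third-party 'cryptography' package

    # Assumed example values, not from the article.
    pan = b"4111111111111111"                      # 16 numeric characters

    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, pan, None)

    stored = base64.b64encode(nonce + ciphertext).decode()
    print(len(pan), len(stored), stored[:20])      # 16 vs. 60 characters, no longer digits
    # A CHAR(16) numeric column, format validations and downstream applications
    # would all need remediation to accept a value in this new shape.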

Tokenization solves many of these problems. At the basic level, tokenization differs from encryption in that it is based on randomness, not on a mathematical formula: it eliminates keys by replacing sensitive data with random tokens, mitigating the chance that thieves can do anything with the data if they get it. A token cannot be discerned or exploited on its own, since the only way to get back to the original value is to reference the lookup table that connects the token with the original encrypted value. There is no formula, only a lookup.

A token by definition looks like the original value in data type and length. These properties enable it to travel inside applications, databases, and other components without modifications, resulting in greatly increased transparency. This also reduces remediation costs for applications, databases and other components where sensitive data lives, because the tokenized data matches the data type and length of the original.
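
As a minimal sketch of the lookup-table idea described above, the code below uses a plain in-memory dictionary as a stand-in for the token server’s lookup table and assumes the common convention of preserving the last four digits; a real deployment would keep the encrypted original inside a hardened token server rather than clear text in application memory.

    import secrets

    # Illustration only: a plain dict stands in for the token server's lookup table;
    # in practice the table holds the *encrypted* original value inside a hardened
    # token server, not a clear-text PAN in application memory.
    token_vault = {}      # token -> original value
    reverse_index = {}    # original value -> token, so repeats reuse one token

    def tokenize(pan: str) -> str:
        """Replace a 16-digit PAN with a random token of the same type and length."""
        if pan in reverse_index:
            return reverse_index[pan]
        while True:
            # Preserve the last four digits (an assumed convention) and randomize the rest.
            token = "".join(secrets.choice("0123456789") for _ in range(12)) + pan[-4:]
            if token not in token_vault and token != pan:
                break
        token_vault[token] = pan
        reverse_index[pan] = token
        return token

    def detokenize(token: str) -> str:
        """Only systems allowed to query the lookup table can recover the original."""
        return token_vault[token]

    token = tokenize("4111111111111111")
    print(token)                                   # random digits, same type and length
    assert detokenize(token) == "4111111111111111"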

First generation tokenization

There are compelling arguments that question the validity of this emerging technology, such as those laid out in Ray Zadjmool’s article “Are Tokenization and Data Field Encryption Good for Business?”, which appeared in the November ISSA Journal. Zadjmool pointed out that “some early adopters are quietly discarding their tokenization and data field encryption strategies and returning to more traditional card processing integrations.” He also mentioned that there are no standards to regulate and define exactly what is and is not tokenization. What he failed to acknowledge is that there are different forms of tokenization. It is no surprise to me that companies that have tried first generation methods have not seen the results they were promised. Here’s why.

Currently there are two forms of tokenization available: “first generation” and “next generation.” First generation tokenization comes in two flavors: dynamic and static.

Dynamic first generation tokenization is defined by large lookup tables that assign a token value to the original encrypted sensitive data. These tables grow dynamically as they accept new, un-tokenized sensitive data. Tokens, encrypted sensitive data and other ‘administrative’ fields expand these tables, increasing their already large footprint.

A variation of first generation tokenization is the pre-populated token lookup table – static first generation. This approach attempts to reduce the overhead of the tokenization process by pre-populating lookup tables with the anticipated combinations of the original sensitive data, thereby eliminating the token-generation step at run time. But because the token lookup tables are pre-populated with every anticipated value, they too carry a large footprint.

While these approaches offer great promise, they also introduce great challenges:

• Latency: Large token tables are not mobile. The need to use tokenization throughout the enterprise will introduce latency and thus poor performance and poor scalability.

• Replication: Dynamic token tables must always be synchronized, an expensive and complex process that may eventually lead to collisions. Complex replication requirements impact the ability to scale performance to meet business needs and to deliver high availability.

• Practical limitation on the number of data categories that can be tokenized: Consider the large lookup tables that would be needed to tokenize credit cards for a merchant. Now consider the impact of adding social security numbers, e-mail addresses and any other fields that may be deemed sensitive. Dynamic or static first generation tokenization quickly turns into an impractical solution, as the rough sizing estimate below suggests.
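
A back-of-the-envelope estimate shows why the footprint and replication problems bite; every figure below is an assumption chosen only to show the order of magnitude, not a measurement.

    # Rough, assumed figures to show why pre-populated ("static") token tables get big.
    # A 16-digit PAN with a fixed 6-digit BIN leaves 10 free digits, one of which is
    # the Luhn check digit, so roughly 10**9 possible account numbers per BIN.
    accounts_per_bin = 10 ** 9
    bytes_per_row = 50        # token + encrypted PAN + administrative fields (assumed)
    bins_covered = 100        # card ranges a large merchant might accept (assumed)

    table_size_gb = accounts_per_bin * bytes_per_row * bins_covered / 10 ** 9
    print(f"~{table_size_gb:,.0f} GB")   # ~5,000 GB before SSNs, e-mail addresses, ...
    # Every additional data category multiplies this footprint again, and the whole
    # table has to be replicated wherever tokenization is needed.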

Next generation tokenization

Like first generation tokenization, next generation tokenization is built around the same concept of replacing sensitive data with random tokens. However, a key differentiator of next generation tokenization is that it employs small-footprint token servers, which free the process from many of the challenges faced by first generation tokenization. A simplified sketch of the small-table idea follows the feature list below.

Here are the key features of next generation tokenization:

• Distributed: Token servers with small footprints enable the distribution of the tokenization process so that token operations can be executed closer to the data. Thus, latency is eliminated or greatly reduced, depending on the deployment approach used.

• Scalable: The smaller footprint also enables the creation of farms of token servers built on inexpensive commodity hardware, delivering whatever scaling the business requires without the need for complex or expensive replication.

• Versatile: Any number of different data categories, ranging from credit card numbers to medical records, can be tokenized without the penalty of an increasing footprint, and more data types can benefit from the transparent properties that tokens offer.

• Increased performance: At Protegrity, we have benchmarked next generation tokenization at approximately 200,000 tokens per second – performance metrics that are hard to achieve with first generation tokenization or encryption.
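
The feature list above hinges on the token server’s small footprint. The sketch below is one way to illustrate that idea; it is deliberately simplified, is not Protegrity’s published algorithm and is not a vetted cryptographic construction. Its point is architectural: a table of a few hundred bytes, derived from a shared secret, can be copied to every distributed token server, so no large, growing lookup table has to be replicated between sites.

    import random

    # Assumed illustration only: NOT Protegrity's published algorithm and not a
    # vetted cryptographic construction. It shows the architectural point that a
    # tiny table derived from a shared secret can be handed to every token server,
    # so no large lookup table needs to be replicated or synchronized.

    DIGITS = 10

    def build_small_tables(shared_secret: str, rounds: int = 4) -> list:
        """Derive a few digit permutations (tens of bytes) from a shared secret."""
        rng = random.Random(shared_secret)    # placeholder for a vetted key derivation
        tables = []
        for _ in range(rounds):
            perm = list(range(DIGITS))
            rng.shuffle(perm)
            tables.append(perm)
        return tables

    def tokenize_digits(value: str, tables: list) -> str:
        """Map a digit string to a same-length digit string using only the small tables."""
        digits = [int(d) for d in value]
        for rnd, perm in enumerate(tables):
            digits = [perm[(d + i + rnd) % DIGITS] for i, d in enumerate(digits)]
        return "".join(str(d) for d in digits)

    # Every token server derives the same small tables from the shared secret,
    # so token operations can run close to the data with no central vault lookup.
    tables = build_small_tables("secret-shared-by-all-token-servers")
    print(tokenize_digits("411111111111", tables))   # same length, digits only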

When next generation tokenization is applied strategically to enterprise applications, confidential data management and PCI audit costs are reduced and the risk of a security breach is minimized. Because authentic primary account numbers (PANs) are only required at authorization and settlement, security is immediately strengthened by the decrease in potential targets for would-be attackers. Simultaneously, PCI compliance costs are significantly decreased because tokenization brings data out of scope and eliminates the annual re-encryption that PCI requires of encryption strategies. Because they all need the high availability, high performance, scalability and quick response times it offers, next generation tokenization is well suited to the financial, retail, healthcare and telecommunications industries.
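
As a sketch of the scope-reduction argument, the example below invents hypothetical components: only the one service that talks to the card processor ever handles a real PAN, at authorization or settlement, while order storage sees nothing but tokens and so falls out of PCI scope.

    import secrets

    # Hypothetical component split, invented for this sketch of PCI scope reduction.
    class TokenServer:
        """Stand-in for a hardened token service: the only place real PANs live."""
        def __init__(self):
            self._vault = {}
        def tokenize(self, pan: str) -> str:
            token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))
            self._vault[token] = pan
            return token
        def detokenize(self, token: str) -> str:
            return self._vault[token]

    token_server = TokenServer()
    order_db = []                                    # stores tokens only: out of scope

    def place_order(pan: str, amount: float) -> None:
        token = token_server.tokenize(pan)           # the real PAN is dropped right here
        order_db.append({"card_token": token, "amount": amount})

    def settle_orders() -> None:
        for order in order_db:
            pan = token_server.detokenize(order["card_token"])   # only at settlement
            print(f"charging {'*' * 12 + pan[-4:]} for {order['amount']:.2f}")

    place_order("4111111111111111", 42.00)
    settle_orders()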

As Zadjmool pointed out, standards have yet to be developed for tokenization, but the PCI Security Standards Council is in the process of creating guidance and validation documents to help provide clarity on this emerging technology. In the meantime, Visa’s “Best Practices for Tokenization” Version 1.0, published on July 14, 2010, can provide some clarity until the Council releases its own standards. But be careful: the draft implies a “one-size-fits-all” architectural solution that is open enough to allow botched implementations, including encryption pretending to be tokenization without the security requirements to back it up; random-based tokenization is the only true end-to-end solution.

Conclusion

A holistic solution for data security should be based on centralized data security management that protects sensitive information throughout the entire flow of data across the enterprise, from acquisition to deletion. While no technology can guarantee 100% security, tokenization and encryption are proven to dramatically reduce the risk of credit card data loss and identity theft. Next generation tokenization in particular has the potential to help businesses protect sensitive data in the cloud in a much more efficient and scalable manner, allowing them to lower the costs associated with compliance in ways never before imagined.


1) http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf

2) Note: one compromised record is defined as the record of one individual, whereas one data breach is defined as one company’s data being breached

3) http://searchvirtualdatacentre.techtarget.co.uk/news/column/0,294698,sid203_gci1521218,00.html

4) http://www.net-security.org/secworld.php?id=9773

5) http://usa.visa.com/download/merchants/tokenization_best_practices.pdf

