Gary Palgon, nuBridges, Inc.: A New Approach to Enterprise Data Security, Tokenisation
February 2010 by Gary Palgon, Vice President of Product Management, nuBridges, Inc.
As enterprises seek to protect data from cybercriminals, internal theft or even accidental loss, encryption and key management have become increasingly important and proven weapons in the security arsenal for data stored in databases, files and applications, and for data in transit. No one needs to be reminded of the many high-profile, reputation-damaging and costly data breaches that organisations across industries and governments have suffered over the past few years.
To protect consumers, industry bodies such as the Payment Card Industry Security Standards Council have instituted security mandates such as the Data Security Standard (PCI DSS), and governments have passed privacy laws. While these mandates and laws require companies to take certain steps to protect consumer and patient information such as credit card numbers and various types of Personally Identifiable Information (PII), CISOs are also faced with protecting company-confidential information ranging from employee information to intellectual property. Almost always, this means finding the best way to secure many types of data stored on a variety of hardware, from mobile devices to desktops, servers and mainframes, and in many different applications and databases. Further, as some companies have learned the hard way, being compliant doesn’t equate to being secure. Breaches have occurred in companies that had taken the necessary steps to pass PCI DSS compliance audits.
Companies typically rely on strong local encryption to protect data. While effective, it does present some challenges. For example, encrypted data takes more space than unencrypted data. Trying to fit the larger cipher text of a 16-digit credit card number back into the 16-digit field poses a “square peg into a round hole” kind of storage problem with consequences that ripple through the business applications that use the data. Storing encrypted values in place of the original data often requires companies to contract for costly programming modifications to existing applications and databases. What’s more, for businesses that must comply with PCI DSS, any system that contains encrypted card data is “in scope” for PCI DSS compliance and audits. Every in-scope system adds to the cost and complexity of compliance.
To reduce the points of risk as well as the scope of PCI DSS audits, and to provide another level of security, a new data security model—tokenisation—is gaining traction with CISOs who need to protect all manner of confidential information in an IT environment.
What is Tokenisation?
With traditional encryption, when a database or application needs to store sensitive data such as credit card or national insurance numbers, those values are encrypted and then the cipher text is returned to the original location. With tokenisation, a token—or surrogate value—is returned and stored in place of the original data. The token is a reference to the actual cipher text, which is usually stored in a central data vault. This token can then be safely used by any file, application, database or backup medium throughout the organisation, thus minimizing the risk of exposing the actual sensitive data. Because you can control the format of the token, and because the token is consistent for all instances of a particular sensitive data value, your business and analytical applications continue seamless operation.
Tokenisation is an alternative data protection architecture that is ideal for some organisations’ requirements. It reduces the number of points where sensitive data is stored within an enterprise, making it easier to manage and more secure. It’s much like storing all of the Queen’s jewels in the Tower of London. Both are single repositories of important items, well guarded and easily managed.
The newest form of tokenisation, called Format Preserving Tokenisation, creates a token (a surrogate value) that represents the original data and fits precisely in its place, rather than needing the larger amount of storage that encrypted data requires. Additionally, to maintain some of the business context of the original value, certain portions of the data, such as the last four digits of a card number, can be retained within the token that is generated. The encrypted data the token represents is then locked in the central data vault.
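As a rough illustration of the idea, the sketch below generates a surrogate that has the same length and layout as the original value and keeps its trailing characters. The function name format_preserving_token and the keep_last parameter are hypothetical, chosen for this example; a real token server would also check each new token for uniqueness against the data vault and might reject tokens that could be mistaken for valid card numbers.

```python
import secrets


def format_preserving_token(value: str, keep_last: int = 4) -> str:
    """Return a random surrogate with the same length and layout as `value`.

    Digits are replaced by random digits, other characters (spaces, dashes)
    stay where they are, and the final `keep_last` characters are copied
    unchanged. Hypothetical helper for illustration only.
    """
    if keep_last < 0 or keep_last > len(value):
        raise ValueError("keep_last must be between 0 and len(value)")
    cut = len(value) - keep_last
    head = "".join(
        secrets.choice("0123456789") if ch.isdigit() else ch
        for ch in value[:cut]
    )
    return head + value[cut:]


card = "4111-1111-1111-1111"
token = format_preserving_token(card)
print(token)  # same layout as the card number, random digits, ends in "1111"
```

Because the token is drawn at random rather than computed from the card number, it cannot be reversed without access to the vault that records which token belongs to which value.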
Because tokens are not mathematically derived from the original data, they are arguably safer to expose than even encrypted values. A token can be passed around the network between applications, databases and business processes safely, all the while leaving the encrypted data it represents securely stored in the central repository. Authorized applications that need access to encrypted data can only retrieve it with proper credentials and a token issued from a token server, providing an extra layer of protection for sensitive information and preserving storage space at data collection points.
Tokenisation enables organisations to better protect sensitive information throughout the entire enterprise by replacing it with data surrogate tokens. Tokenisation not only addresses the unanticipated complexities introduced by traditional encryption, but can also minimize the number of locations where sensitive data resides given that the cipher text is only stored centrally. Shrinking this footprint can help organisations simplify their operations and reduce the risk of breach.
Replacing encrypted data with tokens also gives organisations a way to reduce the number of employees who can access sensitive data, which dramatically narrows the scope of internal data theft risk. Under the tokenisation model, only authorized employees have access to encrypted data such as customer information, and even fewer employees have access to the clear text, decrypted data.
Tokenisation in an Enterprise
The most effective token servers combine tokenisation with encryption, hashing and masking to deliver an intelligent and flexible data security strategy. Under the tokenisation model, data that needs to be encrypted is passed to the token server where it is encrypted and stored in the central data vault. The token server then issues a token, which is placed into applications or databases where required. When an application or database needs access to the encrypted value, it makes a call to the token server using the token to request the full value.
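The request flow described above can be sketched in a few lines. This is a toy, in-memory model under stated assumptions, not any vendor's API: the class name TokenServer, the methods tokenize and detokenize, and the client identifiers are all hypothetical, and encryption of the vault contents is omitted to keep the example short.

```python
import hashlib
import hmac
import secrets


class TokenServer:
    """Toy in-memory token server. All names here are hypothetical."""

    def __init__(self, authorized_clients):
        self._authorized = set(authorized_clients)
        self._lookup_key = secrets.token_bytes(32)  # keys the value index
        self._index = {}  # keyed hash of value -> token
        self._vault = {}  # token -> stored value (encryption omitted here)

    def _fingerprint(self, value: str) -> str:
        # A keyed hash lets the server find an existing token for a value.
        return hmac.new(self._lookup_key, value.encode(), hashlib.sha256).hexdigest()

    def tokenize(self, value: str) -> str:
        """Store the value centrally and return its token (same value, same token)."""
        fp = self._fingerprint(value)
        if fp in self._index:
            return self._index[fp]
        token = secrets.token_hex(8)
        while token in self._vault:  # keep tokens unique
            token = secrets.token_hex(8)
        self._index[fp] = token
        self._vault[token] = value  # a real server would store cipher text
        return token

    def detokenize(self, token: str, client_id: str) -> str:
        """Return the full value, but only to an authorized caller."""
        if client_id not in self._authorized:
            raise PermissionError(f"{client_id} may not retrieve original values")
        return self._vault[token]


server = TokenServer(authorized_clients={"billing"})
token = server.tokenize("4111111111111111")
print(server.detokenize(token, "billing"))  # 4111111111111111
```

Applications that only need to store or pass along the value keep the token; only callers on the authorized list can exchange it for the original.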
Referential integrity can become a problem where various applications (e.g., data warehouses) and databases use the sensitive data values as primary or foreign keys to run queries and to perform data analysis. When the sensitive fields are encrypted, these operations are often impeded because strong encryption is typically randomized: the same plain text value (a credit card number, for instance) does not always generate the same encrypted value. While there are methods to make encryption consistent, there are risks associated with removing the ‘randomization’ from encryption. A consistent, format-sensitive token eliminates this issue.
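The contrast can be shown with a small experiment. In the sketch below, toy_randomized_encrypt is a deliberately simple stand-in for randomized encryption (it is not secure and must not be used to protect real data), and tokenize with its in-memory dictionary stands in for a call to the token server. Both names, and the sample order and refund records, are assumptions made for the example.

```python
import hashlib
import secrets


def toy_randomized_encrypt(key: bytes, value: str) -> bytes:
    """Stand-in for randomized encryption. Illustration only, NOT secure."""
    data = value.encode()
    if len(data) > 32:
        raise ValueError("toy cipher handles at most 32 bytes")
    nonce = secrets.token_bytes(16)  # fresh randomness on every call
    stream = hashlib.sha256(key + nonce).digest()  # 32-byte keystream
    return nonce + bytes(d ^ s for d, s in zip(data, stream))


_token_for = {}  # stand-in for the token server's vault lookup


def tokenize(value: str) -> str:
    """Return the one token assigned to `value`, creating it on first use."""
    if value not in _token_for:
        token = secrets.token_hex(8)
        while token in _token_for.values():  # keep tokens unique
            token = secrets.token_hex(8)
        _token_for[value] = token
    return _token_for[value]


key = secrets.token_bytes(32)
card = "4111111111111111"

# Two encryptions of the same card use different nonces, so the cipher texts
# differ and cannot serve as a join key.
first = toy_randomized_encrypt(key, card)
second = toy_randomized_encrypt(key, card)

# Tokens are consistent, so two data sets can be joined on them.
orders = [
    {"order_id": 1, "card": tokenize(card)},
    {"order_id": 2, "card": tokenize("5500000000000004")},
]
refunds = [{"refund_id": 9, "card": tokenize(card)}]
matches = [
    (o["order_id"], r["refund_id"])
    for o in orders
    for r in refunds
    if o["card"] == r["card"]
]
print(matches)  # [(1, 9)]
```

The join succeeds on the token column without either data set ever holding the card number or its cipher text.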
With format preserving tokenisation, the relationship between data and token is preserved—even when encryption keys are rotated. The central data vault contains a single encrypted version of each original plain text field. This is true even when encryption keys change over time, because there is only one instance of the encrypted value in the data silo. This means the returned tokens are always consistent whenever the same data value is encrypted throughout the enterprise. Since the token server maintains a strict one-to-one relationship between the token and data value, tokens can be used as primary and foreign keys and referential integrity can be assured whenever the encrypted field is present across multiple data sets. And since records are only created once for each given data value (and token) within the data vault, storage space requirements are minimized.
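A minimal sketch of why key rotation leaves tokens untouched follows. It assumes a vault that keeps exactly one encrypted record per value and re-encrypts those records in place when the key changes. The DataVault class and its methods are hypothetical, and the toy cipher (a SHA-256 keystream XORed with the data) is only a placeholder for a real algorithm such as AES; it is not secure.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Placeholder cipher for illustration only, NOT secure."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class DataVault:
    """Hypothetical vault: one encrypted record per distinct value."""

    def __init__(self):
        self._key = secrets.token_bytes(32)        # rotated by rotate_key()
        self._index_key = secrets.token_bytes(32)  # not rotated in this sketch
        self._index = {}    # keyed hash of value -> token
        self._records = {}  # token -> cipher text

    def tokenize(self, value: str) -> str:
        fp = hmac.new(self._index_key, value.encode(), hashlib.sha256).hexdigest()
        if fp not in self._index:
            token = secrets.token_hex(8)
            while token in self._records:  # keep tokens unique
                token = secrets.token_hex(8)
            self._index[fp] = token
            self._records[token] = toy_encrypt(self._key, value.encode())
        return self._index[fp]

    def detokenize(self, token: str) -> str:
        return toy_decrypt(self._key, self._records[token]).decode()

    def rotate_key(self) -> None:
        """Re-encrypt every record under a new key; tokens are not touched."""
        new_key = secrets.token_bytes(32)
        for token, blob in list(self._records.items()):
            self._records[token] = toy_encrypt(new_key, toy_decrypt(self._key, blob))
        self._key = new_key


vault = DataVault()
before = vault.tokenize("4111111111111111")
vault.rotate_key()
after = vault.tokenize("4111111111111111")
print(before == after, vault.detokenize(after))  # True 4111111111111111
```

Only the cipher text inside the vault changes during rotation, so every application, database and backup that already holds the token remains valid.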
Maintaining referential integrity is also useful for complying with European privacy laws that regulate the electronic transfer of social insurance numbers across international borders. Using tokens in place of encrypted values meets the requirement of the law, yet allows for data analysis across borders.
Tokenisation in Practice
There are two scenarios where implementing a token strategy can be beneficial: to reduce the number of places sensitive encrypted data resides; and to reduce the scope of a PCI DSS audit. The hub and spoke model is the same for both. The hub contains three components: a centralized encryption key manager to manage the lifecycle of keys; a token server to encrypt data and generate tokens; and a central data vault to hold the encrypted values, or cipher text. The spokes are the endpoints where sensitive data originates such as point-of-sale terminals in retail stores or the servers in a department, call center or website.
Tokenisation reduces the scope of risk, data storage requirements and changes to applications and databases, while maintaining referential integrity and streamlining the auditing process for regulatory compliance. Suitable for heterogeneous IT environments that use mainframes and distributed systems for back office applications and a variety of endpoints, tokenisation presents a number of benefits to CSOs tasked with protecting all types of confidential information. The higher the volume of data and the more types of sensitive data you collect and protect, the more valuable tokenisation becomes. Fortunately, incorporating tokenisation requires little more than adding a token server and a central data vault. For companies that need to comply with PCI DSS, tokenisation has the added advantage of taking applications, databases and systems out of scope, reducing the complexity and cost of initial compliance and annual audits.
Tokenisation Attributes and Best Practices
Tokenisation provides numerous benefits to organisations that need to protect sensitive and confidential information, often in tandem with traditional encryption. Look for a tokenisation solution with the following attributes:
Reduces Risk — Tokenisation creates a central, protected data vault where sensitive data is encrypted and stored. Format preserving tokens reduce the footprint where sensitive data is located and eliminate points of risk.
No Application Modification — Tokens act as surrogates for sensitive data wherever it resides. They maintain the length and format of the original data so that applications don’t require modification.
Referential Integrity — Tokenisation enforces a strict one-to-one relationship between tokens and data values so as to preserve referential integrity whenever an encrypted field is present across multiple applications and data sets. This is useful for complying with European privacy laws that regulate the electronic transfer of social insurance numbers across international borders. Using tokens in place of encrypted values meets the requirement of the law, yet allows for data analysis across borders.
Control and Flexibility — The best tokenisation solutions give IT complete control of the token-generation strategy. For example, the last four digits of the data can be preserved in the token, allowing the token to support many common use-cases.
Streamlines Regulatory Compliance — Tokenisation enables organisations to narrow the scope of systems, applications and processes that need to be audited for compliance with mandates such as PCI DSS.