Ulf Mattsson, CTO, Protegrity Corporation: Get More for Less, Enhance Data Security and Cut Costs
December 2009 by Ulf Mattsson, CTO, Protegrity Corporation
Data security plans often center on the "more is better" concept: lock everything down with the strongest available protection. This results in unnecessary expense, frequent availability problems and system performance lags. Alternatively, IT will sometimes shape its data security efforts around the demands of compliance and best-practices guidance, and then find itself struggling with fractured security projects and the never-ending task of staying abreast of regulatory changes.
There is a better way — a risk-based classification process that enables organizations to determine their most significant security exposures, target their budgets towards addressing the most critical issues and achieve the right balance between cost and security. In this article, I discuss the risk-analysis processes that can help companies achieve cost-savings while measurably enhancing their overall data security profile by implementing a holistic plan that protects data from acquisition to deletion.
This paper reviews different options for data protection in an Oracle environment and answers the question "How can IT security professionals provide data protection in the most cost-effective manner?" It presents methods to protect the entire data flow across systems in an enterprise while minimizing the need for cryptographic services. It also reviews some PCI requirements and corresponding approaches to protecting the data, including secure encryption, robust key management, separation of duties, and auditing.
Payment Card Industry Data Security Standard (PCI DSS)
Encryption is a critical component of cardholder data protection. If an intruder circumvents other network security controls and gains access to encrypted data, the data is unreadable and unusable to that person without the proper cryptographic keys. Other effective methods of protecting stored data should be considered as potential risk mitigation opportunities; for example, not storing cardholder data unless absolutely necessary, and truncating cardholder data if the full PAN (Primary Account Number) is not needed.
Oracle Database Security and PCI DSS
Oracle Database security provides powerful data protection and access control solutions to address PCI-DSS requirements. Oracle Database Vault prevents highly privileged users from accessing credit card information and helps reduce the risk of insider threats with separation of duty, multi-factor authorization and command rules. Oracle Advanced Security Transparent Data Encryption (TDE) provides the industry's most advanced database encryption solution, enabling encryption of credit card numbers with complete transparency to the existing application. Oracle Audit Vault consolidates and protects database audit data from across the enterprise; its reports and alerts provide proactive notification of access to credit card information. Oracle Enterprise Manager provides secure configuration scanning to ensure your databases stay securely configured. Oracle Label Security extends user security authorizations to help enforce the need-to-know principle. Here are examples of some powerful general functionality that Oracle provides to address different data security requirements, including PCI requirements:
To Make Data Unreadable
1. Data Masking
2. Column Level Encryption
3. Table Level Encryption
4. Database backup Encryption
5. Network Traffic Encryption
To Control Data Access
1. Access Control with column filtering
2. Segregation of Duties - Protect Data Access from DBA and Privileged Users
3. Maintain identity of application users in access control
4. Multi-factor authorization - approved subnets, authentication methods and time based constraints
5. Row Level Access Control
6. Multi-level security (MLS) & Mandatory Access Control (MAC)
To Audit Data Access
1. A table is accessed between 9 p.m. and 6 a.m. or on Saturday and Sunday.
2. A specific column has been selected or updated.
3. A specific value for this column has been used.
4. Capture identity of application users in the database audit trail
5. An IP address from outside the corporate network is used.
6. Reporting across multiple database brands
PCI DSS - Protect stored cardholder data
Render PAN, at minimum, unreadable anywhere it is stored (including data on portable digital media, backup media, in logs, and data received from or stored by wireless networks) by using any of the following approaches:
1. One-way hashes (hashed indexes)
2. Truncation
3. Index tokens and pads, (pads must be securely stored)
4. Strong cryptography with associated key management processes and procedures
Protect stored cardholder data in Oracle Databases
Oracle Advanced Security Transparent Data Encryption (TDE) can be used to encrypt the credit card number on storage media and backups. Optionally, TDE can be used with Oracle RMAN to encrypt the entire backup when backed up to disk, and Oracle Secure Backup provides a solution for backing up and encrypting directly to tape storage. Supported encryption algorithms include AES and 3DES with 128-, 192- (default), or 256-bit key lengths. TDE has key management built in. Encrypted column data stays encrypted in the data files, undo logs, and redo logs, as well as in the buffer cache of the system global area (SGA). SHA-1 and MD5 are used for integrity.
Oracle Advanced Security Transparent Data Encryption (TDE) keys are stored in the database and encrypted using a separate master key that is stored in the Oracle Wallet, a PKCS#12 file on the operating system. The Oracle Wallet is encrypted using the wallet password; in order to open the wallet from within the database requires the ’alter system’ privilege. Oracle Database Vault command rules can be implemented to further restrict who, when, and where the ’alter system’ privilege can be executed. Please see ‘Oracle Implementation - Sample code’ for some implementation examples.
Protection of the Oracle encryption keys
PCI DSS requires that encryption keys used for encryption of cardholder data be protected against both disclosure and misuse. As described above, TDE keys are encrypted with a master key held in the Oracle Wallet, and Oracle Database Vault command rules can restrict who, when, and where the 'alter system' privilege needed to open the wallet can be executed. Oracle Database 11g TDE integrates with PKCS#11-compliant hardware vendors for centralized master key generation and management. TDE uses a FIPS-certified RNG (random number generator); the master key can also be generated using certificates or PKI key pairs. Please review 'Key Management with Oracle 11g TDE using PKCS#11' for more information about this topic.
The Risk Based Data Classification Process
Step 1: Determine Data Risk Classification Levels
The first step in developing a risk-based data security management plan is to determine the risk profile of all relevant data collected and stored by the enterprise, and then classify that data according to its designated risk level. This sounds complicated, but it is largely a matter of common sense. Data that is resalable for a profit (typically financial, personally identifiable and confidential information) is high-risk data and requires the most rigorous protection. Other protection levels should be determined according to the data's value to your organization and the anticipated cost of its exposure: would business processes be impacted? Would it be difficult to manage media coverage and public response to a breach? Then assign a numeric value to each class of data: high risk = 5, low risk = 1. Classifying data precisely according to risk level enables you to develop a sensible plan that invests budget and effort where they matter most.
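The scoring described above can be sketched in a few lines of code. The class names and weights below are illustrative assumptions, not a prescribed taxonomy:

```python
# Risk scale from Step 1: 5 = high risk, 1 = low risk.
RISK_LEVELS = {
    "cardholder_data": 5,       # resalable financial data
    "pii": 5,                   # personally identifiable information
    "internal_financials": 4,
    "marketing_program": 2,
    "public_content": 1,
}

def protection_priority(data_classes):
    """Order data classes so budget and effort go to the highest risk first."""
    return sorted(data_classes, key=lambda c: RISK_LEVELS.get(c, 1), reverse=True)

print(protection_priority(["public_content", "cardholder_data", "marketing_program"]))
```

Classes not found in the table default to the lowest risk level, which is a deliberately conservative assumption for this sketch; unclassified data in practice deserves review, not a default.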
Step 2: Map the Data Flow
Data flows through a company, into and out of numerous applications and systems. A complete understanding of this data flow enables an enterprise to implement a cohesive data security strategy that will provide comprehensive protections and easier management resulting in reduced costs.
Begin by locating all the places relevant data resides, including applications, databases, files, and data transfers across internal and external networks, and determine where the highest-risk data resides and who has or can gain access to it (see the 'attack vectors' section below). Organizations with robust data classification typically use an automated tool to assist in the discovery of the subject data. Available tools will examine file metadata and content, index the selected files, and reexamine them on a periodic basis for changes. The indexing process provides a complete listing of, and rapid access to, data that meets the defined criteria used in the scanning and classification process. Most often, the indices created for files or data reflect the classification schema of data sensitivity, data type, and geographic region. High risk data residing in places where many people can access it obviously needs the strongest possible protection.
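As a rough illustration of what such a discovery tool does, the sketch below scans free text for PAN-like digit runs and filters false positives with the standard Luhn check. The regex pattern and length thresholds are simplifying assumptions; commercial tools do far more:

```python
import re

# Candidate PANs: 13-16 digits, optionally separated by spaces or hyphens.
PAN_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum used by payment card numbers."""
    digits = [int(d) for d in number]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def scan_for_pans(text: str):
    """Return candidate PANs found in free text, filtered by the Luhn check."""
    hits = []
    for match in PAN_PATTERN.finditer(text):
        candidate = re.sub(r"[ -]", "", match.group())
        if luhn_valid(candidate):
            hits.append(candidate)
    return hits
```

A real discovery pass would walk file systems and database dumps with this kind of matcher, then feed the hit locations into the classification index described above.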
When the classification schema is linked to the retention policy, as described above, retention action can be taken based on file indices. Additionally, the reports based on the indices can be used to track the effectiveness of the data retention program.
While we're discussing data retention policies, it's important to remember that data disposal also needs to be a secure process; usually you'll opt to delete, truncate or hash data the enterprise no longer needs to retain. Truncation discards part of the input field. These approaches can reduce the cost of securing data fields when you do not need the data to do business and will never need the original data back. Destroying, truncating or hashing data is a major business decision: your business can never get that data back, and it may be more cost-effective to transparently encrypt the data and avoid impacting current or future business processes. In addition, the sensitive data may still be exposed in your data flow and logs prior to any deletion or truncation step.
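Truncation itself is trivial; the sketch below keeps only the last four digits, which makes the irreversibility of the decision concrete — nothing in the result can recover the discarded digits:

```python
def truncate_pan(pan: str, keep_last: int = 4) -> str:
    """Irreversibly discard all but the last few digits of a PAN.
    The original number can never be reconstructed from the result."""
    return pan[-keep_last:]

print(truncate_pan("4111111111111111"))
```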
Hash algorithms are one-way functions that turn a message into a fingerprint, typically a binary string of at least twenty bytes to limit the risk of collisions. PCI DSS provides standards for strong encryption keys and key management but is vague on several points regarding hashing. Hashing can be used to secure data fields in situations where you do not need the data to do business and never need the original data back. Unfortunately, a hash is non-transparent to applications and database schemas, since it requires a long binary data type. If the solution is not based on an HMAC and a rigorous key management system, an attacker can easily build a (rainbow) table to expose the relation between hash values and real credit card numbers. Salting the hash can also be used if the data is not needed for analytics.
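A minimal sketch of the keyed-hash (HMAC) approach, assuming the key would really come from a key management system rather than being generated ad hoc as it is here:

```python
import hashlib
import hmac
import secrets

# Keyed hashing defeats precomputed rainbow tables: without the key, an
# attacker cannot map fingerprints back to card numbers. This ad-hoc key is
# an assumption for the sketch; in practice it must come from key management.
HASH_KEY = secrets.token_bytes(32)

def pan_fingerprint(pan: str) -> str:
    """One-way fingerprint of a PAN, usable for matching or joining records
    where the original number is never needed back."""
    return hmac.new(HASH_KEY, pan.encode(), hashlib.sha256).hexdigest()
```

Because the same input under the same key always yields the same fingerprint, hashed values can still be indexed and joined across systems, which is exactly why the key itself must be tightly controlled.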
Done properly, data classification begins with categorization of the sensitivity of data (e.g., "public," "sensitive," "confidential"). Classification goes on to include the type of data being classified, for example, "sensitive, marketing program," and, where applicable, the countries to which the data classification applies. The classification allows the organization to automate the routines for flagging, removing, or archiving applicable data. Pay particular attention when automating the removal of data; consider instead alerting privileged users to data requiring attention.
Additionally, an understanding of where all the sensitive data resides usually results in a project to reduce the number of places where the sensitive data is stored. Once the number of protection points has been reduced, a project to encrypt the remaining sensitive data with a comprehensive data protection solution provides the best protection while giving the business the flexibility it needs, at a reduced investment in data protection costs.
Step 3: Understand Attack Vectors (Know Your Enemy)
Use your data risk classification plan and the data flow map, along with a good understanding of criminals' favored attack vectors, to identify the highest risk areas in the enterprise ecosystem. Currently web services, databases and data-in-transit are at high risk. The type of asset compromised most frequently is online data, not offline data on laptops, back-up tapes, and other media. Hacking and malware have proven to be the attack methods of choice among cybercriminals, targeting the application layer and data more than the operating system. But these vectors change, so keep an eye on security news sites to stay abreast of how criminals are attempting to steal data.
There are two countervailing trends in malware, both likely to continue. One trend is toward highly automated malware that uses basic building blocks and can be easily adapted to identify and exploit new vulnerabilities. This is the malware that exploits unpatched servers, poorly defined firewall rules, the OWASP top ten, etc.; it is aimed at the mass market of SMEs and consumers. The other trend is high-end malware that employs the "personal touch": customization to specific companies, often combined with social engineering to ensure it is installed in the right systems. This is the type of malware that got TJX, Hannaford, and now Heartland, according to a recent report published on KnowPCI (http://www.knowpci.com). The point is: the more we create concentrations of valuable data, the more worthwhile it is for malware manufacturers to put the effort into customizing a "campaign" to go after specific targets. So, if you are charged with securing an enterprise system that is a prime target (or partner with or outsource to a business that is a major target), you need to ensure that the level of due diligence you apply to data security equals or exceeds that expended by malicious hackers, who are more than willing to work really, really hard to access that data.
Reports about recent data breaches paint an ugly picture. As of mid-March, Heartland Payment Systems had yet, by its own account, to determine exactly how many records were compromised in the breach that gave attackers access to its systems, which are used to process 100 million payment card transactions per month for 175,000 merchants. Given the size and sophistication of Heartland's business (it is one of the top payment-processing companies in the United States), computer-security experts say that a standard, in-the-wild computer worm or Trojan is unlikely to be responsible for the data breach. Heartland spokespeople have said publicly that the company believes the break-in could be part of a "widespread global cyber fraud operation."
Step 4: Choose Cost-Effective Protections
Cost-cutting is typically accomplished in one of two ways: reducing quality, or getting the most out of a business's investment. Assuming you've wisely opted for the latter, look for multi-tasking solutions that protect data according to its risk classification level, support business processes, and can change with the environment, so that you can easily add new defenses against future threats and integrate with other systems as necessary. Concerns about performance degradation, invasiveness, application support, and how to manage broad, heterogeneous database encryption implementations too often become hard barriers to adopting this important security measure.
Some aspects to consider when evaluating data security solutions for effectiveness and cost-control include:
Access controls and monitoring
The threat from internal sources, including administrators, requires solutions that go beyond traditional access controls. Effective encryption solutions must provide separation of duties to protect the encryption keys. A centralized solution can also provide the most cost effective strategy for an organization with a heterogeneous environment. Although some legal data privacy and security requirements can be met by native DBMS security features, many DBMSes do not offer a comprehensive set of advanced security options; notably, many lack separation of duties, enterprise key management, security assessment, intrusion detection and prevention, data-in-motion encryption, and intelligent auditing capabilities. This approach is suitable for protection of low risk data.
Tokenization
The basic idea behind tokens is that each credit card number that previously resided on an application or database is replaced with a token that references the credit card number. A token can be thought of as a claim check that an authorized user or system can use to obtain the associated credit card number. Rule 3.1 of the PCI standard advises that organizations "Keep cardholder data storage to a minimum." To do so, organizations must first identify precisely where all payment data is stored. While this may seem simple, for many large enterprises it is a complicated task: the data discovery process can take months of staff time to complete. Security administrators must then determine where payment data should and should not be kept. Obviously, the fewer repositories housing credit card information, the fewer points of exposure and the lower the cost of encryption and PCI initiatives. All credit card numbers stored in disparate business applications and databases are removed from those systems and placed in a highly secure, centralized tokenization server that can be protected and monitored utilizing robust encryption technology. In the event of a breach of one of the business applications or databases, only the tokens could be accessed, which would be of no value to a would-be attacker.
Tokenization is a very hot buzzword, but it still means different things to different people, and some implementations can pose additional risk relative to mature encryption solutions. Companies are still required to implement encryption and key management systems to lock down various data across the enterprise, including PII data, transaction logs and temporary storage, and a tokenization solution itself requires a solid encryption and key management system to protect the tokenizer. Organizations use card numbers and PII data in many different places in their business processes, and applications would need to be rewritten to work with the token numbers instead. This approach is suitable for protection of high risk data. Please see the discussion of tokenization in http://papers.ssrn.com/sol3/papers.... .
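The claim-check model can be sketched as follows. A real tokenization server would encrypt the stored PANs, persist them, and enforce strict access control; this toy vault is just an in-memory mapping to show the shape of the interface:

```python
import secrets

class TokenVault:
    """Toy tokenization server: swaps PANs for random tokens and keeps the
    mapping in one place. In production the stored PANs would be encrypted
    and access to detokenize() tightly restricted and audited."""

    def __init__(self):
        self._by_token = {}
        self._by_pan = {}

    def tokenize(self, pan: str) -> str:
        if pan in self._by_pan:             # one stable token per PAN
            return self._by_pan[pan]
        token = secrets.token_hex(8)        # random; carries no card data
        self._by_token[token] = pan
        self._by_pan[pan] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._by_token[token]        # authorized callers only
```

The business applications then store and pass around only tokens; a breach of those systems yields values with no exploitable relationship to the real card numbers.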
File level Database Encryption
File-level database encryption has proven fairly straightforward to deploy, with minimal performance overhead and convenient key management. This approach is cost effective: it installs in a matter of days, utilizes existing server hardware platforms and can easily extend protection to log files, configuration files and other database output. Installed just above the file system, it encrypts and decrypts data as the database process reads from or writes to its database files. This enables cryptographic operations on file system blocks rather than individually, row by row, since data is decrypted before it is read into the database cache; subsequent hits on this data in the cache incur no additional overhead. Nor does the solution architecture diminish database index effectiveness, but remember that the index is in clear text and unprotected within the database.
This approach can also selectively encrypt individual files; and does not require that "the entire database" be encrypted. Database administrators can assign one or more tables to a table space file, and then policies can specify which table spaces to encrypt. Therefore, one need only encrypt the database tables that have sensitive data, and leave the other tables unencrypted. However, some organizations choose to encrypt all of their database files because there is little performance penalty and no additional implementation effort in doing so.
Production environments often use batch operations to import or export bulk data files. If these files contain sensitive data, they should be encrypted at rest, no matter how short the time they are at rest. (Note: some message queues, such as MQ Series, write payload data to a file if the message queue is backed up, sometimes for a few seconds or for up to hours if the downstream network is unavailable.) It may be difficult to protect these files with column-level encryption solutions. File-level encryption can protect them while still allowing transparent access for authorized applications and users.
This approach is suitable for protection of low risk data. Be aware of its limitations: it provides no separation of DBA duties, and operating system patches can cause issues. File encryption doesn't protect against database-level attacks, and it offers no effective, easy way to keep administrators from seeing what they don't need to see. Protection of high risk data is discussed below in the sections 'Field level encryption' and 'End-to-end encryption'.
Experience from some organizations has shown that the added performance overhead for this type of database encryption is often less than 5%. However, before deciding on any database file encryption solution, you should test its performance in the only environment that matters: your own.
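A minimal pattern for such an in-house test is to time the same workload with and without the cryptographic step. In this sketch a SHA-256 digest stands in for the cipher and random 8 KB blocks stand in for database pages; both are assumptions chosen only to keep the example self-contained:

```python
import hashlib
import os
import time

def timed(label, workload):
    """Run a workload once and report wall-clock time."""
    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return elapsed

# 200 blocks of 8 KB stand in for database file pages.
pages = [os.urandom(8192) for _ in range(200)]

baseline = timed("plain page scan", lambda: [p[:1] for p in pages])
crypto = timed("scan with per-page crypto", lambda: [hashlib.sha256(p).digest() for p in pages])
```

The point is methodological: measure the delta on your own hardware and workload mix, not on a vendor benchmark.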
Field level encryption
Field level full or partial encryption/tokenization can provide cost effective protection of data fields in databases and files. Most applications do not operate on, and should not be exposed to, all bytes in fields like credit card numbers and social security numbers; for those that do require full exposure, an appropriate security policy with key management and full encryption is fully acceptable. This approach is suitable for protection of high risk data.
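A sketch of partial field protection, leaving the first six digits (the BIN) and last four in the clear for the applications that need them. The keyed hash used here is a one-way stand-in for format-preserving encryption, which would be reversible; the key name and value are assumptions:

```python
import hashlib
import hmac

# Assumption: in practice this key is supplied by a central key manager.
FIELD_KEY = b"demo-key-from-key-manager"

def protect_middle(pan: str) -> str:
    """Protect only the middle digits of a PAN so that length, character
    set, BIN and last four digits are preserved for downstream systems."""
    head, middle, tail = pan[:6], pan[6:-4], pan[-4:]
    digest = hmac.new(FIELD_KEY, pan.encode(), hashlib.sha256).hexdigest()
    # Derive replacement digits so the field keeps its format.
    fake = "".join(str(int(c, 16) % 10) for c in digest[: len(middle)])
    return head + fake + tail
```

Because the output keeps the original format, existing schemas and most application code continue to work unchanged, which is the main cost advantage of partial protection.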
Continuous protection via end-to-end encryption at the field level safeguards information by cryptographic protection or other field level protection from point-of-creation to point-of-deletion, keeping sensitive data or data fields locked down across applications, databases, and files, including ETL data loading tools, FTP processes and EDI data transfers. ETL (Extract, Transform, and Load) tools are typically used to load data into data warehousing environments. This end-to-end encryption may utilize partial encryption of data fields and can be highly cost effective for selected applications like an e-business data flow.
Field level encryption and end-to-end encryption
End-to-end encryption is an elegant solution to a number of messy problems. It's not perfect; field-level end-to-end encryption can, for example, break some applications, but its benefits in protecting sensitive data far outweigh these correctable issues. The capability to protect at the point of entry helps ensure that the information will be both properly secured and appropriately accessible when needed at any point in its enterprise information life cycle.
End-to-end data encryption can protect sensitive fields in a multi-tiered data flow from storage all the way to the client requesting the data. The protected data fields may be flowing from legacy back-end databases and applications via a layer of Web services before reaching the client. If required, the sensitive data can be decrypted close to the client after validating the credentials and data-level authorization.
Today PCI requires encryption when data leaves the network, but not internally. Adding end-to-end encryption might negate some current PCI requirements, such as protecting data with monitoring and logging, and the PCI Security Standards Council is looking at that in 2009. In practice, data encryption and auditing/monitoring are both necessary for a properly secured system; it is not one versus the other. There are many protections that a mature database encryption solution can offer today that cannot be had with some of the monitoring solutions available. Installing malicious software on internal networks to sniff cardholder data and export it is becoming a more common attack vector, and by our estimates it is the most common vector in massive breaches, including TJX, Hannaford, Heartland and CardSystems.
Storage-layer or file-layer encryption doesn't provide the comprehensive protection needed against these attacks. There is a slew of research indicating that advanced attacks against internal data flows (transit, applications, databases and files) are increasing, and many successful attacks were conducted against data that the enterprise did not know was on a particular system. Using lower-level encryption at the SAN/NAS or storage-system level can result in questionable PCI compliance, and separation of duties between data management and security management is impossible to achieve. Please see the discussion of end-to-end encryption in http://papers.ssrn.com/sol3/papers.... .
Compensating controls
PCI compensating controls are temporary measures you can use while you put an action plan in place. Compensating controls have a “shelf life” and the goal is to facilitate compliance, not obviate it. The effort of implementing, documenting and operating a set of compensating controls may not be cost effective in the long run. This approach is only suitable for temporary protection of low risk data.
Software based encryption
Many businesses also find themselves grappling with the decision between hardware-based and software-based encryption. Vendors selling database encryption appliances have been vociferously hawking their wares as a faster, more powerful alternative to software database encryption, and many organizations have bought into this hype based on their experiences with hardware-based network encryption technology. The right question, however, concerns the topology, or data flow. The topology is crucial: it dictates performance, scalability, availability, and other very important factors, yet it is usually not well understood. Usually, hardware-based encryption is remote and software-based encryption is local, but this has nothing to do with the form factor itself; what matters is where the encryption happens relative to the servers processing the database information.
Software that encrypts data at the table or column level within relational database management systems is far more scalable and performs better on most enterprise platforms when executing locally on the database server. Software-based encryption combined with an optional low-cost HSM for key management operations provides a cost effective solution that proves scalable in an enterprise environment.
The most cost effective solutions can be deployed as software, a soft appliance, a hardware appliance or any combination of the three, depending on the security and operational requirements of each system. The ability to deploy a completely "green" solution, coupled with deployment flexibility, makes these alternatives very cost effective for shared hosting and virtual server environments as well. The green solution is not going away; there's too much at stake.
Step 5: Deployment
Focus initial efforts on hardening the areas that handle critical data and are a high-risk target for attacks. Continue to work your way down the list, securing less critical data and systems with appropriate levels of protection. Be aware though that the conventional “Linked Chain” risk model used in IT security — the system is a chain of events, where the weakest link is found and made stronger — isn’t the complete answer to the problem. There will always be a weakest link. Layers of security including integrated key management, identity management and policy-based enforcement as well as encryption of data throughout its entire lifecycle are essential for a truly secure environment for sensitive data.
It is critical to have a good understanding of the data flow in order to select the optimal protection approach at different points in the enterprise. By properly understanding the dataflow we can avoid less cost effective point solutions and instead implement an enterprise protection strategy. A holistic layered approach to security is far more powerful than the fragmented practices present at too many companies. Think of your network as a municipal transit system – the system is not just about the station platforms; the tracks, trains, switches and passengers are equally critical components. Many companies approach security as if they are trying to protect the station platforms, and by focusing on this single detail they lose sight of the importance of securing the flow of information. It is critical to take time from managing the crisis of the moment to look at the bigger picture. One size doesn’t fit all in security so assess the data flow and risk environment within your company and devise a comprehensive plan to manage information security that dovetails with business needs. Careful analysis of use cases and the associated threats and attack vectors can provide a good starting point in this area.
It is important to note that implementing a series of point solutions at each protection point will introduce complexity that will ultimately cause significant rework. Protecting each system or data set as part of an overall strategy and system allows the security team to monitor and effectively administer the encryption environment including managing keys and key domains without creating a multi-headed monster that is difficult to control.
Centralized management of encryption keys can provide the most cost effective solution for an organization with multiple locations, heterogeneous operating systems and databases. All standards now require rotation of the Data Encryption Keys (DEKs) annually, and some organizations choose to rotate certain DEKs more frequently (for example, on a disconnected terminal outside the corporate firewall, such as a point-of-sale system).
Manual key rotation in a point solution would require an individual to deliver and install new keys every year on all the servers. Automated key rotation through a central key management system eliminates most of this cost and can reduce down time. Distributed point solutions for key management carry an initial investment for each platform, plus the integration effort, maintenance and operation of several disparate solutions. It is our experience that manual key rotation in a point-solution environment inevitably leads to increased down time, increased resource requirements, and rework. Key management and key rotation are important enablers for several of the protection methods discussed above. Please see http://papers.ssrn.com/sol3/papers.... for more information on that topic.
Centralized management of reporting and alerting can also provide a cost effective solution for an organization with multiple heterogeneous operating systems and databases. Such a solution should track all activity, including attempts to change security policy, and encrypt its logs to ensure evidence-quality auditing. Just as the keys should not be managed by the system and business owners, those owners should not have access to or control over the reporting and alerting logs and system. A system with manual or nonexistent alerting and auditing functionality can increase the risk of undetected breaches and increase audit and reporting costs.
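The versioned-key idea behind automated rotation can be sketched as follows. The rotation policy and dates are illustrative, and real DEKs would live in an HSM or encrypted store rather than process memory; versioning matters because old ciphertext must still name the key version that protects it:

```python
import secrets
from datetime import date, timedelta

class KeyManager:
    """Sketch of a central key manager with versioned data encryption keys
    (DEKs) rotated on a schedule."""

    def __init__(self, rotation_days=365, start=date(2009, 1, 1)):
        self.rotation = timedelta(days=rotation_days)
        self.versions = {}          # version number -> 256-bit DEK
        self.current = 0
        self.rotated_on = None
        self.rotate(today=start)

    def rotate(self, today=None):
        today = today or date.today()
        self.current += 1
        self.versions[self.current] = secrets.token_bytes(32)  # new DEK
        self.rotated_on = today

    def active_key(self, today=None):
        """Return (version, key); rotate automatically when the key expires."""
        today = today or date.today()
        if today - self.rotated_on >= self.rotation:
            self.rotate(today)      # automated annual rotation
        return self.current, self.versions[self.current]
```

Data encrypted under version 1 can still be decrypted after rotation because superseded keys are retained (read-only) until the data they protect is re-encrypted or destroyed.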
Build vs. buy
Many build vs. buy decisions are made purely on the basis of management's preconceived notions about one option or the other. This is a decision that requires analysis and insight. Why re-invent the wheel if several vendors already sell what you want to build? Use a build-or-buy analysis to determine whether it is more appropriate to custom build or purchase a product. When comparing costs, include indirect as well as direct costs on both sides of the analysis. For example, the buy side should include both the actual out-of-pocket cost to purchase the packaged solution and the indirect costs of managing the procurement process. Be sure to look at the entire life-cycle costs for each solution. Questions to consider include:
1. Is the additional risk of developing a custom system acceptable?
2. Is there enough money to analyze, design, and develop a custom system?
3. Does the source code have to be owned or controlled?
4. Does the system have to be installed as quickly as possible?
5. Is there a qualified internal team available to analyze, design, and develop a custom system?
6. Is there a qualified internal team available to provide support and maintenance for a custom developed system?
7. Is there a qualified internal team available to provide training on a custom developed system?
8. Is there a qualified internal team available to produce documentation for a custom developed system?
9. Would it be acceptable to change current procedures and processes to fit with the packaged software?
Data Protection in Testing, Staging and Production Environments
In some cases traditional data masking cannot provide the quality of data needed in a test environment. Masking can only reduce the cost of securing data fields in situations where you do not need the data to do business and never need the original data back. During the development lifecycle there is a need to run high-quality test scenarios on production-quality test data by reversing the data hiding process. Consistent data protection tools across an enterprise are an important strategy for assuring that sensitive data in each environment is properly protected and in compliance with growing regulatory requirements. Encryption and data protection are strategically important processes across test, operational and archive environments. Several of the data protection approaches discussed earlier can assure that sensitive data is protected across development, testing, staging and production environments. These approaches can also be used in outsourced environments and on virtual servers, where data-level protection can be a powerful way of enforcing separation of duties.
Oracle data masking
The Data Masking Pack for Databases helps organizations share production data in compliance with privacy and confidentiality policies by replacing sensitive data with realistic but scrubbed data based on masking rules. There are two primary use cases for the Data Masking Pack. First, DBAs want to take a copy of production for testing purposes, use the Data Masking Pack to replace all sensitive data with innocuous but realistic information, and then make this database available to developers. Second, organizations want to share production data with third parties while hiding sensitive or personally identifiable information.
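As a rough illustration of what a masking rule does, the hypothetical function below keeps the issuer prefix (first six digits) of a PAN and replaces the remainder with digits derived from a keyed one-way hash. The result looks like a card number and is consistent across runs (preserving referential integrity between test tables) but cannot be reversed to the original. The key name and rule are assumptions for the sketch, not part of any Oracle product.

```python
# Hypothetical one-way masking rule for PANs (illustrative sketch).
import hmac, hashlib

MASK_KEY = b"per-refresh-secret"   # assumed secret; discarded after masking

def mask_pan(pan: str) -> str:
    # Derive replacement digits from a keyed hash of the full PAN.
    digest = hmac.new(MASK_KEY, pan.encode(), hashlib.sha256).hexdigest()
    fake = "".join(str(int(c, 16) % 10) for c in digest)[: len(pan) - 6]
    # Keep the issuer prefix so the value still looks realistic.
    return pan[:6] + fake

masked = mask_pan("4005550000000019")
assert masked[:6] == "400555" and len(masked) == 16
```

Because the derivation is keyed and one-way, recovering the original PAN from the masked value is infeasible, which is exactly the property (and, as the next section notes, the limitation) of masking.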
Limitations of data masking
1. The data masking process cannot be reversed; it is a one-way transformation (scramble, obfuscate ...).
2. Data used for test/development is masked according to different rules.
3. It can be used to de-identify confidential information to protect privacy only in non-production environment.
4. It cannot be generally used for securing data in a production environment.
5. The development lifecycle requires support for test scenarios that reverse the data protection process in order to perform high quality tests. A reversible protection process, auditability and accountability are important in these situations.
6. If it is necessary to reload the test database, the data masking rules must be applied to the original data again, which can take considerable time for a large database.
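Where a test scenario requires recovering the original value (limitations 1 and 5 above), a reversible scheme such as vault-based tokenization can be used instead of one-way masking. The toy in-memory vault below is illustrative only; a production tokenization system would store the vault encrypted, under strict access control and full audit.

```python
# Toy vault-based tokenization: reversible, unlike masking (sketch only).
import secrets

class TokenVault:
    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def tokenize(self, pan: str) -> str:
        if pan in self._to_token:            # consistent token per value
            return self._to_token[pan]
        # Format-preserving token: keep the prefix, randomize the rest.
        token = pan[:6] + "".join(secrets.choice("0123456789")
                                  for _ in range(len(pan) - 6))
        self._to_token[pan] = token
        self._to_value[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Authorized reversal; in production this call would be audited.
        return self._to_value[token]

vault = TokenVault()
t = vault.tokenize("4005550000000019")
assert vault.detokenize(t) == "4005550000000019"
```

The reversal path is what masking deliberately lacks, and it is why tokenization or encryption, combined with auditing, fits test scenarios that need the original data back.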
Key Management with Oracle 11g TDE using PKCS#11
Starting with Oracle 10.2, Oracle supports Transparent Data Encryption (TDE). Individual columns may be encrypted, and as the name indicates, this encryption is transparent to the database user. TDE is part of Oracle Advanced Security.
Though TDE encryption is applied per column, keys are assigned per table. All encrypted columns within the same table share the same key. These table keys are stored inside the database in a dictionary table, encrypted with a master key.
The master key, which may either be a symmetric key or an asymmetric key pair, is however not stored within the database, but rather in some external module. This enables separation of duties, but also means a separate backup procedure for the master key is necessary.
In Oracle 10.2 the TDE master key is stored in an Oracle Wallet, a password-protected container where security objects such as certificates and private keys may be stored, normally in PKCS#12 format. Wallets are not used solely for TDE; they are also used for keys and certificates associated with SSL communication.
There are two wallet types. A standard type wallet stores all credentials inside the wallet, whereas a PKCS#11 type wallet keeps some objects in an external PKCS#11 token. In the latter case the token password is stored inside the wallet, enabling access to the external token objects.
Even though Oracle 10.2 uses wallets to protect the TDE master key, only the standard wallet type is supported for TDE. If a PKCS#11 type is used, either errors are raised or the objects inside the PKCS#11 token aren’t used. Hence TDE master keys can’t be protected inside an HSM with the 10.2 database version.
Starting with Oracle 11.1, HSM support for Oracle TDE has been added. This support doesn’t use a PKCS#11 type wallet, though. It has a separate, non-wallet based implementation. The master key may be stored directly inside an HSM instead of a wallet. The HSM interface used is however still PKCS#11.
Oracle 11.1 also adds support for Tablespace Encryption, which is essentially file encryption. It uses a separate master key without HSM support. Tablespace encryption is not considered in this paper, which looks only at Oracle TDE key management integration; as such, the paper also does not cover access control and audit. A general TDE consideration is that it really is transparent: access control is not part of TDE, and is instead accomplished with the Oracle Database Vault feature.
TDE uses its own internal format for encryption. As specified in the 11.1 version of Oracle Advanced Security Administrator’s Guide:
"…Each encrypted value is associated with a 20 byte integrity check. In addition, transparent data encryption pads out encrypted value to 16 bytes. This means that if a credit card number requires 9 bytes for storage, then an encrypted credit card value will require an additional 7 bytes. Also, if data has been encrypted with salt, then each encrypted value requires an additional 16 bytes of storage.
To summarize, encrypting a single column would require between 32 and 48 bytes of additional storage for each row, on average…"
This is a bit cryptic, but presumably all encrypted values include a SHA-1 checksum. The cleartext plus checksum is padded to the block size before encryption. In addition, an IV (salt) value may be attached to the data. This data format is similar to DPS encryption, though Oracle TDE uses SHA-1 instead of CRC-32 for integrity, and integrity is not optional.
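The quoted figures can be reproduced with simple arithmetic. The sketch below follows the quote's own numbers (padding computed on the cleartext, plus a 20-byte integrity check, plus 16 bytes for salt); since the exact on-disk layout is not documented, treat this as one plausible reading rather than a specification.

```python
# Rough per-value storage overhead for TDE column encryption, following
# the figures quoted from the Oracle Advanced Security guide (assumed
# layout: pad cleartext to 16-byte blocks + 20-byte check + 16-byte salt).
def tde_overhead(cleartext_bytes: int, salted: bool = True) -> int:
    padded = ((cleartext_bytes + 15) // 16) * 16   # pad to block boundary
    overhead = (padded - cleartext_bytes) + 20     # padding + SHA-1 check
    if salted:
        overhead += 16                             # IV (salt)
    return overhead

# A 9-byte credit card number: 7 bytes padding + 20 + 16 = 43 extra bytes.
assert tde_overhead(9, salted=True) == 43
assert tde_overhead(9, salted=False) == 27
```

For a salted 9-byte value this gives 43 extra bytes, inside the 32-48 byte range the guide cites; such per-row overhead is worth modeling before encrypting wide tables.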