
Monday, 04 January 2010

Defining tokenization

Tokenization and encryption are the two technologies most commonly used to protect sensitive cardholder data, as the PCI DSS requires. Encryption is well defined and well understood, but tokenization isn’t. What exactly is tokenization? Here’s my definition. You can read more about it, including why I believe this definition makes sense and how it compares to the definition of encryption, in "Defining Tokenization and the Security Provides" in this month’s ISSA Journal.

A tokenization scheme comprises two stateful, deterministic algorithms: tokenize and detokenize. These operate on two strings called a plaintext and a token. The tokenize algorithm produces a token from a plaintext. The detokenize algorithm produces a plaintext from a token that has already been created by the tokenize algorithm.

A secure tokenization scheme is one in which the mutual information between a plaintext and the token that the tokenize algorithm creates from it is zero.
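For reference, the security condition can be written out with the standard information-theoretic definition of mutual information. The symbols P (the plaintext) and T (the token), both treated as random variables, are notation introduced here for illustration and are not taken from the article:

```latex
I(P;T) \;=\; \sum_{p,\,t} \Pr[P=p,\,T=t]\,
        \log \frac{\Pr[P=p,\,T=t]}{\Pr[P=p]\,\Pr[T=t]}

I(P;T) = 0
\iff
\Pr[P=p,\,T=t] = \Pr[P=p]\,\Pr[T=t] \quad \text{for all } p,\,t
```

In other words, the condition says that the token and the plaintext are statistically independent, so seeing a token tells an observer nothing about the plaintext it stands for.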

Comments

Greg

Ah. So the state is shared between tokenize and detokenize, and kept secret from the adversary. Thanks, I get it now.

Luther Martin

I'm sure that our sales guys have lots of information about our tokenization product. You can reach them at sales@voltage.com.

Luther Martin

Detokenization is typically done by a database lookup. When a token is created, an encrypted copy of the plaintext is archived along with the token. Then, to detokenize, you look up the ciphertext that corresponds to the token, decrypt the ciphertext, and provide the decrypted plaintext to the requesting application.
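The lookup flow described above might be sketched as follows. This is only an illustration under stated assumptions, not a description of any vendor's product: the `TokenVault` class, the table layout, and the `encrypt`/`decrypt` callables are all hypothetical, and the sketch issues a fresh random token on every call rather than reusing tokens for repeated plaintexts.

```python
import secrets
import sqlite3
from typing import Callable


class TokenVault:
    """Hypothetical vault: stores (token, ciphertext) rows and looks them up."""

    def __init__(self,
                 encrypt: Callable[[bytes], bytes],
                 decrypt: Callable[[bytes], bytes]) -> None:
        # The cipher is supplied by the caller; in practice it would come
        # from a vetted cryptographic library (an assumption of this sketch).
        self._encrypt = encrypt
        self._decrypt = decrypt
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE vault (token TEXT PRIMARY KEY, ciphertext BLOB NOT NULL)"
        )

    def tokenize(self, plaintext: str) -> str:
        # The token is random and is not derived from the plaintext.
        # A collision would violate the primary key and raise an error.
        token = secrets.token_hex(8)
        ciphertext = self._encrypt(plaintext.encode("utf-8"))
        self._db.execute(
            "INSERT INTO vault (token, ciphertext) VALUES (?, ?)",
            (token, ciphertext),
        )
        self._db.commit()
        return token

    def detokenize(self, token: str) -> str:
        # Look up the archived ciphertext, decrypt it, return the plaintext.
        row = self._db.execute(
            "SELECT ciphertext FROM vault WHERE token = ?", (token,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown token: {token}")
        return self._decrypt(row[0]).decode("utf-8")
```

The shared state here is the database plus the decryption key, which matches the point made in the thread: both have to be kept away from the adversary.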

Greg

Also, is a detailed description of the Voltage product available?

Greg

Thanks, Luther. So in the example where tokenize outputs a random value as the token for an input, how does detokenize work?

Luther Martin

By this definition, a one-up counter would be secure, because the mutual information between the token and the plaintext is zero. A random value would also be secure.

The tokenization product that Voltage sells uses a FIPS-validated PRNG to create tokens. As to what other people are using, that's a tough one. Tokenization vendors typically keep all the workings of their systems proprietary, so it's not at all clear how they create tokens.
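The two examples mentioned in this comment, a one-up counter and a random value, could be sketched roughly as below. All class and method names are hypothetical. The random variant uses Python's `secrets` module as a stand-in source of randomness; it is not meant to represent the FIPS-validated PRNG mentioned above or how any product works.

```python
import secrets


class _TableTokenizer:
    """Shared state: two lookup tables mapping plaintexts and tokens."""

    def __init__(self) -> None:
        self._token_of = {}      # plaintext -> token
        self._plaintext_of = {}  # token -> plaintext

    def _new_token(self) -> str:
        raise NotImplementedError

    def tokenize(self, plaintext: str) -> str:
        # A plaintext seen before gets the token it was already given.
        if plaintext in self._token_of:
            return self._token_of[plaintext]
        token = self._new_token()
        self._token_of[plaintext] = token
        self._plaintext_of[token] = plaintext
        return token

    def detokenize(self, token: str) -> str:
        # Only tokens previously produced by tokenize can be reversed.
        return self._plaintext_of[token]


class CounterTokenizer(_TableTokenizer):
    """One-up counter: tokens are "1", "2", "3", ... in order of arrival."""

    def __init__(self) -> None:
        super().__init__()
        self._next = 1

    def _new_token(self) -> str:
        token = str(self._next)
        self._next += 1
        return token


class RandomTokenizer(_TableTokenizer):
    """Random tokens: 16 hex characters drawn independently of the input."""

    def _new_token(self) -> str:
        while True:
            token = secrets.token_hex(8)
            if token not in self._plaintext_of:  # retry on a collision
                return token
```

In both variants the token is chosen without looking at the plaintext's value, which is the property the definition is after. Detokenization works only because the tables are kept as shared, secret state.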

Greg

Please give an example of a tokenization function that is (believed to be) secure by your definition. Can you also give examples of what functions people are using?
