How BookMyShow re-encrypted 25 million+ MongoDB documents

Originally posted at BookMyShow Tech Blog

Most online services offer a module for saving payment methods, which makes payments faster and, in turn, improves conversion. We call ours Quikpay. Sensitive customer details such as card numbers and UPI IDs are stored encrypted.

While working towards PCI-DSS certification, we ran into an unusual requirement: we needed to generate new keys and re-encrypt customer data. The challenges involved:

  • Revamping of the in-house key management system
  • Developing an application which would safely re-encrypt data without affecting the customer’s experience
  • Standardisation of fields across documents in our MongoDB collection
  • Preventive measures taken to protect keys from being leaked

Revamping of the in-house key management system

In our organization's early growth stages, we built an in-house key management system (KMS) that later fell out of maintenance and was never well documented. It was developed on .NET Framework 4.x, and all of its methods were synchronous.

In the age of microservices, it was becoming increasingly important to adopt cross-platform solutions. We decided to port the codebase to .NET 5 (the successor to .NET Core 3) and make the code asynchronous to better serve our millions of users. At the time, this felt like an easier jump than integrating with AWS KMS. Read more on our migration journey to AWS cloud.

Under the hood, the KMS uses the Advanced Encryption Standard (AES-256).
Some of its notable features are:

  • It is a symmetric-key algorithm, meaning the same key is used for both encrypting and decrypting the data
  • Cipher Block Chaining (CBC) is the mode of operation we use. A block is a sequence of bits that is encrypted as one unit. CBC requires an initialization vector (IV)
  • The key size is 256 bits, hence the name AES-256. The block size is always 128 bits, so the IV is 128 bits as well.

Every key comes as a pair of parameters: the key itself and an IV. We use two sets of these pairs: one to encrypt the data (Data Encryption Keys, or DEKs) and another to encrypt the token reference to that data (Key Encryption Keys, or KEKs).
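To make the sizes concrete, here is a minimal Python sketch of generating one key/IV pair and padding a value to whole blocks before CBC encryption. It is only an illustration: the function names are made up for this example, and PKCS#7 padding is an assumption, since the post does not say which padding scheme the KMS uses. The AES encryption step itself is left out to keep the sketch on the standard library.

```python
import secrets

AES_BLOCK_BYTES = 16    # AES always operates on 128-bit blocks
AES256_KEY_BYTES = 32   # the "256" in AES-256 is the key length in bits


def new_key_and_iv() -> tuple[bytes, bytes]:
    """Generate a random 256-bit key and a 128-bit IV (hypothetical helper)."""
    return secrets.token_bytes(AES256_KEY_BYTES), secrets.token_bytes(AES_BLOCK_BYTES)


def pkcs7_pad(data: bytes, block_size: int = AES_BLOCK_BYTES) -> bytes:
    """Pad data to a whole number of blocks (assumed padding scheme: PKCS#7)."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = AES_BLOCK_BYTES) -> bytes:
    """Remove PKCS#7 padding, rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

Note that a value that already fills a whole number of blocks still receives one full block of padding, which is what lets the padding be removed unambiguously after decryption.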

How card details are encrypted and saved in documents

Let's go through the steps one by one.

  1. The application sends a request to the Key Management System with two parameters:
     a. Sensitive payment data: for example, the card number 5555 5555 5555 4444
     b. App server KEK: the KEK assigned to each server to authenticate its requests at the KMS. It is denoted by the yellow key in the diagram.
  2. The KEK verifier module authorises requests from the various applications.
  3. Once the request is authorised, a random token is generated.
  4. The token is mapped to another set of keys (DEKs).
  5. The DEK assigned to the token is picked. It is denoted by the blue key in the diagram.
  6. The card number is encrypted with the DEK.
  7. The encrypted data and the token are saved in a MongoDB document.
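The flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual KMS: the class and method names are hypothetical, the token is mapped to a DEK by picking a random index (the post does not say how the mapping works), and `_toy_cipher` is a deliberately insecure stand-in for AES-256-CBC so that the example runs on the standard library alone.

```python
import hashlib
import hmac
import secrets


def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """NOT real encryption: XOR with a SHA-256-derived keystream.

    Stand-in for AES-256-CBC; applying it twice with the same key
    returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyManagementSketch:
    """Hypothetical, in-memory model of the request flow."""

    def __init__(self, server_keks: list[bytes], deks: list[bytes]):
        self._server_keks = server_keks  # KEKs issued to app servers
        self._deks = deks                # pool of data encryption keys
        self._token_to_dek = {}          # token -> index into the DEK pool

    def _authorised(self, kek: bytes) -> bool:
        # KEK verifier: timing-safe comparison against each registered KEK.
        return any(hmac.compare_digest(kek, known) for known in self._server_keks)

    def encrypt(self, kek: bytes, sensitive: str) -> dict:
        if not self._authorised(kek):
            raise PermissionError("unknown app server KEK")
        token = secrets.token_hex(16)                   # random token
        dek_index = secrets.randbelow(len(self._deks))  # assumed mapping rule
        self._token_to_dek[token] = dek_index
        ciphertext = _toy_cipher(self._deks[dek_index], sensitive.encode())
        # These two fields are what would be saved in the document.
        return {"token": token, "data": ciphertext.hex()}

    def decrypt(self, kek: bytes, document: dict) -> str:
        if not self._authorised(kek):
            raise PermissionError("unknown app server KEK")
        dek = self._deks[self._token_to_dek[document["token"]]]
        return _toy_cipher(dek, bytes.fromhex(document["data"])).decode()
```

The point of the token indirection is that the stored document never reveals which DEK protects it; only the KMS can resolve a token to a key.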

The decryption process follows a similar pattern. Using the saved token, we look up the matching DEK in the pool, decrypt the data and pass it on to the payment method the user selected.

This time, to support backward compatibility, we also added versioning of the keys. This let us take our changes live in phases.

Developing an application which would safely re-encrypt data without affecting the customer’s experience

Though the certification asserts trust in the industry, we did not want to compromise on the customer experience. We decided to roll out our changes in phases.
First, we tested them on all employee accounts. Then we divided the entire data set, based on recency of usage, into chunks spanning 3 to 6 months each.

Let’s see what happened at the document level.

Flowchart of processing per document

As the flowchart shows, we loop through a customer's saved payment modes and re-encrypt them with the new keys, but just before saving them back we perform an additional check: we identify whether the customer is currently modifying their saved payments. If so, we skip the update and retry it in the next batch.
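A rough Python sketch of this per-document step follows. The post does not say how a concurrent modification is detected, so the sketch assumes a `revision` counter that is bumped on every customer write (a hypothetical field) and uses an in-memory store as a stand-in for the MongoDB collection. All names here are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PaymentDocument:
    customer_id: str
    payment_modes: list   # encrypted values
    key_version: int = 1  # which key generation encrypted the values
    revision: int = 0     # hypothetical: bumped on every customer write


class InMemoryStore:
    """Stand-in for the MongoDB collection."""

    def __init__(self):
        self.docs = {}

    def save_if_unchanged(self, doc_id, expected_revision, new_modes, new_version) -> bool:
        current = self.docs[doc_id]
        if current.revision != expected_revision:
            return False  # the customer changed the document in the meantime
        current.payment_modes = new_modes
        current.key_version = new_version
        current.revision += 1
        return True


def reencrypt_document(store, doc_id, decrypt_old, encrypt_new, new_version=2) -> str:
    """Re-encrypt one document; return 'updated', 'skipped' or 'already_done'."""
    doc = store.docs[doc_id]
    if doc.key_version >= new_version:
        return "already_done"
    seen_revision = doc.revision
    # Loop through the saved payment modes and re-encrypt each one.
    new_modes = [encrypt_new(decrypt_old(mode)) for mode in doc.payment_modes]
    # The additional check just before saving.
    if store.save_if_unchanged(doc_id, seen_revision, new_modes, new_version):
        return "updated"
    return "skipped"  # left on the old version, so the next batch picks it up
```

Against a real MongoDB collection, the same idea could be expressed as a conditional update whose filter includes the revision that was read, so that the write matches nothing if the document has changed.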

As you can probably guess, updating these documents strictly one at a time is not practical at this scale. We leveraged asynchronous batch processing at this juncture: the script picks unprocessed documents based on their version and launches asynchronous tasks to update them with the re-encrypted values.

After multiple tries, we arrived at a good enough batch size. If the batch is too big, the database queue gets overwhelmed; if it is too small, the script takes ages to complete its run.

For ease of control, we put checks to pause re-encryption attempts after every batch.
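The batching and the pause check can be sketched with Python's `asyncio`. This is a simplified illustration rather than the actual script (which, per the earlier section, runs on .NET); `run_in_batches`, `demo_process` and the `should_pause` callback are hypothetical names.

```python
import asyncio


async def demo_process(doc_id):
    """Placeholder for the real per-document re-encryption task."""
    await asyncio.sleep(0)  # simulates waiting on the database
    if doc_id == 3:
        raise ValueError("simulated failure")
    return "updated"


async def run_in_batches(doc_ids, process_one, batch_size, should_pause=lambda: False):
    """Process documents batch by batch and return {doc_id: outcome}."""
    outcomes = {}
    for start in range(0, len(doc_ids), batch_size):
        if should_pause():
            break  # operator asked to stop; the rest stays unprocessed
        batch = doc_ids[start:start + batch_size]
        # Run the whole batch concurrently; collect errors instead of aborting.
        results = await asyncio.gather(
            *(process_one(doc_id) for doc_id in batch), return_exceptions=True
        )
        for doc_id, result in zip(batch, results):
            outcomes[doc_id] = "failed" if isinstance(result, Exception) else result
    return outcomes
```

A call such as `asyncio.run(run_in_batches(ids, demo_process, batch_size=500))` processes the ids 500 at a time. The value 500 is only a placeholder, since the batch size that was finally chosen is not stated here.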

Standardisation of fields across documents in our MongoDB collection

While running the job, we got stuck on yet another issue. The collection has been active for the past 10 years, and over this decade digital payments in India have undergone myriad changes. Consequently, some of the data felt like a relic, and the script would skip any document in which it encountered a legacy field.

We modified the script to “dry run” through the entire data set, reporting the problematic fields without writing anything back.

Later, we extended the main script with custom serialization and deserialization methods so that it corrects those erroneous fields whenever it updates a document.
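As an illustration of the dry-run idea, the Python sketch below maps legacy field names to canonical ones and reports what it would change without writing anything. The field names in `LEGACY_FIELD_NAMES` and `KNOWN_FIELDS` are invented for the example; the actual legacy fields are not listed in this post.

```python
# Hypothetical legacy-to-canonical field names, for illustration only.
LEGACY_FIELD_NAMES = {"cardNo": "card_number", "CardNumber": "card_number", "vpa": "upi_id"}
KNOWN_FIELDS = {"card_number", "upi_id", "token", "version"}


def normalise(doc: dict) -> tuple[dict, list[str]]:
    """Return a copy with canonical field names plus a list of problems found."""
    clean, problems = {}, []
    for name, value in doc.items():
        canonical = LEGACY_FIELD_NAMES.get(name, name)
        if canonical != name:
            problems.append(f"legacy field {name!r} -> {canonical!r}")
        elif canonical not in KNOWN_FIELDS:
            problems.append(f"unknown field {name!r}")
        clean[canonical] = value  # unknown fields are kept, only flagged
    return clean, problems


def dry_run(docs: list[dict]) -> dict:
    """Report problems per document index without modifying anything."""
    report = {}
    for index, doc in enumerate(docs):
        _, problems = normalise(doc)
        if problems:
            report[index] = problems
    return report
```

The real run can then reuse `normalise` at write time, so the dry run and the actual update agree on what counts as a legacy field.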

Preventive measures taken to protect keys from being leaked

After brainstorming with various stakeholders, we decided to store the keys in a vault whose access is restricted to the Security Operations team. The credentials to this vault were supplied to our script as Kubernetes secrets.
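Kubernetes can expose a secret to a pod either as environment variables or as files in a mounted volume. A script could pick up its vault credentials along these lines; this is a sketch, and the secret name and the mount path `/etc/vault-credentials` are assumptions made for the example.

```python
import os
from pathlib import Path


def read_secret(name: str, mount_dir: str = "/etc/vault-credentials") -> str:
    """Read a secret from an environment variable, else from a mounted file.

    Both the environment variable name (upper-cased `name`) and the
    default mount directory are hypothetical.
    """
    env_value = os.environ.get(name.upper())
    if env_value:
        return env_value
    secret_file = Path(mount_dir) / name
    if secret_file.is_file():
        # Each key of a mounted secret appears as one file in the directory.
        return secret_file.read_text().strip()
    raise RuntimeError(f"secret {name!r} was not provided")
```

Either way, the credential stays out of the script's source code and its container image.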

The activity ended up not only securing our customers' data, but also cleaning up the way we store that data today. The revamp of the KMS also helped us move a few applications from resource-intensive Windows servers to Kubernetes pods.

Way Forward

You must be wondering how long the entire exercise took. Weeks! Updating documents one by one is not very efficient: the script waits until a customer has finished making changes and re-encrypts only once the purchase is done.

For historical data that has a very low probability of being accessed, we ran a variant of bulk updates during periods of low traffic. With newer versions of MongoDB, transactions can also be considered as an option.