How to secure persistent data in a multitenant Kubernetes environment

September 13, 2021

Introduction

The world is changing rapidly, with processes, application development and integration moving ever faster. Workflows and their applications are highly automated and connect directly to private or shared data endpoints around the globe. Amid this rapid change and digital transformation, data growth, the distribution of data and connecting applications with the right Service Level Agreements (SLAs) for availability, reliability and performance are challenges as well.

In a practical example, we will see how to secure your data using encryption and external key management close to the application, independently of the underlying infrastructure.

In this and upcoming blog posts, I will give you an overview of use cases from various industries where next-generation applications as a service and distributed storage are more important than ever. Practical examples of data-sharing applications with a distributed object storage backend are also planned.

For all use cases I prefer an open-source-centric approach that is independent of any hardware or cloud provider and driven by a huge community of over 15 million developers.

Use Case: Data security with storage encryption near the application

Why Persistent Storage in OpenShift / Kubernetes?

Red Hat’s OpenShift and OpenShift Data Foundation

In the development of cloud-native applications, there are different strategies to deal with the data to be processed in storage and use. Containers were developed as stateless, short-lived, lightweight tools with low memory consumption to accelerate the start of applications or to start them on demand (event-driven).
By design, however, when a container exits, the data created in it is lost as well. That data lives either in memory [1] or in a temporary file system. If data must be preserved even after the container has been cleaned up, the application in the container must connect directly to a storage system via connectors, for example to S3 object storage. If the application does not support this, local storage is attached via file (CephFS or NFS) or block (iSCSI, FC) protocols through the corresponding interface (Container Storage Interface, CSI) when a pod/container is deployed in Kubernetes. The block devices are usually formatted with XFS or ext4 as the file systems to be used.

The Container Storage Interface ensures that permanently available storage from various vendors is provided through predefined storage classes, for example SDS (software-defined storage) with Red Hat OpenShift Data Foundation, or NetApp ONTAP appliances with Trident. We are talking here about storage areas that are mounted into the container at a mount point (/data, /log, /tmpdata, …) and are meant to remain in place for the long term. This method is also a popular option for data portability when migrating data by import, but replication via the application is always preferable, because it is less error-prone and enables a more transparent transition or failover scenario.

Going back one step to container creation, developers also play a role here: tests regarding performance and SLAs must be taken into account, and persistent storage with high availability and adequate performance reduces development effort and improves quality assurance in the CI/CD pipeline. Development can stay close to production, which enables fast release cycles and continuous redeployment on a regular, fully automated basis. Moreover, a developer is not interested in how much infrastructure (e.g. bare metal) sits behind it or what kind of logo is stuck on it.

How is persistent storage structured in Kubernetes?

Kubernetes provides a Persistent Volume (PV) framework that allows a cluster administrator to provision persistent storage. Persistent volumes are requested by developers or the application deployment team through Persistent Volume Claims (PVCs) without any knowledge of the underlying infrastructure. A PVC not only establishes the connection between a pod and a PV, but can also request a storage class, which maps to the corresponding SLAs for performance (I/O or throughput) and maximum response time, as defined in the dynamic storage provisioner.

These automatisms not only ensure a suitable data area, but also reduce unused reserved storage by reclaiming the space once it is released.

An overview of what a PVC YAML definition looks like:

PVC object definition example

Source of picture: Link
Current 4.8 documentation: Link
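If the picture is not available, a minimal claim could look like the following sketch (name, size and storage class are placeholder assumptions):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim                     # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce                  # mounted read-write by a single node
  resources:
    requests:
      storage: 10Gi                  # requested capacity; the provisioner sizes the PV
  storageClassName: ocs-storagecluster-ceph-rbd   # assumed ODF block storage class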

How secure is my data on persistent storage?

With this question we now turn to the original topic of data security. As with data availability, data security can be applied at different levels. Data should be encrypted as early as possible, so that as few opportunities as possible remain to tap into the infrastructure stack. Since OpenShift Data Foundation (formerly OpenShift Container Storage) is software-defined storage, we are already independent of the infrastructure, and only a few prerequisites [2] need to be met. Several combinations allow encryption to be used in persistent storage:

  • Cluster level, to prevent hardware theft or to comply with regulations such as HIPAA
  • Persistent volume level, to ensure tenant isolation and privacy, with namespace-level KMS token control
    • currently for block devices only
  • Cluster and persistent volume level simultaneously
  • In-flight encryption, for secure transport of data in the Multicloud Object Gateway (MCG), or in the future also via Mutual Transport Layer Security (mTLS) for other storage classes, which is used by default in OpenShift Service Mesh [3]

Overview of the two currently most used methods:

Overview of the two currently most used methods

Since unencrypted data can be read and misused by various systems, these measures are nowadays unavoidable and at least one of the above methods is recommended.

As seen in the picture above, encryption with a key management server moves from sitting close to the disks/OSDs to being managed directly in the OpenShift cluster. HashiCorp Vault is the first supported external key manager for PVC encryption and namespace isolation. The KMS service should run on a dedicated service cluster so it can serve multiple clusters or applications in parallel.

In the future, data will also be encrypted in transit via mTLS and Service Mesh, to further increase data security and, for example, to do without VPN tunnels in hybrid scenarios.

HashiCorp Vault

Short introduction

Vault is one of the most popular open source secrets management systems. It comes with various pluggable components, called secrets engines and authentication methods, that allow you to integrate with a wide variety of external systems. The purpose of these components is to manage and protect your secrets in dynamic infrastructure (e.g. database credentials, passwords, API keys). In our scenario, Vault acts as an external key manager that stores and protects the encryption keys used for PVC encryption.

How to configure?

The demonstration covers only part of the configuration options and does not go into service users or certificates, for example. I also run the KMS server in the same cluster. Best practice recommends a dedicated OpenShift service cluster, which is used not only to deploy a KMS but can also host other shared services such as monitoring (e.g. Thanos for Prometheus), log evaluation, image repository provisioning (e.g. with Quay + Clair) and other operational tasks.

Attached is a picture of a recommended example with a ServiceCluster:

External KMS cluster as a service for several clusters

Both OpenShift clusters connect to the service cluster (left), creating a scalable and highly available design.

The following changes must be checked with a consultant for supportability before going into production.

Vault can be easily deployed in an OpenShift cluster using the official HashiCorp Helm Chart.

First, we download the values.yaml from GitHub and customize it for our environment. Many OpenShift-specific hints are already included.

Original values file: Link

The following adjustments must be made for a smooth installation.

Modified values file used in our test environment: Link
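Since the modified file is only linked above, here is a rough sketch of the kind of adjustments involved, assuming a three-node HA setup with integrated Raft storage on OpenShift (check the linked file for the exact values we used):

global:
  openshift: true          # use OpenShift-compatible objects (e.g. Route instead of Ingress)
server:
  route:
    enabled: true          # expose the Vault UI/API via an OpenShift route
  ha:
    enabled: true
    replicas: 3            # three Vault pods for high availability
    raft:
      enabled: true        # integrated Raft storage instead of an external backend
ui:
  enabled: true            # enable the web UI used later to inspect keys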

After customizing values.yaml, we prepare the OpenShift cluster.

1. Check Storage classes

If your cluster has multiple storage classes annotated as default, correct this with the following command:

# oc patch storageclass managed-nfs-storage -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'

This removes the default annotation from the managed-nfs-storage storage class.

Check again that only one default storage class remains:

# oc get storageclass
Output from the commands above

2. Deploy Pods

After preparing the storage backend, we create a namespace for the future pods containing the Vault containers and their replicas.

# oc new-project hc-vault --display-name 'KMS'

With the prepared Helm charts we deploy HashiCorp Vault in a fully automated way.

# helm install vault hashicorp/vault -f <path>/values.yaml -n hc-vault
Helm deployment output

If there is any issue with the Helm deployment, you can easily delete the release using the command below and redeploy.

# helm delete vault -n hc-vault

Post Installation

After the installation, Vault must be configured so that it can store encryption keys for ODF and knows which peers are available for high availability.

As a first step, Vault must be initialized with the vault operator init command. This returns a set of unseal keys and a root token. The keys must be saved securely and distributed among different responsible security or operations personnel. The root token gives privileged access to Vault; it can be used for the initial setup and should be revoked afterwards to avoid misuse.

Log in to any Vault node and issue the following commands:

# oc project hc-vault
Now using project "hc-vault" on server "https://api.ocp1.stormshift.coe.muc.redhat.com:6443".

# oc rsh vault-0 
sh-4.4$ vault operator init
Unseal Key 1: mfHrR0uxv4OzarYS4rAsbwNKpYl5y+NC3Frcvrdn/bMu
Unseal Key 2: keNjcSUrfFAsb/bKdX1qp2xWZek2yBMFPnjkYy+k46Rj
Unseal Key 3: 8/Oh8ge+dRcE1ViClB79IHWKwWNSrAEotJ70RSDNqLHE
Unseal Key 4: 1OOIhJmpI3l3CXYherxkC30t3L1HlaOXZafnGZRGVUWf
Unseal Key 5: bkqTcoCqS5/3eGshKwnCJTSnLfBGxKzBbrP+Y+3BMDj8

Initial Root Token: s.VB6eXwWUgiYpJfOuG2SwKDrO

After the unseal keys have been created, three of them must be applied with the vault operator unseal command:

vault operator unseal mfHrR0uxv4OzarYS4rAsbwNKpYl5y+NC3Frcvrdn/bMu
vault operator unseal keNjcSUrfFAsb/bKdX1qp2xWZek2yBMFPnjkYy+k46Rj
vault operator unseal 8/Oh8ge+dRcE1ViClB79IHWKwWNSrAEotJ70RSDNqLHE

For the remaining nodes, the information to join the leader is added and the unseal process is also carried out.

# oc rsh vault-1
# oc rsh vault-2
sh-4.4$ vault operator raft join http://vault-0.vault-internal:8200

sh-4.4$ vault operator unseal mfHrR0uxv4OzarYS4rAsbwNKpYl5y+NC3Frcvrdn/bMu
sh-4.4$ vault operator unseal keNjcSUrfFAsb/bKdX1qp2xWZek2yBMFPnjkYy+k46Rj
sh-4.4$ vault operator unseal 8/Oh8ge+dRcE1ViClB79IHWKwWNSrAEotJ70RSDNqLHE
(tip: execute each line individually and wait about 3 seconds between lines)

Finally, log in to Vault to enable the KV secrets engine, write a corresponding policy, and generate the token required to configure Vault in ODF.

# vault login s.VB6eXwWUgiYpJfOuG2SwKDrO
# vault operator raft list-peers
# vault auth enable userpass
# vault write auth/userpass/users/vaultuser password='pass321' policies=admins
# vault secrets enable -path=ocs kv
# echo 'path "ocs/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "sys/mounts" {
  capabilities = ["read"]
}' | vault policy write ocs -


# vault token create -policy=ocs -format json

Output:
{
  "request_id": "4591c3b2-7abb-8841-519a-0a1485dac31f",
  "lease_id": "",
  "lease_duration": 0,
  "renewable": false,
  "data": null,
  "warnings": null,
  "auth": {
    "client_token": "s.s2xZCUXQfjQizQK4gr5vCqVj",
    "accessor": "1YJCtWwmoISurqPglxqVDveA",
    "policies": [
      "default",
      "ocs"
    ],
    "token_policies": [
      "default",
      "ocs"
    ],
    "identity_policies": null,
    "metadata": null,
    "orphan": false,
    "entity_id": "",
    "lease_duration": 2764800,
    "renewable": true
  }
}

ODF installation and connection to HashiCorp Vault

After deploying the HashiCorp Vault key management service and completing the configuration steps, you can now deploy the ODF cluster, including the encryption method, and connect it to the key management server with just a few clicks.

First, go to Operators > OperatorHub and set a filter for “OCS” or “ODF” (the upcoming version 4.9). Click on OpenShift Container Storage, install it (the settings can remain at their defaults) and proceed with the setup wizard.

OpenShift Operator Hub – Filter “ocs”
Second step of configuring OpenShift Data Foundation: select a preferred encryption method
Add the backend path where the tokens will be created, without configuring SSL.
Summary of the pending configuration.
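If you prefer the CLI to the web console, the operator can also be subscribed to declaratively. A rough sketch follows; the channel name is an assumption for OCS 4.8, and the openshift-storage namespace plus an OperatorGroup must already exist:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-operator
  namespace: openshift-storage
spec:
  channel: stable-4.8              # assumed channel; check OperatorHub for the current one
  installPlanApproval: Automatic
  name: ocs-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace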

PVC Encryption

To use storage encryption at the PVC level, we configure a new storage class where the KMS is enabled.

In addition to the name of the storage class, we also specify the Ceph RBD (RADOS Block Device) provisioner and the storage pool where the volumes will be created and the data stored.

For more information on how encryption is implemented by using cryptsetup with the LUKS extension, see here: Link

Easy and fast configuration of a storage class
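Because the configuration above is only shown as a screenshot, here is a rough sketch of what the resulting storage class could look like (pool, clusterID and encryptionKMSID are assumptions based on a default ODF installation; the usual csi.storage.k8s.io/*-secret parameters are omitted for brevity):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-enc                                 # the class used by the PVCs below
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage
  pool: ocs-storagecluster-cephblockpool       # assumed default ODF block pool
  imageFormat: "2"
  imageFeatures: layering
  encrypted: "true"                            # enable per-PVC LUKS encryption
  encryptionKMSID: 1-vault                     # assumed ID of the Vault connection
reclaimPolicy: Delete
allowVolumeExpansion: true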

In order for the namespace to be allowed to generate encryption keys, the Vault client token must be provided as a Kubernetes secret in the namespace.

Create the secret only once, in the namespace where the application is running.

Use the client_token value from the JSON output of the Vault command above:

# cat <<EOF | oc create -f -
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-csi-kms-token
  namespace: hc-vault
stringData:
  token: "s.s2xZCUXQfjQizQK4gr5vCqVj"
EOF

When creating a PVC, you no longer have to worry about the KMS: from now on, every volume that uses this storage class is encrypted, and a new key is generated and stored in the KMS for it. More information can be found under “Deploy a test application and create PVCs”.

You do not need to worry about performance either: performance tests show only a minimal, insignificant decrease. Note that key generation and interaction with Vault are rare events, and the key is kept in memory to allow faster access.

PVC for a small, but encrypted volume

How can you check that it is working, and how do you troubleshoot?

First check the routes and the service; there you can also see the IP address.

Create a local port forwarding with the following command:

# oc port-forward svc/vault-ui 8200:8200 -n hc-vault

Then you can log in with the client_token in your local web browser and check under /ocs whether new keys have been created.

Before any new PVCs are created, this path is empty, unless you use cluster-wide encryption or, in version 4.7.x, have already attached a PVC to a pod.

The default content with cluster-wide encryption enabled looks like the following screenshot:

Check the connection from one of the ODF pods:

# curl -s http://172.30.47.166:8200

This command shows that communication from another container in the cluster to the KMS server is working.

More information about the status can also be requested via the API of HashiCorp Vault:

# curl -s http://vault.hc-vault.svc:8200/v1/sys/health
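On a healthy, unsealed leader, the health endpoint returns a small JSON document roughly like the following (the values shown are illustrative):

{
  "initialized": true,
  "sealed": false,
  "standby": false,
  "server_time_utc": 1631520000,
  "version": "1.8.2",
  "cluster_name": "vault-cluster-hc-vault"
}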

Deploy a test application and create PVCs

In this example [4], I have deployed a small test application so that the PVCs are actually accessed and a realistic use case is demonstrated.

# oc login
# cat <<EOF | oc create -f -
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-cephrbd1
  namespace: hc-vault
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
  storageClassName: sc-enc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-cephrbd2
  namespace: hc-vault
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: sc-enc
---
apiVersion: batch/v1
kind: Job
metadata:
  name: batch2
  namespace: hc-vault
  labels:
    app: batch2
spec:
  template:
    metadata:
      labels:
        app: batch2
    spec:
      restartPolicy: OnFailure
      containers:
      - name: batch2
        image: amazon/aws-cli:latest
        command: ["sh"]
        args:
          - '-c'
          - 'while true; do echo "Creating temporary file"; export mystamp=$(date +%Y%m%d_%H%M%S); dd if=/dev/urandom of=/mnt/file_${mystamp} bs=1M count=1; echo "Copying temporary file"; cp /mnt/file_${mystamp} /tmp/file_${mystamp}; echo "Going to sleep"; sleep 60; echo "Removing temporary file"; rm /mnt/file_${mystamp}; done'
        volumeMounts:
        - name: tmp-store
          mountPath: /tmp
        - name: tmp-file
          mountPath: /mnt
      volumes:
      - name: tmp-store
        persistentVolumeClaim:
          claimName: pvc-cephrbd1
          readOnly: false
      - name: tmp-file
        persistentVolumeClaim:
          claimName: pvc-cephrbd2
          readOnly: false
EOF

The following query shows the PVCs created and the keys generated for them.

# oc get pvc                

NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-cephrbd1   Bound    pvc-91948955-afb6-481e-a013-bd7ee628169a   500Gi      RWO            sc-enc         6s
pvc-cephrbd2   Bound    pvc-42e7e369-6eec-477c-a7a1-ecb4e0b4180c   500Mi      RWO            sc-enc         6s

How does it look in HashiCorp Vault?
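The screenshot is omitted here; if you prefer the CLI, the generated keys can also be listed directly in Vault with the ocs client token (a sketch — the actual key names correspond to the handles of the encrypted volumes and will differ in your cluster):

# oc rsh vault-0
sh-4.4$ vault login s.s2xZCUXQfjQizQK4gr5vCqVj
sh-4.4$ vault kv list ocs
Keys
----
csi-vol-<uuid-of-pvc-cephrbd1>
csi-vol-<uuid-of-pvc-cephrbd2>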

Note for external access

In this blog I have only deployed an internal KMS, which is intended more for testing purposes. The best-practice approach is to use an external KMS to achieve full separation of keys and data. For external access, an external DNS record with an FQDN (fully qualified domain name) is used instead of the internal DNS name of the service (vault.hc-vault.svc).

In my example, this would be as follows:

hc-vault-route-hc-vault.apps.ocp1.stormshift.coe.muc.redhat.com

Route details
Configuration with FQDN
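For reference, the KMS connection that ODF and the storage class refer to is stored in a ConfigMap in the openshift-storage namespace. A hedged sketch of what an entry pointing at the external FQDN could look like (the key name 1-vault and the field values are assumptions):

apiVersion: v1
kind: ConfigMap
metadata:
  name: csi-kms-connection-details
  namespace: openshift-storage
data:
  1-vault: |-
    {
      "KMS_PROVIDER": "vaulttokens",
      "KMS_SERVICE_NAME": "1-vault",
      "VAULT_ADDR": "http://hc-vault-route-hc-vault.apps.ocp1.stormshift.coe.muc.redhat.com:80",
      "VAULT_BACKEND_PATH": "ocs",
      "VAULT_NAMESPACE": "",
      "VAULT_TLS_SERVER_NAME": ""
    }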

Conclusion

The highly flexible deployment of software-defined storage with OpenShift Data Foundation allows anyone to quickly and easily provide fully integrated, highly available and encrypted persistent storage for developers, operations and, ultimately, the application, on any internal disks or CSI-compatible external storage. This opens up all avenues for hybrid cloud deployment and scaling. In addition, you get various dashboards and reporting through the full integration of ODF in OpenShift.

Just leave a comment if you would like more information about this topic or the products mentioned in this blog. Any feedback or collaboration is highly appreciated! Our code is open …

Special thanks to Kapil Arora from HashiCorp for assisting with the installation and configuration of HashiCorp Vault (https://www.linkedin.com/in/kaparora/)

[1] Red Hat Data Grid (based on the Infinispan open source project) can be deployed as an embedded library, as a standalone server, or as a containerized application on Red Hat OpenShift Container Platform, providing consistent data distribution and scalable in-memory cache modes.

[2] https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.7/html/planning_your_deployment/infrastructure-requirements_rhocs

[3] https://docs.openshift.com/container-platform/4.6/service_mesh/v2x/ossm-security.html#ossm-security-mtls_ossm-security

[4] https://red-hat-storage.github.io/ocs-training/training/ocs4/ocs4-encryption.html#_test_application

Additional sources:

https://www.vaultproject.io/docs/platform/k8s/helm/openshift

https://learn.hashicorp.com/tutorials/vault/kubernetes-openshift?in=vault/kubernetes

https://github.com/hashicorp/vault-helm/blob/master/values.yaml

https://red-hat-storage.github.io/ocs-training/training/ocs4/ocs4-encryption.html


https://catalog.redhat.com/software/containers/hashicorp/vault-k8s/5fda6941ecb524508951c434?container-tabs=gti

https://catalog.redhat.com/software/containers/hashicorp/vault/5fda55bd2937386820429e0c?container-tabs=gti

https://catalog.redhat.com/software/containers/hashicorp/vault-k8s/5fda6941ecb524508951c434