Gunter Albrecht

Kyma – Kubernetes on SAP BTP: Where’s my data?

It’s been a while since I walked my first steps using Kyma on SAP Business Technology Platform (BTP). If you haven’t had the chance to try it out you might want to read my previous blog first where I explain pods, deployments, services, volume claims and more: Kyma, Kubernetes, Kanban? Deploying a multi-service application on SAP BTP

I’ve made some progress meanwhile, deploying all kinds of applications through Kyma, and the speed and ease of getting them up and running are satisfying! That is, if all works as expected 😁

Abstract

We will learn how Kyma handles data storage on AWS, Azure, and GCP for persistent volume claims (PVCs) and how storage can be monitored during operation.

We then back up a PVC by taking a volume snapshot. Finally we restore a volume from its snapshot.

I should mention I’m a mere user of Kyma/Kubernetes, so I encourage you to comment!

Data storage concept

Simply put, Kubernetes is great. Upload a magic YAML file and 30 seconds later you have a running application with database, backend, frontend, what have you! You need storage space for your database that survives a pod crash? Just define a PVC and state the size you need! No more buying storage, it’s provided by Kyma!

Eh, wait – that can’t be. Some place must offer this space to Kyma! 🤔

Indeed. The underlying hyperscaler, be it AWS, Azure, or GCP, provides the storage in certain GiB increments. And of course this storage is billed like any other. Kyma handles this storage management within the boundaries of the minimum increment and the maximum storage size. Strictly speaking, this is not done by Kyma itself but by an abstraction layer called Gardener, but let’s just stop here.

Let’s look at the lifecycle of your persistent volume claims:

  1. You create a namespace – that’s mandatory to get started.
  2. You define a PVC of a certain size inside the namespace – nothing happens yet.
  3. You reference the PVC from a pod or deployment definition – now the volume is physically created and made available.
  4. You use your application. The PVC can be accessed for read/write operations unless you specified otherwise. That consumes the PVC’s space.
  5. Your pod crashes – no impact on the PVC. The deployment will automatically create a new pod and life goes on. 😅
  6. You delete your deployment – no impact on the PVC and its data. However, your application is down until you recreate it.
  7. You delete the namespace in which the PVC resides – say goodbye to your data. It’s gone. Don’t delete namespaces lightly. 😁
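The claim from step 2 can be declared with a minimal manifest like this (reconstructed from the kubectl outputs shown later in this blog; treat it as a sketch rather than my exact file):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data-claim
  namespace: leantime-prod
spec:
  accessModes:
  - ReadWriteOnce          # RWO: read/write by a single node at a time
  resources:
    requests:
      storage: 3Gi         # the hyperscaler rounds up to its minimum increment
  storageClassName: default
```

Applying this with `kubectl apply -f` only records the claim; as described in step 3, the actual volume is provisioned once a pod references it.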

Monitoring the PVCs

You can use the command-line tool kubectl for that. It looks like this:

$ kubectl get pvc
NAME                     STATUS   VOLUME                                                           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
db-data-claim            Bound    pv-shoot--kyma--c-88f9b3b-dc046771-ca16-4533-ac81-61957f194747   3Gi        RWO            default        74m
public-user-data-claim   Bound    pv-shoot--kyma--c-88f9b3b-52a4844f-4a42-4b39-a5f8-5fe7f55b508d   1Gi        RWO            default        70m
user-data-claim          Bound    pv-shoot--kyma--c-88f9b3b-b6161428-12b1-4f36-a108-6950e6de7f5f   2Gi        RWO            default        74m

You can get more information this way:

$ kubectl describe pvc  db-data-claim
Name:          db-data-claim
Namespace:     leantime-prod
StorageClass:  default
Status:        Bound
Volume:        pv-shoot--kyma--c-88f9b3b-dc046771-ca16-4533-ac81-61957f194747
Labels:        leantime.service=db-data-claim
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
               volume.kubernetes.io/selected-node: ip-10-250-13-188.ap-southeast-1.compute.internal
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      3Gi
Access Modes:  RWO
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      snapshot-db-data
Mounted By:  db-747b89d64d-kxkh9

Good – you can also use the Kyma monitoring, which calls up Grafana.


Kyma Cockpit – Jump to logs through clicking the button

You will then see the storage situation (among many other KPIs).


Persistent volume claim monitoring in Grafana
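Under the hood, these Grafana panels read the kubelet’s volume metrics from Prometheus. If you prefer querying them directly, a ratio like this shows how full each PVC is (I’m assuming the standard kubelet metric names here; check your Prometheus instance for the exact labels available):

```
kubelet_volume_stats_used_bytes{namespace="leantime-prod"}
  / kubelet_volume_stats_capacity_bytes{namespace="leantime-prod"}
```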

Taking a snapshot of a PVC

You can take a snapshot anytime. This freezes the content of the PVC and copies it to a separate snapshot of the volume. Why would you want to do that? Most likely you need backups of your volumes to restore a database or file system in case the application corrupts it or a user deletes important data.

For that you create a YAML file and upload it to Kyma. For the example above it would look like this:

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snapshot-db-data-for-blog
spec:
  volumeSnapshotClassName: default
  source:
    persistentVolumeClaimName: db-data-claim

I will now upload it to Kyma and quickly check through the command line for the result.

$ kubectl get volumesnapshot snapshot-db-data-for-blog
NAME                        READYTOUSE   SOURCEPVC       SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
snapshot-db-data-for-blog   false        db-data-claim                                         default         snapcontent-64eccaaf-b928-48be-92cf-07f02ae25d95                  1s

One second after uploading the snapshot definition we see it exists. However, the READYTOUSE flag is false.

$ kubectl get volumesnapshot snapshot-db-data-for-blog
NAME                        READYTOUSE   SOURCEPVC       SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
snapshot-db-data-for-blog   true         db-data-claim                           3Gi           default         snapcontent-64eccaaf-b928-48be-92cf-07f02ae25d95   86s            2m3s

A bit later the snapshot is ready – it took 86 seconds.

Restoring a PVC from a snapshot

Finally, let’s use this snapshot to restore a PVC. For that we intentionally delete the deployment and subsequently the referenced PVC. This can be done through the CLI:

$ kubectl delete deployment db
deployment.apps "db" deleted
$ kubectl delete pvc db-data-claim
persistentvolumeclaim "db-data-claim" deleted

Let’s hope and pray we can bring it back! 😱

For that we create a yaml file like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    leantime.service: db-data-claim
  name: db-data-claim
spec:
  dataSource:
    name: snapshot-db-data-for-blog
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

And upload it to Kyma. We check the command line again for the PVC.

$ kubectl get pvc
NAME                     STATUS    VOLUME                                                           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
db-data-claim            Pending                                                                                              default        7s

Pending? Yes – remember the life-cycle explained at the beginning of the blog? The volume is only provisioned once a pod consumes the claim. We can also get more details this way:

$ kubectl describe pvc db-data-claim
Name:          db-data-claim
Namespace:     leantime-prod
StorageClass:  default
Status:        Pending
Volume:
Labels:        leantime.service=db-data-claim
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      snapshot-db-data-for-blog
Mounted By:  <none>
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  WaitForFirstConsumer  4s (x8 over 106s)  persistentvolume-controller  waiting for first consumer to be created before binding
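The WaitForFirstConsumer event confirms the claim binds only when a pod mounts it. The relevant part of the db deployment looks roughly like this (an illustrative sketch – the container image and mount path are my assumptions, not taken from the actual Leantime file):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
  namespace: leantime-prod
spec:
  replicas: 1
  selector:
    matchLabels:
      leantime.service: db
  template:
    metadata:
      labels:
        leantime.service: db
    spec:
      containers:
      - name: db
        image: mysql:8.0              # assumption: a MySQL backend
        volumeMounts:
        - name: db-data
          mountPath: /var/lib/mysql   # mount the restored volume at the data dir
      volumes:
      - name: db-data
        persistentVolumeClaim:
          claimName: db-data-claim    # this reference triggers the binding
```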

Ok, so let’s deploy the db instance again (“deploy the deployment” 😂) by uploading the YAML file (you can check my first blog if you want to know what such a file looks like).

Done! Let’s check the command line once more:

$ kubectl describe pvc db-data-claim
Name:          db-data-claim
Namespace:     leantime-prod
StorageClass:  default
Status:        Bound
Volume:        pv-shoot--kyma--c-88f9b3b-16436311-b704-4621-86bb-1ba3f369faf4
Labels:        leantime.service=db-data-claim
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
               volume.kubernetes.io/selected-node: ip-10-250-13-188.ap-southeast-1.compute.internal
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      3Gi
Access Modes:  RWO
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      snapshot-db-data-for-blog
Mounted By:  db-747b89d64d-6r4xv
Events:
  Type    Reason                 Age                  From                                                                                         Message
  ----    ------                 ----                 ----                                                                                         -------
  Normal  WaitForFirstConsumer   71s (x17 over 5m8s)  persistentvolume-controller                                                                  waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   65s (x2 over 65s)    persistentvolume-controller                                                                  waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
  Normal  Provisioning           61s (x2 over 65s)    ebs.csi.aws.com_csi-driver-controller-746cd95894-swwnb_cd3a8cdb-e4d2-4583-9e6b-cf8ca3916b0f  External provisioner is provisioning volume for claim "leantime-prod/db-data-claim"
  Normal  ProvisioningSucceeded  61s (x2 over 61s)    ebs.csi.aws.com_csi-driver-controller-746cd95894-swwnb_cd3a8cdb-e4d2-4583-9e6b-cf8ca3916b0f  Successfully provisioned volume pv-shoot--kyma--c-88f9b3b-16436311-b704-4621-86bb-1ba3f369faf4

And our database is back – with the data we had before deletion.

Summary

We went through the life-cycle of a PVC, monitoring, taking a snapshot, and data restore/recovery. Let me know what your experiences were, your questions or challenges! 😌 Have a great day ahead!



      1 Comment
      Benny Schaich-Lebek

      Hi Gunter and your fellow readers,

      that nicely explains some of the cool features of Kyma and Kubernetes but at the same time gives me kind of a headache for certain use cases.

      While the use case you have in mind (acting on database files) is totally reasonable, some people may interpret this as a general way of file handling.

      However, this transports the old paradigm of file handling that we all have in our heads and I wish I could erase that. What I usually tell people about cloud is that "there are no files anymore" (which we all know is wrong, of course) to make them step out of their comfort zone and see "THE LIGHT" 😉

      So, for anybody who wants to handle files, ALWAYS first consider document management. If it totally makes no sense, then go ahead with the approach mentioned here.

       

      And another one: this use case is about provisioning your own database. Think about it: one of the biggest advantages of cloud is using managed services. Introducing your own database means you will have to keep maintaining it. Is that worth it? It may be, if it is an interim solution for a database planned to become a managed service on Kyma. But as I'm not from PM, I'm not entitled to talk about those plans, and unfortunately they do not yet appear on the road map. - Well, just checked: PostgreSQL now IS on the roadmap (https://roadmaps.sap.com/board?PRODUCT=73554900100800003012&range=CURRENT-LAST#Q2%202022)

      I hope that helps some of you,

      Benny