PostgreSQL Database with Kubernetes for Performance and Collaboration

In some earlier tutorials we introduced Kubernetes cluster manager and discussed using it with MySQL database, Oracle Database, and Oracle NoSQL Database. Kubernetes cluster manager orchestrates a cluster of Docker containers, providing replication, auto scaling, rolling updates, service abstraction, high availability, and load balancing. In this tutorial we shall discuss using PostgreSQL with Kubernetes.

How Does Kubernetes Improve Performance?

Kubernetes makes feasible a microservices architecture which has several benefits as compared to the monolithic architecture. Kubernetes is suitable for developing scalable, reliable database services. A Pod is a set of one or more containers, each of which is running an application (such as a database), which could be same or different across a Pod. A service is an abstraction over a set of Pods for clients to be able to invoke a Pod application.

Earlier versions of Docker did not provide any native cluster management features. But Docker Engine 1.12 integrates the Docker Swarm cluster manager. Docker Swarm does overcome some of the earlier limitations of Docker by providing replication, load balancing, fault tolerance, and service discovery, but Kubernetes provides some features suitable for developing object-oriented applications. A Pod abstraction is the atomic unit of deployment in Kubernetes. A Pod may consist of one or more containers. Co-locating containers has several benefits as Docker containers in a Pod share the same networking and file system and run on the same node, which reduces the network latency between the containers. Docker Swarm does not support auto scaling directly. While Docker Swarm is Docker native, Kubernetes is more production ready, having been used in production at Google for more than 15 years.

How Does Kubernetes Improve Collaboration?

Pods at the backend of a Kubernetes service collaborate to serve client requests. Client traffic to a Kubernetes service is routed to one of the Pods with round robin load balancing which is transparent to the client accessing a service. A service is loosely coupled with the Pods it represent, which makes it easier to modify or update either of the service or the Pods tier. Pods may be added or removed without an interruption of service.

Kubernetes Helm charts provide collaboration among users. With Helm charts resource definitions developed by end users may be shared among other users who might have a similar application requirement. Helm is a package manager for Kubernetes, just as YUM is a package manager for RPM based systems. Helm charts are the packages managed by Helm, similar to the RPM packages. Helm charts are curated, reusable application definitions for Kubernetes Helm.

In this tutorial we shall discuss using PostgreSQL database with Kubernetes. The benefits of Docker container are a lightweight, scalable platform for developing applications and using software in a virtual-machine-like environment, with each Docker container being a snapshot of the underlying OS. A virtual machine, in contrast, packages a whole OS, which could be several GB and not recommended for developing lightweight and scalable applications.

Setting the Environment

In this example, we have used Ubuntu on Amazon EC2 to run a multi-node (2 nodes) Kubernetes cluster.

1185.deepakv_PostgreSQL_article_01

Two approaches may be taken to running PostgreSQL database on a Kubernetes cluster. One is the declarative approach, in which the database replica and service configuration are provided using YAML configuration files. The second approach is the imperative-style command-line. We shall discuss each of these.

Configuring the Replication Controller and Service Declaratively

Create two YAML files, one for the replication controller definition and the other for the service definition. The postgres-rc.yaml is for a replication controller and the postgres-service.yaml is for the service.

0537.deepakv_PostgreSQL_article_02

In the postgres-rc.yaml specify two replicas and a Pod spec for the Docker image “postgres”. The Pod template metadata includes a label “app: postgres”, which translates to app=postgres. The postgres-rc.yaml is listed:

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: postgres
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        -
          image: postgres
          name: postgres

Validate the YAML syntax using some YAML validator such http://www.yamllint.com/ as improper spacing could result in a syntax error. To validate the schema conformance of the replication controller definition file the schema is available at http://kubernetes.io/docs/api-reference/v1/definitions/#_v1_replicationcontroller. The RC YAML file is shown in a vi.

3252.deepakv_PostgreSQL_article_03

The postgres-service.yaml file defines a Service. The selector expression should be the same as a Pod template label in the replication controller for the service to route traffic to the Pods. The spec label is set to “app: postgres”, which translates into app=postgres, and is the same as the Pod template label in the postgres-rc.yaml file.

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: postgres
  name: “postgres”
spec:
  ports:
    -
      port: 5432
  selector:
    app: postgres

Validate the YAML file. The YAML validator only validates the syntax and not the schema conformance. The Service schema is available at http://kubernetes.io/docs/api-reference/v1/definitions/#_v1_service. The service YAML file is shown in a vi.

2604.deepakv_PostgreSQL_article_04

Creating a PostgreSQL Service Declaratively

To create the Kubernetes service from the service definition file, run the following kubectl create command.

kubectl create -f postgres-service.yaml

Subsequently list the services.

kubectl get services

As the output indicates, a service called “postgres” gets created and listed.

8032.deepakv_PostgreSQL_article_06

Describe the service.

kubectl describe svc postgres

The service description lists the service name, namespace, labels, selector, and IP. No service endpoints are initially listed, as Pods have not been created yet.

7484.deepakv_PostgreSQL_article_07

Creating a PostgreSQL Replication Controller Declaratively

Having created a service, next we shall create a replication controller. Run the following kubectl create command to create a replication controller from the postgres-rc.yaml file.

kubectl create -f  postgres-rc.yaml

A replication controller called “postgres” gets created.

4722.deepakv_PostgreSQL_article_08

Subsequently list the replication controllers.

kubectl get rc

And describe the rc.

kubectl describe rc postgres

The “postgres” rc gets listed. The rc description indicates that 2 replicas are running.

3264.deepakv_PostgreSQL_article_09

List the pods.

kubectl get pods

Two Pods get listed as running and ready.

6888.deepakv_PostgreSQL_article_10

When we first listed the service description no service endpoints were listed. List the svc again.

kubectl describe rc postgres

Two service endpoints corresponding to the two Pods get listed. The service endpoints route traffic to Pods.

16032.deepakv_PostgreSQL_article_11

Creating a PostgreSQL Database Table

In this section we shall create a database table. Two alternatives are available to start an interactive shell. One is to use the kubectl exec command to start a tty for a pod. The second is to use the docker exec command with a Docker container; this alternative is what we shall use. List the Docker containers with docker ps.

2678.deepakv_PostgreSQL_article_12

Obtain the container id for one of the Docker containers for the postgres Docker image and use the docker exec command to start a bash shell.

sudo docker exec -it <container id> bash

A bash shell gets started to connect to a Pod. Start a login shell for PostgreSQL database “postgres”.

su –l postgres

Start the PostgreSQL client terminal.

psql postgres

The postgres=# command prompt should get displayed.

0121.deepakv_PostgreSQL_article_13

Run SQL script to create a database table called wlslog and add data to the database.

CREATE TABLE wlslog(time_stamp VARCHAR(255) PRIMARY KEY,category VARCHAR(255),type VARCHAR(255),
servername VARCHAR(255),code VARCHAR(255),msg VARCHAR(255));

INSERT INTO wlslog VALUES('Apr-8-2014-7:06:16-PM-PDT','Notice',
'WebLogicServer','AdminServer','BEA-000365','Server state changed to STANDBY');

INSERT INTO wlslog VALUES('Apr-8-2014-7:06:17-PM-PDT','Notice',
'WebLogicServer','AdminServer','BEA-000365','Server state changed to STARTING'); 

INSERT INTO wlslog VALUES('Apr-8-2014-7:06:22-PM-PDT','Notice',
'WebLogicServer','AdminServer','BEA-000360','Server started in RUNNING mode');

A database table gets created.

4745.deepakv_PostgreSQL_article_14

Subsequently select data from the database table.

SELECT * FROM wlslog;

The table data gets listed.

7450.deepakv_PostgreSQL_article_15

Quit the PostgreSQL client terminal and exit the bash shell.

5811.deepakv_PostgreSQL_article_16

Scaling the Pod Cluster

The Pod cluster may be scaled to increase/decrease the number of replicas. As an example, increase the replicas from 2 to 4 with the kubectl scalecommand.

kubectl scale rc postgres --replicas=4

Subsequently list the Pods.

kubectl get pods

Initially some of the Pods may not be ready and/or running. Run the preceding command again after a few seconds and all the Pods should be running.

7077.deepakv_PostgreSQL_article_17

The nodes the Pods run on may also be listed.

kubectl get pods -o wide

The nodes for the Pods also get listed.

1682.deepakv_PostgreSQL_article_18

Describe the service postgres again.

kubectl describe svc postgres

The service endpoints should have increase to four.

5315.deepakv_PostgreSQL_article_19

Deleting the Replication Controller

Delete the replication controller with the kubectl delete command.

kubectl delete rc postgres

Subsequently list the replication controllers.

kubectl get rc

Also describe the service.

kubectl describe svc postgres

The rc gets deleted and does not get listed subsequently. The svc does not list any service endpoints.

3757.deepakv_PostgreSQL_article_20

Stopping the Service

Stop the service “postgres” and subsequently list the services.

kubectl stop service postgres
kubectl get services

The service gets stopped and does not get listed subsequently.

1106.deepakv_PostgreSQL_article_21

Creating a Replication Controller and Service Imperatively

Next, we shall discuss creating a PostgreSQL database Pod cluster imperatively without configuration files. Create a replication controller called “postgres” from Docker image “postgres” with 2 replicas and port as 5432.

kubectl  run postgres --image=postgres --replicas=2 --port=5432

An rc gets created.

8547.deepakv_PostgreSQL_article_22

Subsequently list the replication controllers.

kubectl get rc

Also list the Pods.

kubectl get pods

Two Pods get listed.

1614.deepakv_PostgreSQL_article_23

Next, expose the rc as a service of type LoadBalancer on port 5432.

kubectl expose rc postgres --port=5432 --type=LoadBalancer

Subsequently list the services.

kubectl get services

A “postgres” service gets created and listed.

7043.deepakv_PostgreSQL_article_24

Describe the service to list two service endpoints.

0118.deepakv_PostgreSQL_article_25

Subsequently, a database table may be created in a PostgreSQL client terminal as discussed in the preceding sections.

Listing the Database Logs

The logs for a Pod may be listed with the kubectl logs command.

kubectl logs postgres-dtyms

The output from the command should show the database getting initialized.

3348.deepakv_PostgreSQL_article_26

If the database was also shut down the logs include the database shutdown.

6153.deepakv_PostgreSQL_article_27

In this tutorial we discussed using PostgreSQL database with Kubernetes. Kubernetes microservices architecture provides performance benefits to develop a scalable and reliable database service. Kubernetes also provides collaboration features such as multiple Pods collaborating to serve client requests and Helm charts for collaboration on Kubernetes resources.

How Does Kubernetes Improve Performance?

How Does Kubernetes Improve Collaboration?

Setting the Environment

Configuring the Replication Controller and Service Declaratively

Creating a PostgreSQL Service Declaratively

Creating a PostgreSQL Replication Controller Declaratively

Creating a PostgreSQL Database Table

Scaling the Pod Cluster

Deleting the Replication Controller

Stopping the Service

Creating a Replication Controller and Service Imperatively

Listing the Database Logs

About the Author

Deepak Vohra

About the Author

Start the discussion at forums.toadworld.com