Containerization with Docker


INTRODUCTION TO DOCKER

If you have been following cloud trends, you have probably heard of Docker. It is an open source platform, originally built on top of LXC (Linux Containers), for packaging an application and its dependencies into a container that can be deployed and replaced easily.

Containerization in Docker is achieved via resource control (cgroups), kernel namespaces (isolating the application’s view of the OS, process trees, etc.) and a union-capable file system (such as aufs, which mounts multiple directories into one that appears to contain their combined contents).
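
To make these kernel features a little more concrete, here is a minimal sketch that prints the namespaces and cgroups a process belongs to. It assumes a Linux host with /proc mounted, and the function name show_isolation is purely illustrative (it is not part of Docker). Running it on the host and again inside a container should show different namespace IDs.

```shell
# Print the namespace and cgroup membership of a process (Linux only).
# show_isolation is a hypothetical helper name, not a Docker command.
show_isolation() {
    pid="${1:-self}"
    if [ ! -d "/proc/$pid/ns" ]; then
        echo "no namespace information for pid $pid (is /proc mounted?)"
        return 0
    fi
    echo "namespaces of pid $pid:"
    # Each entry is a symlink; processes that share a namespace
    # point at the same identifier.
    ls -l "/proc/$pid/ns"
    if [ -r "/proc/$pid/cgroup" ]; then
        echo "cgroups of pid $pid:"
        cat "/proc/$pid/cgroup"
    fi
}
```

Call it as show_isolation for the current process, or show_isolation 1234 for another PID you are allowed to inspect.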


Using containers removes the overhead of having to create, deploy and maintain full VMs for running your applications. It also lets you keep your PROD, Staging, QA and DEV environments identical. In some cases you can even move a container from one server to another, which makes it easy to spin up a quick instance of your PROD environment on a separate server for a test without touching the actual PROD environment.

 

With Docker you can package an application together with all the dependencies it needs and run it on any Linux server, independent of the host’s distribution. All the container needs is to communicate with the host’s Linux kernel via the libcontainer library. For example, your application container can run Debian packages and dependencies, your DB can run in a CentOS container, and your host can run SuSE Linux.

Docker uses the libcontainer library to interface with the Host’s Linux Kernel and utilize its resources in a predictable way.


Official definition of libcontainer by Docker: “libcontainer provides a native Go implementation for creating containers with namespaces, cgroups, capabilities, and filesystem access controls. It allows you to manage the lifecycle of the container performing additional operations after the container is created.”

Containers @ Google: Google is one of the world’s biggest users of Linux containers. According to Google, it launches over 2 billion containers every week to run its services, Search included, which works out to roughly 3,300 containers every second. Containers are one of the reasons for the speed and reliability of Google Search worldwide.

Container orchestration is a very important part of the Docker and DevOps world. There are many orchestration frameworks to choose from: Kubernetes, Swarm, Mesos, CoreOS fleet, etc.

Kubernetes Project (http://kubernetes.io/): Kubernetes is a container orchestration system, originally designed by Google, then open sourced and donated to the Cloud Native Computing Foundation. It is based on the solution used by Google for automating deployment, operations and scaling of containerized applications.


Update - June 2016:
The newest Docker version, 1.12, provides its own built-in container orchestration functionality called swarm mode. It works by appointing a manager node that schedules tasks on the other nodes, which in turn take the role of workers that receive tasks from the manager (such as running ad-hoc containers on demand).
Creating a swarm is as simple as this:
docker swarm init
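
What usually follows the init step is joining workers and scheduling a service. The sketch below wraps those steps in small functions; the function names, the IP address and the service name are placeholders of mine, while the docker subcommands and flags are the ones shipped with swarm mode in 1.12.

```shell
# Run on the manager node: initialise the swarm, advertising
# the address that workers will use to reach it.
init_manager() {
    docker swarm init --advertise-addr "$1"
}

# Run on the manager: print the token workers need in order to join.
worker_token() {
    docker swarm join-token -q worker
}

# Run on each worker: join the swarm using the token and the
# manager address (2377 is the default swarm management port).
join_worker() {
    docker swarm join --token "$1" "$2:2377"
}

# Run on the manager: schedule N replicas of an image across the nodes.
create_service() {
    docker service create --name "$1" --replicas "$2" "$3"
}
```

For example: init_manager 192.0.2.10 on the manager, join_worker "<token>" 192.0.2.10 on each worker, then create_service web 3 nginx back on the manager.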


DOCKER SPECIFICS

AUFS – Union Filesystems
Union filesystems are lightweight and fast. They operate by creating layers. Aufs provides the following features to Docker:
  • Fast container startup times.
  • Efficient use of storage.
  • Efficient use of memory.
Aufs is a unification filesystem, which means it takes multiple directories on the Linux host, stacks them on top of each other, and provides a single unified view.

Within Docker, AUFS provides image layering. Each Docker image references a list of read-only layers that represent filesystem differences. Layers are stacked on top of each other to form the base for a container’s root filesystem. The Docker storage driver stacks the layers and provides a unified view of them. When you create a new container, you add a new writable layer on top of the underlying stack; every change made within the container (creating, modifying or deleting files and folders) is made in this layer.
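
The lookup order in a layered filesystem can be imitated with plain directories. The function below is only a toy model of my own (it is not how AUFS is implemented and it ignores deletions): it searches the layers from the top writable one down to the base and returns the first match, which is why a file in an upper layer hides the same file in a lower one.

```shell
# Toy model of a union mount: resolve a relative path by searching
# layer directories from the topmost (writable) layer down to the base.
# union_lookup is a hypothetical helper, not part of Docker or AUFS.
union_lookup() {
    rel="$1"
    shift
    for layer in "$@"; do            # layers are passed top-first
        if [ -e "$layer/$rel" ]; then
            echo "$layer/$rel"       # the highest layer containing the file wins
            return 0
        fi
    done
    return 1                         # not present in any layer
}
```

With three directories rw, app and base, union_lookup etc/conf rw app base prints the copy from app if rw does not contain one, and falls back to base otherwise.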



IMAGES AND CONTAINERS

A Docker image is an ordered collection of root filesystem changes and the corresponding execution parameters for use within a container runtime.

To build an image you use a Dockerfile that contains a set of instructions that tell Docker how to build the image. For example:

  • Fetch a base image such as Debian, CentOS, Ubuntu, etc.
  • Install all of your application dependencies using yum or apt-get.
  • Copy or make some changes to config files.
  • Extract an archive of files.

Then you build your Docker image using the Dockerfile that you just created:
docker build -t user/repo_name:v2 .
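
As an illustration, the build steps listed above could translate into a Dockerfile along these lines. Treat it as a sketch under assumptions: the base image tag, the unzip package, app.conf and app.zip are all placeholders I made up, and write_dockerfile is just a helper that saves the file so it can be built with the command above.

```shell
# Write a sample Dockerfile into the given directory (default: current one).
# Image tag, package and file names are placeholders for illustration only.
write_dockerfile() {
    dir="${1:-.}"
    cat > "$dir/Dockerfile" <<'EOF'
# 1. Fetch a base image
FROM debian:stable

# 2. Install application dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        unzip \
    && rm -rf /var/lib/apt/lists/*

# 3. Copy a configuration file into the image
COPY app.conf /etc/app/app.conf

# 4. Copy and extract an archive of application files
COPY app.zip /tmp/app.zip
RUN unzip /tmp/app.zip -d /opt/app && rm /tmp/app.zip
EOF
}
```

Each instruction in the Dockerfile produces one image layer, so ordering the rarely changing steps first lets Docker reuse cached layers on rebuilds.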

A Docker container is a runtime instance of a Docker image.
To run a Docker container from your image:
docker run -d --name "test_container" -v /host/src/dir:/opt/dir -p 80:8080 user/repo_name:v2
(-d to run detached; --name to name your container; -v to mount the host directory /host/src/dir inside the container as /opt/dir; -p to map host port 80 to container port 8080; and finally the image to run the container from)

Container start times are usually very fast, averaging around 500 ms.

A container consists of
  • A Docker image
  • Execution environment
  • A standard set of instructions
The main difference between containers and images is the top writable layer that is added when a container is run from an image. All changes (additions, deletions, modifications) are stored in this writable layer. When the container is deleted, the writable layer is deleted with it (the underlying image remains unchanged). Therefore application data is usually stored in a data volume mounted from the host.
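
A quick way to see the writable layer in action is docker diff, which lists what a container has changed relative to its image. The sketch below is only an illustration: the container name scratch_demo and the debian:stable image are arbitrary choices of mine.

```shell
# Create a file inside a container, list the writable layer's changes,
# then remove the container (and with it the writable layer).
# The default name "scratch_demo" and the image are arbitrary.
writable_layer_demo() {
    name="${1:-scratch_demo}"
    # The new file lands in the container's writable layer.
    docker run --name "$name" debian:stable touch /tmp/hello
    # Lists paths added or changed relative to the image.
    docker diff "$name"
    # Deletes the container together with its writable layer;
    # the debian:stable image itself is left untouched.
    docker rm "$name"
}
```

Starting a fresh container from the same image afterwards will not contain /tmp/hello, because the change never became part of the image.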


DOCKER MOUNTED DATA VOLUMES

Data volumes are designed to make data persistent, independent of the container’s life cycle.

A data volume is a specially designated directory within one or more containers that bypasses the union filesystem. Data volumes are persistent even if the container is removed. They can be shared between multiple containers on the same host.




Data volumes are created with the --volume or -v flag in a docker run command. For example:

docker run -v /host/src/dir:/opt/dir user/repo_name:v2
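
To check that the data really outlives the container, you can write a file from one short-lived container and read it back from another. This is a minimal sketch: the function name, the note.txt file and the debian:stable image are assumptions of mine, and the host directory must be given as an absolute path.

```shell
# Demonstrate that data written to a mounted host directory survives
# the container that wrote it. $1 must be an absolute host path.
volume_persistence_demo() {
    host_dir="$1"
    # First container writes a file into the mount and is removed on exit (--rm).
    docker run --rm -v "$host_dir:/opt/dir" debian:stable \
        sh -c 'echo "written by container 1" > /opt/dir/note.txt'
    # A second, unrelated container reads the same file through the same mount.
    docker run --rm -v "$host_dir:/opt/dir" debian:stable cat /opt/dir/note.txt
}
```

After both containers are gone, note.txt is still present in the host directory.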


DEVOPS WITH DOCKER – USE CASE EXAMPLE

Our example use case was aimed at solving some workflow issues and improving on the following DevOps processes:

  • Maintaining application environments, dependencies and software patches on multiple machines separately, and reducing the occasional differences between Development, Testing and Production environments.
  • Giving QA teams the ability to test on an exact replica of Production (and to spin up new instances of it very quickly when that has to be done multiple times).
  • Version control at both the application and environment configuration level.
  • Better security through isolation of each application from the other components on the host. We could run multiple applications with their associated databases without them seeing each other, and link the applications to a DB without having to expose any ports on the host machine.
  • Reducing the cost of having to run multiple VMs for each application.
  • The ability to completely automate the build and deployment of the whole application stack, reducing the operations workload required.
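
The security point about linking an application to its DB without exposing ports can be sketched as follows. The container names and the user/db_image:v1 image are hypothetical. The --link flag was the mechanism available at the time of writing; newer Docker versions favour user-defined networks for the same purpose.

```shell
# Start a database container without publishing any port on the host,
# then link the application container to it.
# Container names and "user/db_image:v1" are hypothetical placeholders.
run_linked_stack() {
    # No -p here: the DB is not reachable from outside the host.
    docker run -d --name app_db user/db_image:v1
    # --link makes app_db reachable from the app under the alias "db";
    # only the application's own port is published on the host.
    docker run -d --name app --link app_db:db -p 80:8080 user/repo_name:v2
}
```
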
The majority of our implementation was made with the following tools:

  • Docker
  • Git - for version control of our Dockerfiles and App/DB configuration
  • Bamboo - Continuous Integration tool – for build and deployment of Docker images
  • Vagrant - for deploying local container instances of previously built images
  • Docker Trusted Registry - for having our own locally hosted registry of Docker images
Please don't pay attention to how rough this diagram looks :)



All Dockerfiles were created in cooperation between the Development and Operations teams to take into account environment, application and OS requirements. The Dockerfiles were developed using Git for versioning and collaboration between the two teams.

The continuous integration was implemented in Bamboo. Each build plan performs a git checkout of the branch with the Dockerfile, conf files and other installation scripts, builds the Docker image and pushes it into the private registry (Docker Trusted Registry).
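
Stripped of the Bamboo specifics, a build plan like this boils down to three steps. The function below is a rough sketch of that flow, not our actual plan: the function name, the repository URL and the registry host are placeholders you would replace with your own.

```shell
# Checkout, build and push: the three steps of a build plan.
# The repository URL, branch and image name are supplied by the caller;
# e.g. an image name like registry.example.com/team/app:42 (a placeholder).
build_and_push() {
    repo_url="$1"     # Git repository holding the Dockerfile and config files
    branch="$2"       # branch to build
    image="$3"        # full image name including the private registry host
    workdir=$(mktemp -d) || return 1
    # Shallow clone of just the requested branch.
    git clone --branch "$branch" --depth 1 "$repo_url" "$workdir" || return 1
    # Build the image from the checked-out Dockerfile.
    docker build -t "$image" "$workdir" || return 1
    # Push to the registry named in the image's prefix.
    docker push "$image" || return 1
    rm -rf "$workdir"
}
```

The CI agent needs to be logged in to the registry (docker login) before the push step will succeed.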

There is a logical separation of base and application Docker images for each application, to decrease the amount of maintenance and build time needed for each component. The base image contains the essential environment components; it is built once initially and rebuilt only when changes to the environment itself are required (e.g. changing configuration parameters, application server version, etc.). The application image fetches the base image, builds the application-specific configuration and dependencies on top of it, and then pushes the built image into the private registry.
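
The split between base and application images might look roughly like the two Dockerfiles below. Everything specific in them is an assumption for the sake of the example: the Java runtime, the registry host registry.example.com, the image names and the app.conf and app.jar files are all made up.

```shell
# Write a pair of sample Dockerfiles: one for the base image and one
# for the application image that builds on it. All names are placeholders.
write_layered_dockerfiles() {
    dir="${1:-.}"
    mkdir -p "$dir/base" "$dir/app"

    # Base image: environment components that rarely change.
    cat > "$dir/base/Dockerfile" <<'EOF'
FROM debian:stable
RUN apt-get update && apt-get install -y --no-install-recommends \
        default-jre-headless \
    && rm -rf /var/lib/apt/lists/*
EOF

    # Application image: starts from the base image in the private registry
    # and adds only the application-specific files.
    cat > "$dir/app/Dockerfile" <<'EOF'
FROM registry.example.com/team/base:1
COPY app.conf /etc/app/app.conf
COPY app.jar /opt/app/app.jar
CMD ["java", "-jar", "/opt/app/app.jar"]
EOF
}
```

Because the application Dockerfile only adds a few small layers on top of the base, the frequent application builds stay short while the slow environment setup is paid for once.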

The deployment process is performed in the CI tool by executing a docker run on the destination host with the specific container runtime configuration (environment variables, data volumes, open ports, links with the DB container, etc.). Before starting the new containers, the deployment stops the currently running App and DB containers and removes them together with their associated images. After deployment is complete and the containers are up, DB replication is configured automatically with the other DB containers running the same application database.
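
The stop, remove and run sequence could be scripted along these lines. This is a simplified sketch rather than our actual deployment plan: the deploy function is hypothetical, it assumes the same image tag is reused between deployments, and it leaves out the DB replication step.

```shell
# Replace a running container with one started from a fresh copy of its image.
# Usage: deploy NAME IMAGE [extra docker run options...]
deploy() {
    name="$1"      # container name, e.g. app
    image="$2"     # image to deploy
    shift 2        # anything left is passed through to docker run
    # Stop and remove the previous container; errors are ignored so the
    # first deployment (when nothing is running yet) still works.
    docker stop "$name" 2>/dev/null
    docker rm "$name" 2>/dev/null
    # Drop the local copy of the image so docker run pulls the new build.
    docker rmi "$image" 2>/dev/null
    # Start the new container with the runtime configuration supplied.
    docker run -d --name "$name" "$@" "$image"
}
```

For example: deploy app registry.example.com/team/app:qa -p 80:8080 -e APP_ENV=qa, where the registry host, tag and environment variable are again placeholders.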

This process allowed for the complete automation of all environments with different build and deployment plans for Production, Staging, QA, Dev.

With the help of Front-End web servers and having multiple instances of our PROD environments, we were able to achieve no downtime during deployment.

Additionally, Vagrant was used locally by the Development and QA teams to automate the creation and destruction of new instances of the applications, using exactly the same images as in Production/Staging.