Docker

Explanation of the architecture of Docker and how to create a running container of your app and your PostgreSQL database.

Mental picture: with Docker you put your application in an environment that you can control extensively. To create this environment, you either create an image (write a Dockerfile, then build it using ‘build’) or you use an existing image. ‘Environment’ in this context means the filesystem, which is the directory structure plus all its contents, as seen from inside the container.

Docker architecture

Everything you need for Docker can be pulled in by installing Docker Desktop. It includes the Docker CLI, so you can run commands from a shell, and the so-called daemon, dockerd, which manages the containers. The two communicate through a REST API, by default over a UNIX socket, but you can configure the daemon to communicate over a network interface (less safe, but suitable if client and daemon live on different machines).

Another important element in Docker is the registry, where images are centrally stored. The default registry is Docker Hub but you can create your own, which you might typically do in an organization.

Furthermore there are Buildx and BuildKit, which are responsible for building images. Buildx is the client and is invoked when you use the docker build command on the CLI; BuildKit is the backend that receives instructions from Buildx and does the actual building.

The other relevant elements of Docker are Docker Compose, a tool to define and run multi-container applications, and Docker Swarm, an orchestration service. The latter is an alternative to Kubernetes.

Underlying Linux and other software

Docker relies on three Linux elements that predated Docker, namely chroot, cgroups and namespaces. The existence of these kernel features shows that the creation of isolated environments, where applications cannot see the underlying OS filesystem or its environment variables, was recognized as something important early on.

Apart from these Linux kernel features, Docker also relies on containerd as the basis for individual containers. Containerd is an open source tool that provides a layer of abstraction on top of the basic Linux kernel features and specializes in running containers. It gets its instructions from dockerd. Between containerd and the kernel features chroot, cgroups and namespaces lives another creature named runc. It is not part of Linux and is written in Go.

Both containerd and runc play a role in the cloud and in Kubernetes, and are highly relevant outside of the Docker context.

About images

A container is a process, an image is a file

While the term container suggests something similar to a file, or at least something that will exist even if the machine is switched off, it isn’t. A container is a process based upon an image, which actually is a (large) file. Once you have created a container on a machine you can start and stop it without rebuilding the image every time, but migrating it means running ‘docker run’ again on the other system.

Image as a layered construct

An image is built following a recipe found in a Dockerfile that you, or someone else, has composed. A Dockerfile contains a list of instructions, and every instruction that changes the filesystem affects the resulting environment. While as a user you are only interested in the final filesystem, the image actually stores the result of each of these instructions separately, as a set of changes on top of the previous one. Each such set is called a layer.

The rationale for layering is that layers can be reused, which saves time and space. Say that two images start from the same base and run the same install instruction, resulting in all sorts of new files and folders added to the environment; then the install only has to take place once. The second image can use the layer that is the result of the first installation. Reuse requires that everything up to and including that instruction is identical.

I asked ChatGPT what this means if you have instructions that delete content. Say I add an instruction that copies 100 MB of files to the environment/filesystem, and the next instruction deletes them. The final result is the same as if no instructions had taken place, but the image will still contain a layer containing all the copied files. The image will have those 100 MB stored as part of the first layer.

One consequence of this is that if you want to limit the size of the final image and thus want to clean up during the build process, you cannot delete the mess you made in some instruction (“copy all sorts of temporary installation files to my environment”) in a next instruction (“now delete the garbage”). The workaround is in the RUN instruction, which allows you to chain commands with &&. As both the mess making and the cleaning happen within the same instruction, no residue will persist in the final image.
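The difference can be sketched with a small shell script that only generates two Dockerfiles so they can be compared side by side. The base image, the installed package (curl) and the file names are assumptions chosen for illustration, not taken from a real project.

```shell
#!/bin/sh
# Write two illustrative Dockerfiles into a scratch directory.
# Base image, package and file names are assumptions for this sketch.
workdir=$(mktemp -d)

# Variant A: cleanup in a separate instruction. The package lists are
# already stored in the layer created by the first RUN.
cat > "$workdir/Dockerfile.separate" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
EOF

# Variant B: install and cleanup chained with && in one RUN, so the
# temporary files never end up in any layer.
cat > "$workdir/Dockerfile.chained" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update \
    && apt-get install -y curl \
    && rm -rf /var/lib/apt/lists/*
EOF

# Count the RUN instructions in each variant.
runs_separate=$(grep -c '^RUN' "$workdir/Dockerfile.separate")
runs_chained=$(grep -c '^RUN' "$workdir/Dockerfile.chained")
echo "separate: $runs_separate RUN instructions, chained: $runs_chained"
```

If you build both variants (docker build -f Dockerfile.chained -t demo:chained . and likewise for the other file), docker history on each image should show where the extra megabytes sit.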

Creating an image with Dockerfile

To create an image you first create a dockerfile. Name it Dockerfile, without an extension. By doing so you can run the docker build command in the same folder and it will pick up the instructions in this file.

A Dockerfile is an ordered list of instructions. Every instruction adds something to (or possibly removes something from) the image and thus the filesystem in the container. There are 18 different instruction types; the Docker docs list and explain them here. A more practically oriented manual, also from the website, is found here.

Of the 18 instructions, only FROM <image> is strictly required to build an image. To get a runnable container you also need either CMD <command> or ENTRYPOINT <command>, in your own Dockerfile or inherited from the base image. Below is a full listing. The order is alphabetical, but I changed it slightly to keep instructions with high similarity together:

ADD

Adds local or remote files and directories to the container filesystem. The form is one of these two, the latter to be used when there is whitespace in the paths/filenames:

ADD [OPTIONS] <src> <dest>

ADD [OPTIONS] ["<src>", "<dest>"]

The choice between these two forms exists for other instruction types as well. Instead of copying one source to one destination you can also copy multiple sources to the same destination in one line. The last argument must always be the destination.

If the destination must be a directory, the path must end with a slash (/), otherwise Docker treats it as a file. This rule is not applied to the source argument; there, trailing slashes are disregarded.

The last argument must be a directory when you have multiple source arguments (which can be files or directories). The following lines are correct:

ADD ["file1.txt", "file2.txt", "/usr/src/datastuff/"]

ADD file.txt /home/geert/copytocontainer/ /usr/src/datastuff/

It is also possible to add a source from a remote location. The documentation shows these valid examples:

ADD https://example.com/archive.zip /usr/src/things/
ADD git@github.com:user/repo.git /usr/src/things/

If the source is a local tar archive in a recognized compression format (uncompressed, gzip, bzip2 or xz), it is decompressed and extracted to the specified location. Other archive types, such as zip, are copied as-is. There are some more ins and outs, and you can use some options. The link directly to the ADD reference is here.

COPY

COPY is very similar to ADD. General rule: for local files/folders use COPY, for remote content use ADD. Docker discusses the differences here and here you find the COPY reference.

ARG

This instruction defines a variable that users can pass at build-time, using the --build-arg <varname>=<value> flag. You can provide a default value so a user can omit the argument, and one ARG instruction can define multiple arguments. It is somewhat similar to the arguments passed to the main method in Java. The form is this:

ARG <name>[=<default value>] [<name>[=<default value>]...]

Note that if you have multiple arguments and you want defaults for some, you have to start from the end with default values.

To keep it simple, you can write an ARG instruction for every distinct argument. The values of the arguments are available to subsequent instructions, you can for example use them in a RUN instruction.

Arguments passed by ARG are ephemeral, unlike ENV variables they are not stored in the image and thus not available after the build. More details in the reference.

ENV

In the Dockerfile you can set environment variables that are then available to the running container. The values can be hardcoded in the Dockerfile but you can also write the code so that they can be passed as argument in the build process, using ARG. Example:

FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER=${CONT_IMG_VER:-v1.0.0}
RUN echo $CONT_IMG_VER

The snippet above lets the user set CONT_IMG_VER. If no argument is passed, the environment variable is set to the default (v1.0.0; the :- in front of it is the ‘use default if unset or empty’ operator, not part of the value). This can be overridden by adding the following flag to the build command:

--build-arg CONT_IMG_VER=v2.0.0
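The ${VAR:-default} notation in the ENV line follows the same rules as in a POSIX shell, so its behaviour can be tried out without Docker. The sketch below is only an imitation: resolve_version is a hypothetical helper, and its argument stands in for the value that --build-arg would supply.

```shell
#!/bin/sh
# Mimic how ENV CONT_IMG_VER=${CONT_IMG_VER:-v1.0.0} resolves.
# resolve_version is a hypothetical helper for this sketch.
resolve_version() {
    # $1 plays the role of the value passed via --build-arg (may be empty)
    CONT_IMG_VER=$1
    echo "${CONT_IMG_VER:-v1.0.0}"
}

resolve_version ""        # no build argument given: falls back to v1.0.0
resolve_version "v2.0.0"  # build argument given: v2.0.0 wins
```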

Be aware that setting environment variables in the build process is different from setting them in the run stage. An example below shows how the official PostgreSQL image requires the user who runs (not builds) the container to set values for specific environment variables using the -e flag. More on ENV here.

CMD

The CMD instruction sets the command to be executed when running a container from an image. There has been a lot of misunderstanding about the differences between CMD and ENTRYPOINT, see here for a Stack Overflow discussion.

One rule always applies: an image needs at least one of them, either in your Dockerfile or inherited from the base image. If you have both, it is ENTRYPOINT that specifies the command to be executed, whereby CMD provides the default arguments to that command.
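As a rough illustration of that rule, here is a toy model in shell of how the final command line is put together, assuming both instructions are written in the exec form (the JSON-array notation). final_command is a hypothetical helper, and the ping example is made up; Docker does this assembly internally.

```shell
#!/bin/sh
# Toy model of how the container command is assembled when both
# instructions use the exec form. final_command is a hypothetical helper.
#   $1 = ENTRYPOINT, $2 = CMD, $3 = arguments given after the image name
final_command() {
    entrypoint=$1
    cmd=$2
    run_args=$3
    if [ -n "$run_args" ]; then
        # Arguments on the docker run command line replace CMD
        echo "$entrypoint $run_args"
    else
        # Without extra arguments, CMD supplies the defaults
        echo "$entrypoint $cmd"
    fi
}

final_command "ping -c 3" "localhost" ""
final_command "ping -c 3" "localhost" "example.com"
```

The first call models docker run myimage, the second docker run myimage example.com: the entrypoint stays, only the arguments change.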

To make things extra confusing, it is possible to specify CMD instructions using the shell form or the exec form. The difference is explained in the reference.

ENTRYPOINT

If you want the same command to be executed every time the container is run, use ENTRYPOINT. Like CMD, it has the exec form, described as the preferred form, and the shell form. See reference.

About the shell form: the shell form means that a command is executed via /bin/sh, which is a very basic shell in Linux. The command takes a form like the one below, which amounts to: start the shell, type the following line and press Enter (-c is the ‘command’ flag).

/bin/sh -c exec_entry p1_entry
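The practical consequence of going through /bin/sh can be tried outside Docker. In the sketch below, the first command imitates the shell form (comparable to CMD echo $GREETING) and the second imitates the exec form (comparable to CMD ["echo", "$GREETING"]). The variable name GREETING is an assumption for this example.

```shell
#!/bin/sh
# Compare "shell form" and "exec form" behaviour outside Docker.
GREETING=hello
export GREETING

# Shell form: the command line is handed to /bin/sh -c, which expands $GREETING.
shell_form=$(/bin/sh -c 'echo $GREETING')

# Exec form: the program receives its arguments as-is, no shell is involved,
# so the text $GREETING is passed through literally.
exec_form=$(env echo '$GREETING')

echo "shell form: $shell_form"
echo "exec form:  $exec_form"
```

So if a command relies on variable expansion, pipes or &&, it needs a shell; with the exec form you would have to start one explicitly.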

EXPOSE

See here. Specifies the port or ports at which the container listens during runtime. You can specify if it needs to be TCP or UDP.

But according to the reference, the EXPOSE instruction doesn’t actually publish the port. It functions as a type of documentation between the person who builds the image and the person who runs the container, about which ports are intended to be published. To publish the port when running the container, use the -p flag on docker run to publish and map one or more ports, or the -P flag to publish all exposed ports and map them to high-order ports.

So in the end, what you specify in the run command is decisive.

FROM

The FROM instruction initializes a new build stage and sets the base image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction. The image can be any valid image.

There can be multiple FROM instructions in a Dockerfile; here the concept of a stage is relevant. Each FROM starts a new stage, and a later stage can copy files out of an earlier one. More here.

Normally FROM must be the first instruction but it can be preceded by ARG. Details on this in the reference.
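A typical use of multiple FROM instructions is a two-stage build: one stage with the full toolchain to compile, and a second, smaller stage that only receives the result. The script below just generates such a Dockerfile. The image tags, the stage name build and especially the jar path /src/target/app.jar are assumptions; a real Maven project will produce a differently named jar.

```shell
#!/bin/sh
# Generate an illustrative two-stage Dockerfile.
# Image tags, paths and the stage name "build" are assumptions.
stage_dir=$(mktemp -d)

cat > "$stage_dir/Dockerfile" <<'EOF'
# Stage 1: JDK plus Maven, used only to compile and package
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /src
COPY . .
RUN mvn -q package -DskipTests

# Stage 2: runtime-only image that receives just the packaged jar
FROM eclipse-temurin:21-jre
COPY --from=build /src/target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
EOF

# Each FROM starts a new stage
stage_count=$(grep -c '^FROM' "$stage_dir/Dockerfile")
echo "stages: $stage_count"
```

Only the last stage ends up in the final image, so Maven and the source code are left behind.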

HEALTHCHECK

HEALTHCHECK allows you to periodically run some command inside the container (let’s say a simple command that checks if a certain website responds in time). The exit code of the command indicates the status, whereby 0 is healthy and 1 is unhealthy.

When the status changes, a health_status event is generated. With docker inspect you can query this status.
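A minimal sketch of what this could look like, again as a script that only writes the Dockerfile. The nginx base image, the URL and the timing values are assumptions, and the check presumes that curl is available inside the image.

```shell
#!/bin/sh
# Generate an illustrative Dockerfile with a health check.
# The nginx base image, the URL and the timing values are assumptions.
health_dir=$(mktemp -d)

cat > "$health_dir/Dockerfile" <<'EOF'
FROM nginx:1.27
# Every 30s, fail the check if the page does not answer within 3s.
# Exit code 0 = healthy, 1 = unhealthy.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
EOF

# Show the first line of the generated HEALTHCHECK instruction
healthcheck_line=$(grep '^HEALTHCHECK' "$health_dir/Dockerfile")
echo "$healthcheck_line"
```

For a running container, docker inspect --format '{{.State.Health.Status}}' followed by the container name prints the current status (starting, healthy or unhealthy).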

LABEL

A label is a key-value pair (key=value or "key"="value"). You can use the instruction multiple times or add multiple key-value pairs in the same instruction. It is important to use double and not single quotes. No quotes is fine as well, but beware of spaces.

To see the labels use the inspect command:

docker image inspect --format='{{json .Config.Labels}}' myimage

You can use other formats as well but the option above shows only the labels.

MAINTAINER (deprecated)

Use LABEL instead. MAINTAINER used to set the author field of the generated image.

ONBUILD

The ONBUILD instruction adds to the image a trigger instruction to be executed at a later time, when the image is used as the base for another build. Multiple ONBUILD instructions can be included in a Dockerfile. They are executed as part of the FROM instruction in a build that uses the image as base.

Example from the reference:

ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src

RUN

Important instruction. Reference here. Can be used in shell form or exec form. Both forms:

# Shell form:
RUN [OPTIONS] <command> ...
# Exec form:
RUN [OPTIONS] [ "<command>", ... ]

The shell form is the most used form, it is easier to create multiline commands this way.

Because of the importance of this command, here the option list:

Option	Description
--device	Allows build to request CDI devices to be available to the build step. ?
--mount	Allows you to create filesystem mounts that the build can access.
--network	Allows control over which networking environment the command is run in.
--security	Controls whether the command runs in the default sandbox or in insecure mode with elevated privileges.

The interesting one is --mount. This is what ChatGPT says:

“With RUN --mount, you can temporarily mount something into the build container during that RUN step only. It doesn’t persist into the resulting image unless you explicitly copy data from it.”

This makes it a good tool to pass data or secrets that you only need during the build process. It prevents leaks of sensitive data (passwords, ssh keys) and it prevents burdening the image with data that is not required after the build. Also, it can speed up the build (there is some ‘cache’ option specially for this purpose).
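As a hedged sketch of the secrets use case: the script below writes a stand-in secret and a Dockerfile that reads it through a secret mount, and prints the matching build command without running it. The secret id api_token, the file names and the wc command are assumptions for illustration only.

```shell
#!/bin/sh
# Generate an illustrative Dockerfile that reads a build secret.
# The secret id "api_token" and the file names are assumptions.
secret_dir=$(mktemp -d)

# A stand-in secret on the host; it is never copied into the image
printf 'not-a-real-token' > "$secret_dir/token.txt"

cat > "$secret_dir/Dockerfile" <<'EOF'
FROM alpine:3.19
# The secret is mounted at /run/secrets/api_token for this RUN step only
RUN --mount=type=secret,id=api_token \
    wc -c < /run/secrets/api_token > /token_length.txt
EOF

# Build command that supplies the secret (printed, not executed here)
build_cmd="docker build --secret id=api_token,src=$secret_dir/token.txt $secret_dir"
echo "$build_cmd"
```

The resulting image would contain /token_length.txt but not the token itself, because the mount disappears once the RUN step has finished.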

SHELL

According to the reference, “the SHELL instruction allows the default shell used for the shell form of commands to be overridden. The default shell on Linux is ["/bin/sh", "-c"], and on Windows is ["cmd", "/S", "/C"]. The SHELL instruction must be written in JSON form in a Dockerfile.”

STOPSIGNAL

“The STOPSIGNAL instruction sets the system call signal that will be sent to the container to exit. This signal can be a signal name in the format SIG<NAME>, for instance SIGKILL, or an unsigned number that matches a position in the kernel’s syscall table, for instance 9. The default is SIGTERM if not defined.” See [here](https://docs.docker.com/reference/dockerfile/#stopsignal).

USER

“The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use as the default user and group for the remainder of the current stage. The specified user is used for RUN instructions and at runtime, runs the relevant ENTRYPOINT and CMD commands.” Reference here.

Use it in the following ways:

USER <user>[:<group>]

USER UID[:GID]

VOLUME

When you use the run command to run an image you can supply the -v option to connect a directory on the host with a directory in the container. This allows you to permanently store data, for example in a database.

The VOLUME instruction sets a kind of default for this, so even when you do not use the -v option you won’t lose data. What Docker does is create an ‘anonymous’ directory on the host and a directory in the container. These two are connected, i.e. the host directory is mounted on the container directory.

The single argument you provide for VOLUME is the path of the directory inside the container. But Docker will create another directory on the host as well, with the form:

/var/lib/docker/volumes/<random-id>/_data

The <random-id> part makes this an ‘anonymous’ volume. If you add the -v option to your run command like this:

-v pgdata:/var/lib/postgresql/data

Then the data will end up on your host in:

/var/lib/docker/volumes/pgdata/_data

And if you provide this as -v option:

-v /home/user/mydb:/var/lib/postgresql/data

You will have your data on the host here:

/home/user/mydb

Important here is that the /var/lib/postgresql/data folder in the container is specifically defined by the official PostgreSQL Dockerfile. You can find literally this line in the Dockerfile:

VOLUME /var/lib/postgresql/data
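The three cases described above can be summarized in a small toy model. host_path is a hypothetical helper that only mirrors the rules from the text; the /var/lib/docker paths assume a native Linux host with Docker's default data directory (with Docker Desktop they live inside its virtual machine).

```shell
#!/bin/sh
# Toy model of where data lands on the host for the three cases.
# host_path is a hypothetical helper; Docker itself does this work.
#   $1 = part before the colon in -v <source>:<container-dir>,
#        or empty when only the image's VOLUME instruction applies
host_path() {
    case "$1" in
        "")  echo "/var/lib/docker/volumes/<random-id>/_data" ;;  # anonymous volume
        /*)  echo "$1" ;;                                         # bind mount
        *)   echo "/var/lib/docker/volumes/$1/_data" ;;           # named volume
    esac
}

host_path ""
host_path "pgdata"
host_path "/home/user/mydb"
```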

WORKDIR

“The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.”

See here.

ChatGPT provided this example:

FROM ubuntu:22.04
WORKDIR /app

# goes into /app/myfile.txt inside the image
COPY myfile.txt .
# runs inside /app
RUN ls

Basic commands

Docker can be fairly well understood when you know the following commands:

docker run
docker build

There are many more, you can find them here.

docker run

The run command presupposes the existence of an image. Any run command requires an image as argument, more specifically the last argument unless that image requires some arguments itself. This image can be an image from Docker Hub or one that you got from your colleague. The typical form is:

docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Take this example, in which a self-generated image is used (‘myapp’):

docker run -d -p 8080:8080 -v /home/user/data:/app/data myapp:latest

Or this one, that starts up a container with a PostgreSQL database:

docker run --name my-postgres \
  -e POSTGRES_USER=testuser \
  -e POSTGRES_PASSWORD=testpass \
  -e POSTGRES_DB=testdb \
  -v pgdata:/var/lib/postgresql/data \
  -p 5432:5432 \
  -d postgres:16

Note that both these run commands end with the name of an image, not followed by a command or arguments. Most often this is sufficient, but there are occasions where you want to provide a command with arguments and where the image accommodates this:

docker run --rm python:3.11 python -c "print(3 * 7)"  # runs Python and prints 21

docker run --rm postgres:17 postgres --version  # prints the postgres version

docker run --rm alpine:3.19 ls -l /bin  # prints a directory listing of /bin inside the container

Relevant option flags

A complete overview of flags can be found here:

flag	meaning
--name	Give the container a name
-d	Container will run in the background. Takes no value
-p	Publish a container’s port to the host. First number is the port on the host for external access, second is the port within the container
-v	Bind mount a volume. For persistence, connects a path or named volume on the host (before the colon) to a path within the container (after the colon)
-e	Set environment variable
--rm	Automatically remove container and its associated anonymous volumes when it exits

Data persistence

You want the -v flag when data must be safely stored. You cannot safely store it within the container, as the container is a process whose own filesystem is discarded when the container is removed. The binding of a directory inside the container to a directory outside of the container in the underlying OS means that the inner directory is a direct reference to the external directory on the host. They are the same.

Use of environment variables

Environment variables can be set, which is useful in combination with a Spring application.properties file that refers to specific environment variables for database credentials. In the case of the official PostgreSQL image, the maintainers of this ‘Docker Official Image’ have explicitly included a script in which the POSTGRES_USER, POSTGRES_DB and POSTGRES_PASSWORD environment variables are used. You must set them with the proper names, otherwise the image won’t work. It is written in their documentation; ChatGPT just advised me to read it.

docker build

Once you have a Dockerfile you can run a build command to create the corresponding image. This is the basic command:

docker build -t myname:sometag .

The dot at the end is the build context: the directory whose contents are available to the build. By default Docker expects to find the Dockerfile in that directory. The -t flag lets you set a name and a tag. See here for extra build options.

PostgreSQL on Linux

Bash scripting