5 Universal Methods For Simplifying Docker Images

Slimming down Docker images saves storage space and bandwidth, and it also reduces security risk. There are many ways to reduce the size of an image, and some of them depend on the programming language the service is written in. This article introduces several general methods that apply regardless of language.

Why the Size of Docker Images Matters

A Docker image consists of a stack of layers (up to 127). Layers rely on a series of underlying technologies, such as filesystems, copy-on-write, and union mounts; see the Docker community documentation to learn more. In general, each instruction in a Dockerfile adds an image layer, and every layer that adds or changes files increases the size of the overall image. (In current Docker versions only RUN, COPY, and ADD create layers that add size; the other instructions only record metadata.)
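To see how layers add up in a concrete image, the following minimal sketch wraps two standard Docker CLI calls. The function names count_layers and layer_sizes are hypothetical helpers introduced for this example, and a working Docker installation is assumed.

```shell
# count_layers IMAGE: print the number of filesystem layers in IMAGE.
# (Hypothetical helper name; it only wraps `docker image inspect`.)
count_layers() {
  docker image inspect --format '{{len .RootFS.Layers}}' "$1"
}

# layer_sizes IMAGE: list every build step with the size it contributed.
# (Hypothetical helper name; it only wraps `docker history`.)
layer_sizes() {
  docker history --format '{{.Size}}  {{.CreatedBy}}' "$1"
}
```

For example, count_layers alpine:latest prints the layer count of a locally available alpine image, and layer_sizes shows which instructions are responsible for most of the size.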

Reducing the size of Docker images brings the following benefits:

  1. reduce build time
  2. reduce disk usage
  3. reduce download time
  4. enhance security due to fewer files included
  5. increase deployment speed

Five Suggestions for Reducing the Size of Docker Images

1. Optimize The Base Image

The simplest optimization is to choose a suitable, smaller base image. Ubuntu, CentOS, Debian, and Alpine are all commonly used Linux base images, and Alpine is the recommended choice among them. Their sizes compare as follows:

lynzabo@ubuntu ~/s> docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
ubuntu              latest              74f8760a2a8b        8 days ago          82.4MB
alpine              latest              11cd0b38bc3c        2 weeks ago         4.41MB
centos              7                   49f7960eb7e4        7 weeks ago         200MB
debian              latest              3bbb526d2608        8 days ago          101MB
lynzabo@ubuntu ~/s>

Alpine is a lightweight Linux distribution that is highly streamlined yet still includes the basic tools. Its base image is only 4.41MB. Most popular languages and frameworks offer a base image built on Alpine, so it is highly recommended.

In the comparison above, the smallest image is 4.41MB. Can an image be even smaller? Yes: for example, the image gcr.io/google_containers/pause-amd64:3.1 is only 742KB. To see why it is so small, let's first look at two minimal base images:

1.1 scratch images

scratch is an empty image that can only be used as a base for building other images. If you want to run a binary that contains all of its dependencies, such as a Golang program, you can use scratch as the base image. Here is the Dockerfile of the Google pause image mentioned above:

FROM scratch
ARG ARCH
ADD bin/pause-${ARCH} /pause
ENTRYPOINT ["/pause"]

The Google pause image uses scratch as its base image. Since scratch itself takes up no space, an image built on it is almost exactly as large as the binary it contains, which is why the image is so small. The same approach works for our own Golang programs. Some Golang/C programs depend on dynamic libraries. In that case you can use a tool such as ldd or linuxdeployqt to identify all of the required dynamic libraries, and then package the binary together with those libraries into the image.
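The library-collection step can be sketched with ldd alone. The script below is a minimal illustration that assumes a glibc or musl Linux host with ldd available; BINARY and STAGE are hypothetical names, and /bin/ls merely stands in for your own program.

```shell
# Stage a dynamically linked binary together with the shared libraries
# that ldd reports, so the whole tree can be copied into a scratch image.
# BINARY and STAGE are hypothetical names chosen for this example.
BINARY="${BINARY:-/bin/ls}"
STAGE="${STAGE:-$(mktemp -d)}"

# Copy the binary itself, keeping its path below the staging root.
mkdir -p "$STAGE$(dirname "$BINARY")"
cp "$BINARY" "$STAGE$BINARY"

# ldd prints lines such as "libc.so.6 => /lib/.../libc.so.6 (0x...)".
# Keep every absolute path and copy the file it points to.
ldd "$BINARY" | grep -o '/[^ ]*' | while read -r lib; do
  mkdir -p "$STAGE$(dirname "$lib")"
  cp -L "$lib" "$STAGE$lib"
done

echo "Staged files under $STAGE:"
find "$STAGE" -type f
```

The staging directory can then be added to a scratch-based image with a single COPY instruction. Only run ldd on binaries you trust.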

1.2 busybox images

scratch is an empty image. If you want the image to include some common Linux tools, the busybox image is a good choice. It is only 1.16MB, which makes it a convenient base for small images.

2. Chain Dockerfile Instructions

If a Dockerfile uses too many RUN instructions, the image ends up with a large number of layers, which bloats it and can even hit the maximum number of layers (127). Following the Dockerfile best practices, we should therefore chain multiple commands into a single RUN instruction (using the && operator and the \ line continuation). Design each RUN carefully so that it cleans up after its own installation and build steps; this reduces the image size and makes the best use of the build cache.

Here is a Dockerfile before being optimized:

FROM ubuntu

ENV VER     3.0.0
ENV TARBALL http://download.redis.io/releases/redis-$VER.tar.gz
# ==> Install curl and helper tools...
RUN apt-get update
RUN apt-get install -y curl make gcc
# ==> Download, compile, and install...
RUN curl -L $TARBALL | tar zxv
WORKDIR redis-$VER
RUN make
RUN make install
# ...
# ==> Clean up...
WORKDIR /
RUN apt-get remove -y --auto-remove curl make gcc
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/* /redis-$VER
# ...
CMD ["redis-server"]

Build it as an image named test/test:0.1.

Now let's optimize it. The optimized Dockerfile is:

FROM ubuntu

ENV VER     3.0.0
ENV TARBALL http://download.redis.io/releases/redis-$VER.tar.gz

RUN echo "==> Install curl and helper tools..." && \
    apt-get update && \
    apt-get install -y curl make gcc && \
    echo "==> Download, compile, and install..." && \
    curl -L $TARBALL | tar zxv && \
    cd redis-$VER && \
    make && \
    make install && \
    echo "==> Clean up..." && \
    apt-get remove -y --auto-remove curl make gcc && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /redis-$VER
# ...
CMD ["redis-server"]

Build it as an image named test/test:0.2.

Compare the sizes of the two images:

root@k8s-master:/tmp/iops# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
test/test           0.2                 58468c0222ed        2 minutes ago       98.1MB
test/test           0.1                 e496cf7243f2        6 minutes ago       307MB
root@k8s-master:/tmp/iops#

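As you can see, the image built by chaining the commands into one RUN instruction is about one-third the size of the image built with a separate RUN instruction per command. In test/test:0.1 the cleanup runs in later layers, so it cannot shrink the earlier layers that already contain the build tools and the source code.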

Note: To cope with images that have many layers, Docker 1.13 and later can squash an image, which merges the layers produced by the Dockerfile into a single layer. The feature is still experimental and is not enabled by default. To use it, start the Docker daemon with the --experimental option and pass --squash to docker build. We do not recommend this approach; it is better to follow the best practices when writing the Dockerfile than to compress the image this way.

3. Use Multi-stage Builds

Each instruction in a Dockerfile adds a layer to the image, so you have to clean up everything you no longer need before moving on to the next layer. In practice this often meant maintaining two Dockerfiles: one for development, which contains everything needed to build the application, and a slimmed-down one for production, which contains only the application and what it needs to run. Multi-stage builds, supported since Docker 17.05.0-ce, remove that need. With a multi-stage build you can use multiple FROM statements in one Dockerfile. Each FROM instruction can use a different base image, and you can selectively COPY artifacts from one stage to another, leaving only what is needed in the final image.

Here is a Dockerfile that uses COPY --from and FROM ... AS ...:

# Compile
FROM golang:1.9.0 AS builder
WORKDIR /go/src/v9.git...com/.../k8s-monitor
COPY . .
RUN make build
RUN mv k8s-monitor /root

# Package
# Use scratch image
FROM scratch
WORKDIR /root/
COPY --from=builder /root .
EXPOSE 8080
CMD ["/root/k8s-monitor"]

When you build the image, you will find that the result contains only the content specified by the COPY instruction in the final stage, and the image is only 2MB. Where I previously needed two Dockerfiles (one for development and a slimmed-down one for production), a single multi-stage build now does the job.
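A development image can still be produced from the same Dockerfile by stopping the build at a named stage. The sketch below wraps docker build and its --target option; the function names and the image tags in the usage note are hypothetical, and a Docker version with multi-stage support is assumed.

```shell
# build_stage STAGE TAG [CONTEXT]: build only up to the named stage,
# for example the "builder" stage that still contains the Go toolchain.
build_stage() {
  docker build --target "$1" -t "$2" "${3:-.}"
}

# build_final TAG [CONTEXT]: build all stages and tag the last one,
# which holds only the files copied in with COPY --from.
build_final() {
  docker build -t "$1" "${2:-.}"
}
```

For example, build_stage builder k8s-monitor:dev would give a full build environment for debugging, while build_final k8s-monitor:prod would give the small scratch-based image.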

4. Tips For Building Business Service Images

When Docker builds an image, it reuses the cached layer from the previous build if the instruction and the content it depends on have not changed. When building a business service image, keep the following points in mind:

  • Separate large dependencies that never or rarely change from frequently modified code;
  • The cache lives on the machine that runs the docker build command, so it is best to run your builds on one dedicated machine to take advantage of it.

The following example builds a Spring Boot application image to show how to split the layers. Other types of applications, such as Java WAR packages or Node.js npm modules, can be handled in a similar way.

4.1 Unzip the jar package generated by Maven into the directory where the Dockerfile is located.

$ unzip <path-to-app-jar>.jar -d app

4.2 Divide the content of the application into four parts and COPY them into the image separately: the first three rarely change, and the fourth is the code that changes frequently. The last line starts the Spring Boot application from the unzipped files.

FROM openjdk:8-jre-alpine

LABEL maintainer="opl-xws@xiaomi.com"
COPY app/BOOT-INF/lib/ /app/BOOT-INF/lib/
COPY app/org /app/org
COPY app/META-INF /app/META-INF
COPY app/BOOT-INF/classes /app/BOOT-INF/classes
EXPOSE 8080
CMD ["/usr/bin/java", "-cp", "/app", "org.springframework.boot.

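Because the first three COPY layers are usually served from the cache, only the small layer of application classes has to be rebuilt, which greatly speeds up the build.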
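The effect of this split can be illustrated without Docker. For COPY, Docker decides whether a cached layer is still valid by checksumming the copied files and ignoring modification times. The sketch below approximates that idea with sha256sum on a throwaway directory; layer_digest is a hypothetical helper and not Docker's actual algorithm, and the file names are placeholders.

```shell
# layer_digest DIR: print a checksum that changes only when the paths or
# contents of files under DIR change; timestamps are deliberately ignored.
# Hypothetical helper for illustration, not Docker's own algorithm.
layer_digest() {
  ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort | sha256sum | cut -d' ' -f1 )
}

# Throwaway directory that mimics the layout of the unzipped jar.
demo="$(mktemp -d)"
mkdir -p "$demo/app/BOOT-INF/lib" "$demo/app/BOOT-INF/classes"
echo "third-party dependency" > "$demo/app/BOOT-INF/lib/dep.jar"
echo "application code v1" > "$demo/app/BOOT-INF/classes/App.class"

lib_before="$(layer_digest "$demo/app/BOOT-INF/lib")"
classes_before="$(layer_digest "$demo/app/BOOT-INF/classes")"

# Simulate a code change: only the application classes are modified.
echo "application code v2" > "$demo/app/BOOT-INF/classes/App.class"

lib_after="$(layer_digest "$demo/app/BOOT-INF/lib")"
classes_after="$(layer_digest "$demo/app/BOOT-INF/classes")"

if [ "$lib_before" = "$lib_after" ]; then
  echo "BOOT-INF/lib: unchanged, the COPY layer can come from the cache"
fi
if [ "$classes_before" != "$classes_after" ]; then
  echo "BOOT-INF/classes: changed, the COPY layer is rebuilt"
fi
```

Since the dependency directory is usually by far the largest part of the jar, keeping it in its own early layer means a code change rebuilds only the small classes layer.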

5. Other Optimization Methods

5.1 Run Tools Such as apt, apk, or yum in the RUN Command

When you run tools such as apt, apk, or yum in a RUN instruction, you can use their options to reduce the number of image layers and the size of the image. Here are some examples:

(1) Add the --no-install-recommends option when running apt-get install -y so that recommended (non-essential) dependencies are not installed. With apk add, the --no-cache option keeps the package index cache out of the image;

(2) When executing yum install -y, you can install multiple tools at the same time, such as yum install -y gcc gcc-c++ make .... You can reduce the number of image layers by placing all yum install tasks in one RUN instruction.

(3) Install and clean up in the same instruction, such as apk --update add php7 && rm -rf /var/cache/apk/*, because each RUN instruction generates its own layer. If the apk add ... and rm -rf ... commands are in separate instructions, the cleanup cannot reduce the size of the layer created by apk add. On Ubuntu or Debian, use rm -rf /var/lib/apt/lists/* to remove cached files from the image; on systems such as CentOS, use yum clean all.
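The three points above can be put together in small Dockerfiles. The sketch below writes one example per package manager into a temporary directory; the file names and the installed packages (curl, ca-certificates) are assumptions chosen only for illustration.

```shell
# Write one example Dockerfile per package manager into a temp directory.
# File names and package choices are hypothetical.
ctx="$(mktemp -d)"

# Debian/Ubuntu: skip recommended packages and remove the apt lists
# in the same layer that created them.
cat > "$ctx/Dockerfile.apt" <<'EOF'
FROM ubuntu
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
EOF

# Alpine: --no-cache avoids leaving the package index in the image.
cat > "$ctx/Dockerfile.apk" <<'EOF'
FROM alpine
RUN apk add --no-cache curl
EOF

# CentOS: install everything and clean up in a single RUN instruction.
cat > "$ctx/Dockerfile.yum" <<'EOF'
FROM centos:7
RUN yum install -y gcc gcc-c++ make && \
    yum clean all
EOF

echo "Example Dockerfiles written to $ctx"
```

Each file can then be built with a command such as docker build -f "$ctx/Dockerfile.apk" -t example/apk-demo "$ctx", where the tag is again a placeholder.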

5.2 Compress Images

Docker's built-in export and import commands can also be used to compress an image.

$ docker run -d test/test:0.2
$ docker export 747dc0e72d13 | docker import - test/test:0.3

To use this method, you need to run a container from the image first (747dc0e72d13 above is its container ID). Some metadata of the original image is lost in the process, such as exposed ports, environment variables, and the default command.

Compare the history of the two images and you can see that all layer information is lost in test/test:0.3:
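Part of the lost metadata can be put back at import time with the --change option of docker import, which accepts Dockerfile instructions such as CMD, ENV, and EXPOSE. The following is a minimal sketch under that assumption; flatten_image is a hypothetical helper name, and it uses docker create so that the container never has to start.

```shell
# flatten_image SOURCE TARGET CMD_JSON: export a container created from
# SOURCE and import it as the single-layer image TARGET, restoring the
# default command. Hypothetical helper; CMD_JSON is a JSON array string.
flatten_image() {
  cid="$(docker create "$1")" || return 1
  docker export "$cid" | docker import --change "CMD $3" - "$2"
  status=$?
  # Remove the temporary container whether or not the import succeeded.
  docker rm "$cid" >/dev/null
  return "$status"
}
```

For the image built earlier, the call would look like flatten_image test/test:0.2 test/test:0.3 '["redis-server"]'. Environment variables and exposed ports would need additional --change options.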

root@k8s-master:/tmp/iops# docker history test/test:0.3
IMAGE               CREATED             CREATED BY          SIZE                COMMENT
6fb3f00b7a72        15 seconds ago                          84.7MB              Imported from -
root@k8s-master:/tmp/iops# docker history test/test:0.2
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
58468c0222ed        2 hours ago         /bin/sh -c #(nop)  CMD ["redis-server"]         0B       
1af7ffe3d163        2 hours ago         /bin/sh -c echo "==> Install curl and helper…   15.7MB   
8bac6e733d54        2 hours ago         /bin/sh -c #(nop)  ENV TARBALL=http://downlo…   0B       
793282f3ef7a        2 hours ago         /bin/sh -c #(nop)  ENV VER=3.0.0                0B       
74f8760a2a8b        8 days ago          /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B       
<missing>           8 days ago          /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B
<missing>           8 days ago          /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$…   2.76kB
<missing>           8 days ago          /bin/sh -c rm -rf /var/lib/apt/lists/*          0B
<missing>           8 days ago          /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B    
<missing>           8 days ago          /bin/sh -c #(nop) ADD file:5fabb77ea8d61e02d…   82.4MB   
root@k8s-master:/tmp/iops#

The community also offers many compression tools, such as docker-squash, which is simpler and more convenient to use and does not lose the metadata of the original image. Try it if you are interested.

Summary

The methods for slimming down Docker images are worth discussing and practicing in depth. I hope you take something away from this article. If you have better methods or advice, feel free to leave a comment.