7 ways to build slimmer/lighter (Linux) containers

The goal of this ticket is to present a few ways to obtain lighter container images. But why it’s so important to build and use lighter containers ?

Lighter containers means :

  • less disk space used to store the images
  • faster transfer (pull/push) of the images to/from the container registry,
  • faster build process of images and easier to update them (because it contains less components)
  • better security posture (less components, less vulnerabilities, smaller attack surface).

The hints that I will present could be sorted in two different categories: what to put into an image (to be lighter) and how to build an image (to be lighter).

What to put into your image

1. Use the lighter base image as possible

Choose the base image based of your needs of you application and try to use the minimal base image. If for example your application is Java based then choose as base image something like openjdk:19-slim-buster not a base image containing Java + other components that you don’t need. Following this approach is almost effortless but you will depend of the (base) image maintainer for any updates.

A better, but more difficult and more time consuming approach is to start from a bare minimal image like Alpine or Red Hat Universal Base Image 8 Minimal and install on top whatever components/packages you need. Following this approach will give you much more flexibility because you will be able to patch the needed components as the pace of their update; the drawback is that you have to spend some time creating the Dockerfile that builds the needed image.

2. Use multi-stage build

With multi-stage builds you can use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. For a very good explanation of this feature you can see the Docker documentation.

The example given in the Docker documentation is around compiling a Go application into a stage and just copy the desired artifacts into another stage that will be used in the final image.

To illustrate the multi-stage build I will use as example Java 9 and the jlink tool that generates a custom Java runtime image that contains only the platform modules that are required for a given application:

FROM openjdk:11.0.14-jdk AS initial_jdk

# build a custom JRE
RUN jlink --add-modules java.management,java.base,java.logging,java.naming,java.sql,java.xml \
 --output ./customJre/ --strip-debug --no-man-pages --no-header-files \
 --compress=2

# use as base image the ubi minimal
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

# copy the custom JRE into the final image
COPY --from=initial_jdk ./customJre /opt/java/openjdk

ENV JAVA_HOME=/opt/java/openjdk \
    PATH="/opt/java/openjdk/bin:$PATH"

3. Deactivate the package manager cache

Different package managers are copying the installed dependencies also in cache folders so it’s not needed to re-download a dependency if is necessary to be re-installed. Obviously, in the case of containers the cache feature should be deactivated or the cache folders should be deleted after the dependencies installation.

A few examples of package managers and how to deactivate or delete the cache:

  • pip cache purge – Remove all items from the cache.
  • dnf clean – Performs cleanup of temporary files kept for repositories. This includes any such data left behind from disabled or removed repositories as well as for different distribution release versions.
  • microndnf clean
  • yum clean – Same definition as dnf clean

Here is an example of a Dockerfile with and without the usage of the cache clean:

#No dnf Clean
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

RUN microdnf install fontconfig \
&& microdnf install libXtst
#With dnf clean
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

RUN microdnf install fontconfig \
&& microdnf install libXtst \
&& microdnf clean all

And here are the size of the two images:

The usage of deactivation of package manager cache should be combined with either hint number 4 (Minimize the number of RUN, COPY, ADD instructions) or hint number 5 (Use the squash flag of docker/podman build).

How to build a lighter image

This hints are around the container UnionFS (Union File System) and will explain how to create less or smaller image layers.

4. Minimize the number of RUN, COPY, ADD instructions

Only the instructions RUN, COPY, ADD create layers; each usage of one of this instructions will create a new layer into the final image. Minimizing the number of this instructions will minimize the number of image layers which will minimize the size of the final image.

Let’s use the following Dockerfile as (faulty) example:

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230
# call twice the RUN instruction
RUN microdnf install fontconfig 
RUN microdnf install libXtst

In this Dockerfile we called twice the RUN instruction; the image (having an id starting with 14e7) will have 4 layers:

docker inspect --format '{{join .RootFS.Layers "\n "}}' 14e7

sha256:44f62afd0479b4c2059f2a01b61a33a6e47b0a903b17a9fd65a8df8d4cfe806c
sha256:87cd41b1f9f880f62765bc510b9f241c5532cb919182ba453d87a28783b24d5b
sha256:acf320641a3c8165491b3b022d088ce7170820dbcaf31789db9b9b8a55568594
sha256:9c29e387846f1413e91046c9c194c9556ee4a66d993aa56a7ad7ecbe78304dbd

Now let’s minimize the number of RUN instructions; we will have a single RUN instruction containing multiple install commands:

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230
# call RUN only once
RUN microdnf install fontconfig && \ 
    microdnf install libXtst

The new image (having the id starting with d73) will have 3 layers:

docker inspect --format '{{join .RootFS.Layers "\n "}}' d73
sha256:44f62afd0479b4c2059f2a01b61a33a6e47b0a903b17a9fd65a8df8d4cfe806c
sha256:87cd41b1f9f880f62765bc510b9f241c5532cb919182ba453d87a28783b24d5b
sha256:5ad98570b3807cbd9dd51fd981e2c15d2fc7793061441ea395d3f332b722af35

5. Use the squash flag of docker/podman build

The squash flag is a flag of the docker build command which is still experimental that will squash newly built layers into a single new layer.

Podman build command also have a similar flag; Podman also have a squash-all flag that will squash all of the new image’s layers (including those inherited from a base image) into a single new layer.

6. Use .dockerignore to filter the content of Docker build context

The .dockerignore file is used to filter the content that will be used by the Docker build context to create an image.

The goal of this feature is just to have a faster build process ( because less files will be present in the build context) but it can help also in the case when accidentally the Dockerfile defines more files than needed.

7. Use external tools

I have to admit using external tools to obtain a slimmer image should not be the default or preferred solution especially after docker and podman implemented the squash flags. But if is not possible to use the existing solutions then here are some free tools that you could try:

  • jwilder/docker-squash
    • docker-squash is a utility to squash multiple docker layers into one in order to create an image with fewer and smaller layers.
    • it looks very similar to the docker build and podman build squash flags
    • project looks not active anymore
  • goldmann/docker-squash
    • can squash last n layers from an image
    • can squash from a selected layer to the end
    • project looks still active
  • docker-slim/docker-slim
    • docker-slim try to figure it out what files are useful from the target image by running a container of the target image.
    • docker-slim is capable to run static or dynamic analysis; it also capable to probe the running container using http requests.
    • docker-slim contains also a linter for Dockerfiles; Running the linter on the “No dnf Clean” Dockerfile from the hint nr3 (Deactivate the package manager cache) give the following results:
docker-slim lint

Conclusion

As you could see there are a few ways to create lighter images; some of the hints are “low hanging fruits” and can be applied systematically, like the usage of the squash flag (hint nr. 5) and the minimization of RUN commands (hint nr. 4); some others demand a little bit of thinking and try and error, like the usage of the right base image (hint nr. 1) or the usage of multi-stage builds (hint nr. 2).

OSGI: How to handle the (wrong) bundles startup order

Context

In some situations it is needed that some of the OSGI bundles are started in a specific order.

A concrete case is when the Apache Camel is used in the context of the OSGI. If one of the OSGI bundles are an user defined Apache Camel component and another bundle uses this user defined Camel component into a (Camel) route then the bundle containing the Camel component should be started before the bundle that is using the Camel component.

Solution – modify the bundle/s start level

The first solution would be to modify the start level of the bundle that you want to start later.

Apache Felix offers the “bundlelevel” command:

bundlelevel - set bundle start level or initial bundle start level
scope.
flags:
-i, --setinitial set the initial bundle start level
-s, --setlevel set the bundle's start level

So something like:

 bundlelevel -s newStartLevel bundleId

will do the trick.

The advantage  of this solution is that you do not need any programming skills to do it and you can apply it on any bundle (even on the bundles that you are not controlling the content).

The drawback of this solution is that is totally manual (at least in case of Apache Felix server).

The OSGI specification also  defines the OSGI Start Level API which provides the following functions:

  • Controls the beginning start level of the OSGi Framework.
  • Is used to modify the active start level of the Framework.
  • Can be used to assign a specific start level to a bundle.
  • Can set the initial start level for newly installed bundles.

Using the OSGI Start Level API it is possible to programmatically set the start level:

Bundle bundle = framework.getBundleContext().installBundle(location);
BundleStartLevel bundleStartLevel = bundle.adapt(BundleStartLevel.class);
bundleStartLevel.setStartLevel(xxx);

Solution – use a BundleListener

The basic idea is that the bundle B (than needs the bundle A to be active) will wait until the the bundle A is marked as started. This can be achieved by implementing a BundleListener to the level of bundle B.

The implementation of the “bundleChanged” method of the listener will look like this:

public void bundleChanged(BundleEvent bundleEvent) {
    String symbolicName = bundleEvent.getBundle().getSymbolicName();
    int eventType = bundleEvent.getType();

    if ("The Bundle A Symbolic Name".equals(symbolicName)
        && BundleEvent.STARTED == eventType) {
        //here we know that bundle A is started
        //so can do something that will need
        //bundle A
     }
}

The advantage of this approach is that the bundle developer is in control of the behavior. On the the other side this approach will not work if you do not own the code of the bundle that you want to start later.

How to programmatically set-up a (HTTP) proxy for a Selenium test

Context

In the context of a (Java) Selenium test it was needed to set-up a http proxy at the level of the browser. What I wanted to achieve it was exactly what is shown in the next picture but programmatically. In this specific case the proxy was BurpPro proxy but the same workflow can be applied for any kind of (http) proxy.

Solution

I know this is not really rocket science but I didn’t found elsewhere any clear explanation about how to do it. In my code the proxy url is injected via a (Java) system property called “proxy.url“.

And the  code looks like this:

String proxyUrl = System.getProperty("proxy.url");
if (proxyUrl != null) {
    Proxy proxy = new Proxy();
    proxy.setHttpProxy(proxyUrl);

    FirefoxOptions options = new FirefoxOptions();
    options.setProxy(proxy);
    
    driver = new FirefoxDriver(options);
} else {
    driver = new FirefoxDriver();
}

How to intercept and modify Java stacktraces

This ticket was triggered by a “simple” requirement: “Change all the package names in the logs of a Java application (especially the stacktraces) from ‘abc.efg’ (put here whatever you want as name) to ‘hij.klm’ (put here also whatever you want as name) “. The first idea that popped in my mind was to change the packages names at the code level, but this was not feasible because of (rather) big codebase, the use of the (Java) reflexion and the tight timeline.

In the following lines, I will discuss possible solutions to implement this (weird) requirement.

 

Extend the log4j ThrowableRenderer

If the project is using log4j1x as log library, then a solution would be to create your own throwable renderer by extending the org.apache.log4j.spi.ThrowableRenderer. The (log4j) renderers are used to render instances of java.lang.Throwable (exceptions and errors) into a string representation.

The custom renderer that replaces the packages starting with “org.github.cituadrian” by “xxx.yyy” will look like this:

package org.github.cituadrian.stacktraceinterceptor.log4j;

import org.apache.log4j.DefaultThrowableRenderer;
import org.apache.log4j.spi.ThrowableRenderer;

public class CustomThrowableRenderer implements ThrowableRenderer {
    private final DefaultThrowableRenderer defaultRenderer =  
                   new DefaultThrowableRenderer(); 
    
    @Override 
    public String[] doRender(Throwable t) {
      String[] initialResult = defaultRenderer.doRender(t); 
      for (int i = 0; i < initialResult.length; i++) { 
        String line = initialResult[i]; 
        if (line.contains("org.github.cituadrian")) { 
           initialResult[i] = line.replaceAll("org.github.cituadrian", "xxx.yyy"); 
        } 
       } 
      return initialResult; 
    }
}

Basically, the custom renderer is delegating the task of creating a String from a Throwable to a DefaultThrowableRenderer and then it checks and replace the desired package names.

In order to be used, the renderer should be defined in the log4j.xml file:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration debug="true"
    xmlns:log4j='http://jakarta.apache.org/log4j/'>
 
  <throwableRenderer class= 
     "org.github.cituadrian.stacktraceinterceptor.log4j.CustomThrowableRenderer"/>
...

Use a log4j2 pattern layout

If your project is using log4j2 as logging library, then you can use a (log4j2) layout pattern.  The layout pattern will look like:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
 <Appenders>
 <Console name="STDOUT" target="SYSTEM_OUT">
 <PatternLayout pattern=
  "%replace{%class %log %msg %ex}{org\.github\.cituadrian}{xxx\.yyy}"/>
 </Console>
...

 

Modify (a.k.a. Weaving) the java.lang.StackTraceElement class with AOP

Before even explaining what it really means, I have to warn you that weaving JDK classes is rarely necessary (and usually a bad idea) even if it’s possible using an AOP framework like AspectJ.

For this case I used the AspectJ as AOP framwork because the weaver (aop compiler) is able to do binary weaving, meaning the weaver takes classes and aspects in .class form and weaves them together to produce binary-compatible .class files that run in any Java VM. The command line to obtain a weaved jar is the following one:

ajc -inpath rt.jar Aspect.java -outjar weavedrt.jar

In the case of weaving JDK classes one extra step is necessary in order to make the application work; we must create a new version of the rt.jar file  or create just a small JAR file with the JDK woven classes which then must be appended to the boot-classpath of the JDK/JRE when firing up the target application. The command line to execute the target application is the following one:

java -Xbootclasspath/<path to weavedrt.jar>;<path to aspectjrt.jar> TargetApplication

If you don’t want to worry about all the technical details of weaving and executing the application and you are using Maven then you can use the (marvelous) SO_AJ_MavenWeaveJDK project from gitHub (that handles everything using Maven)

The aspect that will modify the stacktrace packages looks like:

package org.github.cituadrian.stacktraceinterceptor.app;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
@Aspectpublic class StackTraceInterceptorAspect {
    @Pointcut("execution(String java.lang.StackTraceElement.getClassName()) "
            + "&& !within(StackTraceInterceptorAspect)")     
    public void executeStackTraceElementGetClassNamePointcut() {}        
    
    @Around("executeStackTraceElementGetClassNamePointcut()")    
    public Object executeStackTraceElementGetClassNameAdvice(    
                   final ProceedingJoinPoint pjp) throws Throwable {        
        Object initialResponse =  pjp.proceed();         
        if (initialResponse instanceof String 
               && ((String)initialResponse).startsWith("org.github.cituadrian")) {     
                 return ((String)initialResponse).replaceFirst("org.github.cituadrian", "xxx.zzz"); 
        }        
        return initialResponse;    
    }
 }

In a nutshell, the StackTraceInterceptorAspect will intercept all the calls to the java.lang.StackTraceElement#getClassName method and it will change the returned result of the method if the class name contains the string “org.github.cituadrian”.

If you are interested to learn more about AspectJ I really recommend you to buy a copy of the AspectJ in action (second edition) book.

 

Modify and shadow the java.lang.StackTraceElement class

 Using AOP just to intercept and modify a single method of a single class is a little bit over-killing. In this case there is another solution; the solution would be create a custom version of the java.lang.StackTraceElement class and add this custom class in the boot-classpath of the JDK/JRE when firing up the target application, so the initial version will be shadowed by the custom version.

An implementation of StacktraceElement class can be found here. So you can modify by hand the java.lang.StackTraceElement#getClassName method or the java.lang.StackTraceElement#toString method.

 To execute the target application, you must create a jar with the modified class and add it into the boot-classpath (something similar to the AspectJ solution):

java -Xbootclasspath/<path to custom class.jar> TargetApplication

 

 

How to remotely connect to an in-memory HSQLDB database

hsqldbVery often in-memory instances of HSQLDB are used in the context of unit tests; the unit test starts a database instance (eventually on a random port), provision the database with some data, run the test against the database end then stop it.

Now, from time to time you need to debug the unit tests and sometimes you also need to run manually some queries on the in-memory database (using HSQLDB manager or any other software).

So here are the steps in order to be able to connect to an in-memory HSQLDB instance:

  • Start the DB in the “remote open db” mode.This is driven by the “server.remote_open” property that is false by default (you can look to the code of org.hsqldb.server.ServerProperties class to see other properties that might be interesting). The code that starts a dataabse instacne in remote open mode will look something like:
HsqlProperties props = new HsqlProperties();
props.setProperty("server.remote_open", true);
....add more properties here
Server server = new Server();
server.setProperties(props);
server.start();
  • Connect to the data base instance. Now that the database has started the next step is to connect to data base instance. Because we started an in-memory instance the connection url is a memory database url that looks like jdbc:hsqldb:mem:instanceName You will not be able to connect using this url because neither the host and the port are available, instead you should use a server database url that looks like jdbc:hsqldb:hsql://host:port/instanceName