Book Review: Antivirus Bypass Techniques

This is the review of Antivirus Bypass Techniques book.

(My) Conclusion

This book is a niche subject book, to be more precise it’s about a niche tool (antivirus) used in a niche domain (endpoint security) of cybersecurity.

As his name implies, it describes how antivirus products are working and different technique to evade this antivirus products.

The book it’is not very technical (compared with a programming book) but it implies some knowledge of Windows OS architecture, Assembler and Python.

If you are new to this subject (like myself) it’s very good introduction that will give you a rather technical glimpse of the mouse and cat “game” played between the antivirus developers and malware creators.

If you are already working in the endpoint security domain you (probably) already know all the techniques presented in the book.

1.Introduction to the Security Landscape

This chapter is exploring the following topics:

  • definition of different malware types:
    • Virus: A malware type that replicates itself in the system.
    • Worm: A type of malware whose purpose is to spread throughout a network and
      infect computers connected to that network.
    • Rootkit: A type of malware that is found in lower levels of the operating system that
      tend to be highly privileged.
    • Downloader: A type of malware whose function is to download and run from the
      internet some other malicious files.
    • Ransomware: A type of malware whose purpose is to encrypt computer files and
      demand financial ransom from the user before they can access their files.
    • Botnet: Botnet malware causes the user to be a small part of a large network of
      infected computers.
    • Backdoor: A type of malware whose purpose is  to leave open a “back door”, providing the attacker with ongoing access to the user’s computer.
    • PUP: An acronym that stands for potentially unwanted program, a name that
      includes malware whose purpose is to present undesirable content to the user, for
      instance, ads.
    • Dropper: A type of malware whose purpose is to “drop” a component of itself into
      the hard drive.
    • Scareware: A type of malware that presents false data about the computer it is
      installed on, so as to frighten the user into performing actions that could be
      malicious, such as installing fake antivirus software or even paying money for it.
    • Trojan: A type of malware that performs as if it were a legitimate, innocent
      application within the operating system.
    • Spyware: A type of malware whose purpose is to spy on the user and steal their
      information to sell it for financial gain.
  • definition of different protection system types:
    • EDR (Endpoint Detection and Response): The purpose of EDR systems is to protect the business user from malware
      attacks through real-time response to any type of event defined as malicious.
    • Firewall: A system for monitoring, blocking, and identification of network-based
      threats, based on a pre-defined policy.
    • IDS/IPS (Intrusion Detection and Protection System): IDS and IPS provide network-level security, based on generic signatures, which inspects network packets and searches for malicious patterns or malicious flow.
    • DLP (Data Loss Prevention): DLP’s sole purpose is to stop and report on sensitive data exfiltrated from the organization, whether on portable media (thumb drive/disk on key), email, uploading to a file server, or more.
  • the basics of an antivirus product. Most of the antivirus products have different types of engines:
    • static engine: Conducts comparisons of existing files within the operating system against a database of signatures, and in this way can identify malware.
    • dynamic engine: Is checking the files at runtime using API monitoring (the goal of API monitoring is to intercept API calls in the operating system and to detect the malicious ones) and sand-boxing (A sandbox is a virtual environment that is separated from the memory of the physical host computer. This allows the detection and analysis of malicious software by executing it within a virtual environment)
    • heuristic engine: This type of engine determines a score for each file by conducting a statistical analysis that combines the static and dynamic engine methodologies.
    • unpacker engine: Unpacking is the process of restoring the original malware code; the malicious code was “packed” in order to hide a malicious patterns and thus thwart
      signature-based detection. The unpacker engine is able to detect if a file contains a (known) unpacker code.

2.Before Research Begins

In order to evade the antivirus products you must have a good understanding about how the different antivirus program components are working. The authors are using different tools (on Windows OS only) that usually are used for malware analysis to discover the working mechanics of the AVG Antivirus.

The authors are using the following tools :

  • Process Explorer is a tool that will will provide us with a lot of relevant information about the processes that are running in the operating system like the file name of the processes, the percentage of the CPU resources for the processes, the amount of memory and RAM allocated to the processes. Using the Process Explorer it is possible for example to find the hook that are used by the antivirus software to conduct monitoring on every process that exists within the operating system. This hook is usually a DLL file that is injected into every process running within the operating system
  • Process Monitor is a tool that can be used to observe the behavior of each process in the operating system from the moment when are started until the moment are closed. Using the Process Monitor is possible for example to find the processes used by the antivirus software for specific tasks like scanning a specific file.
  • Autoruns is a tool that shows what programs are configured to run during system bootup or login, and when you start various built-in Windows applications. With Autoruns is possible to use filters to find for example all the antivirus software files that are loaded at the startup of the operating system.
  • Regshot is an open source tool that lets you take a snapshot of your registry, then compare two registry shots, before and after installing a program. In this case it is used to find all the registry changes that took place after installing the antivirus software

3.Antivirus Research Approaches

The authors are proposing two methods to bypass the  antivirus software:

  • Find and exploit a vulnerability in the antivirus software
  • Find and use a detection bypass method

This chapter gives a few details about the first method; basically it presents a few vulnerabilities on different antivirus software packages that had impact on the way the antivirus was functioning:

  • Insufficient permissions on the static signature file. The file containing static signature had insufficient permissions meaning that any low-privileged user could modify the content of the file.
  • Unquoted service path. When a service is created within the Windows operating system and the executable path contains spaces and the path is not enclosed within quotation marks, the service will be susceptible to an Unquoted Service Path vulnerability.
    To exploit this vulnerability, an executable file must be created in a particular location in the service’s executable path, and instead of starting up the antivirus service, the service we created previously will load first and cause the antivirus to not load during operating system startup
  • DLL hijaking. When software wants to load a particular DLL, it uses the LoadLibraryW() Windows API call. It passes as a parameter to this function the name of the DLL it wishes to load. It is not recommended to use the LoadLibrary() function, due to the fact that it is
    possible to replace the original DLL with another one that has the same name, and in that
    way to cause the program to run our DLL instead of the originally intended DLL.

If you are interested of other types of vulnerabilities linked to the antivirus products you can look into the CVE MITRE database using the keyword antivirus.

4.Bypassing the Dynamic Engine

As explained in the first chapter the dynamic engine is checking the runtime behavior of files using API monitoring and sand-boxing. The authors are presenting two types of techniques for bypassing the dynamic engine:

 Bypass using process Injection

Process injection goal is to inject a piece of code into the process memory address space of another process, give this memory address space
execution permissions, and then execute the injected code. The general steps of a process injection are:

  1. Identify a target process.
  2. Receive a handle for the targeted process to access its process address space.
  3. Allocate a virtual memory address space where the code will be injected and
    executed, and assign an execution flag if needed.
  4. Perform code injection into the allocated memory address space of the targeted
    process.
  5. Execute the injected code.

The authors are presenting three process injections techniques; there are a lot more techniques, for a non-exhaustive list you can check MITRE Pricess Injection Techniques :

  • DLL Injection DLL injection is commonly performed by writing the path to a DLL in the virtual address space of the target process before loading the DLL by invoking a new thread. The write can be performed with native Windows API calls such as VirtualAllocEx and WriteProcessMemory, then invoked with CreateRemoteThread (which calls the LoadLibrary API responsible for loading the DLL).
  • Process hollowing Process hollowing is commonly performed by creating a process in a suspended state then unmapping/hollowing its memory, which can then be replaced with malicious code.
  • Process doppelganging This technique is using the Windows Transactional NTFS (TxF) API. TxF was introduced in Vista as a method to perform safe file operations.To ensure data integrity, TxF enables only one transacted handle to write to a file at a given time. Until the write handle transaction is terminated, all other handles are isolated from the writer and may only read the committed version of the file that existed at the time the handle was opened. Adversaries may abuse TxF for replacing the memory of a legitimate process, enabling the veiled execution of malicious code.

Bypass using timing-based techniques

Timing based techniques are based on the fact that antivirus vendors prefer to scan about 100,000 files in 24 minutes, with a detection rate of about 70%, over scanning the same number of files in 24 hours, with a detection rate of around 95%.

The first technique will utilize Windows API calls that cause the delay of the malware functionality, so the dynamic engine will not be able to spot the malware because it is not executed in a timely manner. A basic technique to  implement this behavior is by using the sleep() function combined with the GetTickCount() function.

The usage of the sleep() function only can be detected by the antivirus static engine and then antivirus emulator (used by dynamic engine) will simulate the pass of the sleep time thus bypassing the malware defense mechanism. The  usage of GetTickCount() (which returns the amount of time the operating system has been up and running) will counter this (time forward) emulation because the malware will be able to detect it.

The second technique is named by the authors the memory bombing and take advantage of the limited time that antivirus software has to dedicate to each individual file during scanning.

The pseudo-code for this technique looks like:

int main(){
char *memory_bombing = NULL;

//Initialize the memory_bombing variable with a bunch of
//zeroes.
//At this point, the antivirus is struggling to scan the
//file and forfeits

memory_bombing = (char *) calloc(200000000, sizeof(char));

if(memory_bombing != NULL) {
//free the memory allocated to memory_bombing
free(memory_bombing);
payload();
}
return 0;
}

The logic behind this type of bypass technique relies on the dynamic antivirus engine
scanning for malicious code in newly spawned processes by allocating virtual memory so
that the executed process can be scanned for malicious code in a sandboxed environment.
The allocated memory is limited because antivirus engines do not want to impact the user
experience so if the antivirus engine have to allocate a large amount of memory the antivirus engines will not scan the file.

5.Bypassing the Static Engine

The static engine is using file signatures to spot malicious files so, a lot of antiv-viruses are embedding the YARA tool; the chapter contains also a small introduction to YARA templates.

There are three ways to by pass the static engine:

    • Code Obfuscation is the process of making applications difficult or impossible to de-compile or disassemble, and make the application code more difficult to parse. The code obfuscation could defeat the ARA templates which are looking for specific strings into the files.
    • Encryption In this case the malicious functionality of the malware will be encrypted and appear as a harmless piece of code , meaning the antivirus software will treat it as such and will allow the malware to successfully run on the system.
      But before malware starts to execute its malicious functionality, it needs to decrypt its
      code within runtime memory. Only after the malware decrypts itself will the code be
      ready to begin its malicious actions. There are different encryption techniques used by the malwares:
      • Oligomorphic code includes several decryptors that malware can use. Each time it runs on the system, it randomly chooses a different decryptor to decrypt itself.
      • Polymorphic code mostly uses a polymorphic engine that usually has two roles. The first role is choosing which decryptor to use, and the second role is loading the relevant source code so that the encrypted code will match the selected decryptor.
      • Metamorphic code is code whose goal is to change the content of malware each time it runs, thus causing itself to mutate.
    • Packing A packer is a tool used to mask a malicious file. In general, packers work by taking an EXE file and obfuscating and compressing the code section (“.text” section) using a predefined algorithm. Following this, packers add a region in the file referred to as a stub, whose purpose is to unpack the software or malware in the operating system’s runtime memory and transfer the execution to the original entry point (OEP). The OEP is the entry point that was originally defined as the start of program execution before packing took place.The authors are presenting how the UPX and AsPAck packers are packaging a file and how unpacker will have to work in order to detect the content of the original file.

6.Other Antivirus Bypass Techniques

This chapter presents other bypass techniques:

  • Binary patching: It consists in opening/executing a binary through a debugger (the x32dbg/x64dbg in the book example), changing (on the fly) some code and then re-generating a new binary using the “Patch File” functionality of the debugger. This technique would defeat the static engine.
  • Timestomping; It consists in changing some metadata of a binary file like the created date. This relies on the fact that the creation date could be used for computing static signatures of different files so changing the creation date could defeat a static engine.
  • Junk code. Technique very similar to the Code Obfuscation technique presented in the chapter 5 Bypassing the static engine. The Junk code technique could also add empty functions, or loading non-existing files that could confuse the dynamic engine.
  • PowerShell. It consists in executing a payload directly from powershell; The powershell binary being a trusted file then the dynamic engine might be bypassed. 
  • Single malicious functionality If the static and the dynamic engine are not able to decide if a file is malicious then the heuristic engine will try to compute a score for the scanned file. The heuristic engine have a detection threshold under which a scanned file will not be marked as malicious even if it contains some potentially malicious components. The goal for the malware developer/s is to find the maximum number of malicious actions that will be under the detection threshold of the heuristic engine. 

7.Antivirus Bypass Techniques in Red Team Operations

This chapter is for me rather badly named; it starts by explaining what are the responsibilities and the goals of a red team and how it use the techniques presented into this book in the context of pen tests.

But, the main part if the chapter presents how a malware can check what antivirus products are installed on the endpoints that it wants to attack in order to apply the right bypass techniques, action that the authors are calling the fingerprinting of the antivirus software.

Antivirus fingerprinting can be done based on identifiable constants, such as the following:  Service names (for example, WinDefend is for example the service name for Microsoft Defender), Process names (or example, AVGSvc.exe is the process name of the AVG antivirus), Domain names, Registry keys or Filesystem artifacts.

The authors are recommending the following GitHub repository ethereal-vx/Antivirus-Artifacts  to find more details about Antivirus fingerprinting

8.Best Practices and Recommendations

The last chapter of the book can be split in 2 parts. The first part presents some controls that the antivirus providers could implement in order to mitigate some (not all of them) of the bypass techniques presented in previous chapters.

To mitigate the DLL hijacking vulnerability (a DLL is loaded using his name; see chapter 3 for more explanations) a proper mechanism to validate the loaded DLL module should be implemented. This validation should use not only the DLL name but also by a certificate and a signature.

To mitigate the Unquoted service path vulnerability (an executable path contains spaces and the path is not enclosed within quotation marks; see chapter 3 for more explanation) the solution is simply to wrap quotation marks around the executable path of the service.

For improving the antivirus detection the authors are proposing to use “dynamic” YARA. The goal of the “dynamic” YARA is to scan for potentially malicious strings and code at the memory level, on a dumped memory snapshot. Normally YARA is used by the static engine for file signature but in the case of “dynamic” YARA the template engine is used to look at the memory where the malware has been already de-obfuscated, unpacked, and decrypted.

Another best practice consists in the usage of Antimalware Scan Interface (AMSI) by the application developers.Windows Antimalware Scan Interface (AMSI) is an API that allows custom applications and services to integrate with any antimalware product that’s present on a machine.

The second part of the chapter contains some secure coding recommendations which can be applicable in SDLC of any type of software: Do not use old code, Input validation (of the AntiVirus UI), Read and fix the compiler warnings, Automated code testing, Use integrity validation for the static signature files download.

How to upload (big) files to Jenkins job as build parameter

Context

Originally, Jenkins had a mechanism to upload files as build parameters but this mechanism was rather faulty (see JENKINS-27413 and JENKINS-29289 ).

A new mechanism was proposed (for Jenkins2 only) via the File Parameter plug-in. The plug-in offers the possibility to capture files as build parameters like this:

def fb64 = input message: 'upload', parameters:  [base64File('file')]
node {
    withEnv(["fb64=$fb64"]) {
        sh 'echo $fb64 | base64 -d'
    }
}

Problem

If you look closer to the File Parameter plug-in documentation it said that: “You can use Base64 parameters for uploading small files in the middle of the build”. What does it means “small files” in terms of size is not mentioned but if you try the previous example with files bigger than 2 kBytes then the job will fail with the following error:

java.io.IOException: error=7, Argument list too long
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
        at java.lang.ProcessImpl.start(ProcessImpl.java:134)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
...
Caused: java.io.IOException: Cannot run program "nohup" (in directory "/var/jenkins_cache/workspace/testproject"): error=7, Argument list too long

Solution

What is the root cause of this exception ? I’m not exactly sure but I think that  sh echo $fb64 | base64 -dcommand will transfer as environment variable the file to the Jenkins slave executing the job and something into this transfer mechanism is not very robust.

I propose two ways to workaround this problem:

Solution 1: Don’t send the uploaded file as environment variable to sh

Don’t send the uploaded file as environment variable to ‘sh‘ and write the file directly into the workspace:

withEnv(["fb64=$fb64"]) {
    script{
        def  decoded = new String(fb64.decodeBase64())
        writeFile file:"uploaded_file.txt", text: decoded
        sh 'cat ${WORKSPACE}/uploaded_file.txt'
    }

The drawback of this solution is that you’ll have to write the uploaded file somewhere into your workspace, so if you want to store it into another location then you’ll have to add some extra steps to the pipeline.

Solution 2: Don’t use withEnv pipeline step

The second solution is not using the withEnv pipeline step and just directly use the sh echo $fb64 | base64 -dcommand from a script step:

script{
         sh "set +x; echo '$fb64' | base64 -d > /tmp/uploaded_file.txt"
         cat '/tmp/uploaded_file.txt'
      }

Please note that I’m using the “set +x” before the echo command in order to inhibit the output of the command so the Jenkins console/log is not filled-in with base64 encoded characters. Also in this solution you have the freedom to chose the destination of the uploaded file.

Book Review: Secure by Design

This is the review of the Secure by Design  book.

(My) Conclusion

I would definitively add this book to the list of (software) security books that every software engineer should read (see “5 (software) security books that every (software) developer should read”) and I would put it on the first place. This book does not treat software security in a classic way but from  software design point of view. The main idea of the book is that a good software design will drastically improve the application security posture.

For me this book could be seen as an extension of the Domain-Driven Design: Tackling Complexity in the Heart of Software book but applied to software security. The main audience of the book is any software engineer and security professionals that are working with the development teams to help them to have a better security posture.

1: Why Design Matters for Security

The fist chapter explains why when developing software centered on design, security will become a natural part of the development process instead of being perceived as a forced requirement.

The traditional approach to software security have e few shortcomings; the user have to explicitly think about security and it have to be knowledgeable in different security topics. On the other side driving security through design can have the following advantages:

  • Software design is central to the interest and competence of most developers.
  • By focusing on design, business and security concerns gain equal priority in the view of both business experts and developers.
  • By choosing good design constructs, non-security experts are able to write secure code.
  • By focusing on good domain design, many security bugs are solved implicitly.

2: Intermission: The anti-Hamlet

This chapter (which is based on a real case) presents an example of how a flaw in designing a model of an bookstore e-shop application negatively impacted the business.

The mistake done in the model was to represent the quantity of items from a shopping card as an integer, so the users of the application could add negative numbers of items so at the end the customers could receive money from the bookstore.

3: Core concepts of Domain-Driven Design

The chapter starts with the definition of the Domain Driven Design (DDD) and describing what are the qualities of a domain model to be effective:

  • Be simple so you focus on the essentials.
  • Be strict so it can be a foundation for writing code.
  • Capture deep understanding to make the system truly useful and helpful.
  • Be the best choice from a pragmatic viewpoint.
  • Provide you with a language you can use when you talk about the system.

The main notions from DDD that can be beneficial in the context of a more robust model are:

Entities

Entities are objects representing a thread of continuity and identity, going through a lifecycle, though their attributes may change.

Entities are one type of model objects that have some distinct properties. What makes
an entity special is that:

  • It has an identity that defines it and makes it distinguishable from others.
  • It has an identity that’s consistent during its life cycle.
  • It can contain other objects, such as other entities or value objects (see further for a value object definition).
  • It’s responsible for the coordination of operations on the objects it owns.

Value Objects

Value objects are objects describing or computing some characteristics of a thing.The key characteristics of a value object are as follows:

  • It has no identity that defines it, but rather it’s defined by its value.
  • It’s immutable.
  • It should form a conceptual whole.
  • It can reference entities.
  • It explicitly defines and enforces important constraints.
  • It can be used as an attribute of entities and other value objects.
  • It can be short-lived.

Aggregates

An aggregate is a conceptual boundary used to group parts of the model together. The purpose of this grouping is to treat the aggregate as a unit. The key characteristics of a aggregates are:

  • Every aggregate has a boundary and a root.
  • The root is a single, specific entity contained in the aggregate.
  • The root is the only member of the aggregate that objects outside the boundary
    can hold references to.
  • Objects within the aggregate can hold references to other aggregates.

Bounding context

Multiple models are in play on a large project; it’s possible to have two or more models having the same concepts but with different semantics. In the case of different models, there is a need to define explicitly the scope of a particular model as a bounded part of a software system. A bounded context delimits the applicability of a particular model.

Data crossing a semantic boundary is of special interest from a security perspective because this is where the meaning of a concept could implicitly change.

4: Code constructs promoting security

Problems areas addressed and the proposed constructs:

Problem Section
Security problems involving data integrity and availability Immutable objects
Security problems involving illegal input and state Design by Contract
Security problems involving input validation (Input) Validation

Immutable objects

Immutable objects are safe to share between threads and open up high data availability which is an important aspect when protecting a system against denial of service attacks. Immutable object could protect against security problems involving availability of a system.

Mutable objects, on the other hand, are designed for change, which can lead to illegal updates and modifications. Immutable objects will enforce the integrity of the data of an application.

Design by Contract

Design By Contract (see Meyer, Bertrand: Applying “Design by Contract”) is an approach for designing software that uses preconditions and post-conditions to document (or programmatically assert) the change in state caused by a piece of a program. Thinking about design in terms of preconditions and contracts helps you clarify which part of a design takes on which responsibility.

Many security problems arise because one part of the system assumes another part takes responsibility for something when, in fact, that part assumes the opposite.

The authors are presenting some example of checking preconditions for method arguments and constructors. The goal is to fail if the contract is not met and the program is not using the classes in a way they were designed to be used. The program has lost control of what’s happening, and the safest thing to do is to stop as fast as possible.

(Input) Validation

In the case of input validation the authors are going through a framework that tries to separate the different kinds of (input) validation. The list presented also suggests a good order in which to do the different kinds of validation. Cheap operations like checking the length of data come early in the list, and more expensive operations that require calling the database come later. If one the steps is failing then the entire validation process must fail.

Different validation steps:

  • Origin – Is the data from a legitimate sender?
    • Origin checks can be done by checking the origin IP or requiring an access token
  • Size  – Is the size of the data in line with the context on which the data is used?
  • Lexical content  – Does it contain the right characters and encoding?
    • When checking the lexical content of data, the important part is the content not the structure so, the data is scanned to see that it contains the expected characters and the expected encoding.
  • Syntax – Is the format right?
  • Semantics – Does the data make sense from the business point of view?

5: Domain primitives

Problems areas addressed:

Problem Section
Security issues caused by inexact, error-prone, and
ambiguous code
Domain primitives
Security problems due to leakage of sensitive data Read-once objects

Domain primitives

Domain primitives are similar to value objects in Domain-Driven Design. Key difference is and they must be enforced at the point of creation. Also the usage of language primitives or generic types (including null ) are forbidden to represent concepts in the domain model because it could caused inexact, error-prone, and ambiguous code.

At the creation of the domain primitives the different validation steps could be applied as explained into the previous chapter; see (Input) Validation section of chapter 4: Code constructs promoting security

A typical example of a domain primitive is a quantity (see the example from the chapter 2: Intermission: The anti-Hamlet) that should not be defined as a primitive type (a float or an int) but as a distinguish type that will contains all the necessary logic for creation of valid (from the domain point of view) instances of quantity type.

For example in the context of a book shop a quantity which is negative or a not an integer greater is not valid from the business domain point of view.

Read-once objects

A read-once object is an object designed to be read once (or a limited number of times). This object usually represents a value or concept in your domain that’s considered to be sensitive (for example, passport numbers, credit card numbers, or passwords). The main purpose of the read-once object is to facilitate detection of unintentional use of the data it encapsulates.

Here’s a list of the key aspects of a read-once object:

  • Its main purpose is to facilitate detection of unintentional use.
  • It represents a sensitive value or concept.
  • It’s often a domain primitive.
  • Its value can be read once, and once only.
  • It prevents serialization of sensitive data.
  • It prevents sub-classing and extension.

6: Ensuring integrity of state

This chapter it’s about the integrity of the DDD entities objects.Entities contains the state that represents the business rules so it is important that a newly created entity follow the business rules.

The first goal is to have entities already consisted at the creation time. This can be done forcing the object creation through a constructor with all mandatory attributes and optional attributes set via method calls. This works very well for simple business rules; for more complex business rules the usage of the Builder pattern is advised.

The second goal is to keep the entities consistency after the creations time during the usage of the entities by other software components. The main idea is to share only final attributes (that cannot be changed), not share mutable objects and use immutable domain primitives.

In the case of attributes containing collections, should not expose a collection but rather expose a useful property of the collection (for example to add an item into a collection, add a method that receive as parameter the item to be added). Collection can be protected by exposing an non modifiable version (see Collections.unmodifiableCollection)

7: Reducing complexity of state

This chapter is extending the discussion from the previous chapter and it presents how to handle DDD entities objects that can have multiple states. For example an entity representing an order can have a few valid states like “paid”, “shipped”, “lost” or “delivered”. Keeping the state of entities controlled becomes hard when entities become complex, especially when there are lots of states with complex transitions between them.

The authors are proposing 3 patterns to handle the entities state complexity:

  • Entity state object
    • The proposal is to have entity state be explicitly designed and implemented as a class of its own. With this approach, the state object is used as a delegated helper object for the entity. Every call to the entity is first checked with the state object. This approach makes it easier to grasp what states the entity can have.
  • Entity Snapshot
    • The pattern consist of generating immutable objects called snapshots from the an entity. The clients will use the snapshots for the read only operations. For changing the state of the underlying entity, the clients will have to use a domain service to which they’ll have to send updates.
    • A drawback of this approach is that it violates some of the ideas of object orientation, especially the guideline to keep data and its accompanying behavior close together, preferably in the same class.
    • From the security point of view this pattern it improves the integrity because because the snapshot is immutable so there’s no risk at all of the representation mutating to a foul state.
  • Entity relay
    • This pattern is to be used in the case when the entity have a big number of possible states with a complex graph of changing states. The basic idea of entity relay is to split the entity’s lifespan into phases, and let each entity represent its own phase. When a phase is over, the entity goes away, and another kind of entity takes over—like a relay race.

8: Leveraging your delivery pipeline for security

The chapter treats different test strategies that could be applied in order to have a better security posture.

For the unit tests, the authors propose to divide the tests into:

  • normal testing – Verifies that the design accepts input that clearly passes the domain rules
  • boundary testing – Verifies that only structurally correct input is accepted. Examples of boundary checks are length, size, and quantity,
  • invalid input testing – Verifies that the design doesn’t break when invalid input is handled. Empty data structures, null, and strange characters are often considered invalid input.
  • extreme input testing – Verifies that the design doesn’t break when extreme input is handled. For example, such input might include a string of 40 million characters.

Other topics covered are :

  • testing of feature toggles that can cause security vulnerabilities. A good rule of thumb is to create a test for every existing toggle and should test all possible combinations using automated tests.
  • testing of the availability of the application by simulating DOS attacks.

9: Handling failures securely

The chapter treats different topics around handling failures and program exceptions.

It’s a good practice to separate business exceptions and technical exceptions. For business exception the best practice is to create exception having a business meaning.

As a practice to avoid, shouldn’t intermix technical and business exceptions using the same type and never include business data in technical exceptions, regardless of whether it’s sensitive or not.

Another interesting idea is to not handle business failures as exceptions. A failure should be modeled as a possible result of a performed operation in the same way a success is. By designing failures as unexceptional outcomes, it’s possible to avoid the problems that come from using exceptions including ambiguity between domain and technical exceptions, and inadvertently leaking sensitive information.

Resilience and responsiveness are attributes of a system that are improving the system availability. To achieve this attributes the authors are presenting 2 patterns:

  • circuit breaker pattern – Circuit Breaker allows graceful handling of failed remote services. It’s especially useful when all parts of an application are highly decoupled from each other, and failure of one component doesn’t mean the other parts will stop working.
  • bulkhead pattern – The Bulkhead pattern is a type of application design that is tolerant of failure. In a bulkhead architecture, elements of an application are isolated into pools so that if one fails, the others will continue to function.

10: Benefits of cloud thinking

This chapter is treating design concepts to be used for achieving a better security posture in the context of cloud deployments.

The most important concept it’s the “The three R’s of enterprise security“. The methodology of three Rs is: Rotate, Repave and Repair and it offers a simple approach towards greater security of cloud deployments.

The basic idea is to be proactive than be reactive as seen in traditional enterprise security. Speed is of essence. The longer a deployment stays in a given configuration, the greater is the opportunity for threats to exploit any vulnerabilities.

  • Rotate: Rotate secrets every few minutes or hours. Rotating secrets doesn’t improve the security of the secrets themselves, but it’s an
    effective way of reducing the time during which a leaked secret can be misused.
  • Repave: Repave servers and applications every few hours.Recreating all servers and containers and the applications running on them from a known good state every few hours is an effective way of making it hard for malicious software to spread through the system.
  • Repair: Repair vulnerable software as soon as possible after a patch is available. This goes for both operating systems and applications third party dependencies. The reason for repairing as often as you can is that for every new version of the software, something will have changed so an attacker constantly needs to find new ways to break it.

11: Intermission: An insurance policy for free

This chapter is very similar with the chapter 2, Intermission: The anti-Hamlet. It presents a real case (of an insurance company) that migrated a monolithic application to a micro-service application.

Due to this migration, the application was split into 2 different micro-services handled by 2 different teams. Having 2 independent teams handling different parts of the application and some functional changes in one of the micro-services will have as impact that the notion of Payment will have different meanings for the 2 micro-services. This miss-match will generate some subtle bugs even if none of the 2 systems were not broken.

12: Guidance in legacy code

This chapter is a kind of review of all the practices described in previous chapters that are applicable to legacy code.

It treats about the usage of domain primitives (see chapter 5 Domain primitives) to replace ambiguous parameters in APIs which are a common source of security bugs, the usage of read-once objects (see chapter 5 Domain primitives) which limits the number of times a sensitive values can be accessed allowing it to detect unintentional access, the usage of security tests that are testing look for invalid and extreme inputs (see chapter 8 Leveraging your delivery pipeline for security)

13: Guidance in micro-services

This chapter is very similar with the previous one but the context is the new approach of writing applications using micro-services.

Implementing security for a micro-service architecture is more difficult that in a case of a monolithic architecture because of the loose coupling of micro-services.

Splitting a monolithic application to different micro-services is rather a difficult task but a good design principle is to think of each service as a bounded context (see chapter 3 Core concepts of Domain-Driven Design for definition of bounded context).

Analyzing confidentiality, integrity, availability, and traceability across all services and data sensitivity is more difficult than in a case of classical architecture. The only way to treat this security topics in a complete way is to have a broader view of the entire applications and not only on a subset of the micro-services.

14: A final world: Don’t forget about security!

The entire book was talking about how to not think about security, but still getting a good security posture anyway. This chapter speaks about how important is to think and learn about the security anyway and it gives advises that could be found in more “classical” security books:

  • Should use code security reviews as a recurring part of secure development lifecycle (SDLC)
  • It is important to invest in tooling that provides quick access to information about security vulnerabilities across the technological entire stack.
  • Penetration tests should be done recurrently and the feedback from this tests should be used as an opportunity to improve the application design.
  • Having a team and processes to handle security incidents and the security incident mechanism should focus on learning to become more resistant to attacks.

 

.

7 ways to build slimmer/lighter (Linux) containers

The goal of this ticket is to present a few ways to obtain lighter container images. But why it’s so important to build and use lighter containers ?

Lighter containers means :

  • less disk space used to store the images
  • faster transfer (pull/push) of the images to/from the container registry,
  • faster build process of images and easier to update them (because it contains less components)
  • better security posture (less components, less vulnerabilities, smaller attack surface).

The hints that I will present could be sorted in two different categories: what to put into an image (to be lighter) and how to build an image (to be lighter).

What to put into your image

1. Use the lighter base image as possible

Choose the base image based of your needs of you application and try to use the minimal base image. If for example your application is Java based then choose as base image something like openjdk:19-slim-buster not a base image containing Java + other components that you don’t need. Following this approach is almost effortless but you will depend of the (base) image maintainer for any updates.

A better, but more difficult and more time consuming approach is to start from a bare minimal image like Alpine or Red Hat Universal Base Image 8 Minimal and install on top whatever components/packages you need. Following this approach will give you much more flexibility because you will be able to patch the needed components as the pace of their update; the drawback is that you have to spend some time creating the Dockerfile that builds the needed image.

2. Use multi-stage build

With multi-stage builds you can use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. For a very good explanation of this feature you can see the Docker documentation.

The example given in the Docker documentation is around compiling a Go application into a stage and just copy the desired artifacts into another stage that will be used in the final image.

To illustrate the multi-stage build I will use as example Java 9 and the jlink tool that generates a custom Java runtime image that contains only the platform modules that are required for a given application:

FROM openjdk:11.0.14-jdk AS initial_jdk

# build a custom JRE
RUN jlink --add-modules java.management,java.base,java.logging,java.naming,java.sql,java.xml \
 --output ./customJre/ --strip-debug --no-man-pages --no-header-files \
 --compress=2

# use as base image the ubi minimal
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

# copy the custom JRE into the final image
COPY --from=initial_jdk ./customJre /opt/java/openjdk

ENV JAVA_HOME=/opt/java/openjdk \
    PATH="/opt/java/openjdk/bin:$PATH"

3. Deactivate the package manager cache

Different package managers are copying the installed dependencies also in cache folders so it’s not needed to re-download a dependency if is necessary to be re-installed. Obviously, in the case of containers the cache feature should be deactivated or the cache folders should be deleted after the dependencies installation.

A few examples of package managers and how to deactivate or delete the cache:

  • pip cache purge – Remove all items from the cache.
  • dnf clean – Performs cleanup of temporary files kept for repositories. This includes any such data left behind from disabled or removed repositories as well as for different distribution release versions.
  • microndnf clean
  • yum clean – Same definition as dnf clean

Here is an example of a Dockerfile with and without the usage of the cache clean:

#No dnf Clean
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

RUN microdnf install fontconfig \
&& microdnf install libXtst
#With dnf clean
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230

RUN microdnf install fontconfig \
&& microdnf install libXtst \
&& microdnf clean all

And here are the size of the two images:

The usage of deactivation of package manager cache should be combined with either hint number 4 (Minimize the number of RUN, COPY, ADD instructions) or hint number 5 (Use the squash flag of docker/podman build).

How to build a lighter image

This hints are around the container UnionFS (Union File System) and will explain how to create less or smaller image layers.

4. Minimize the number of RUN, COPY, ADD instructions

Only the instructions RUN, COPY, ADD create layers; each usage of one of this instructions will create a new layer into the final image. Minimizing the number of this instructions will minimize the number of image layers which will minimize the size of the final image.

Let’s use the following Dockerfile as (faulty) example:

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230
# call twice the RUN instruction
RUN microdnf install fontconfig 
RUN microdnf install libXtst

In this Dockerfile we called twice the RUN instruction; the image (having an id starting with 14e7) will have 4 layers:

docker inspect --format '{{join .RootFS.Layers "\n "}}' 14e7

sha256:44f62afd0479b4c2059f2a01b61a33a6e47b0a903b17a9fd65a8df8d4cfe806c
sha256:87cd41b1f9f880f62765bc510b9f241c5532cb919182ba453d87a28783b24d5b
sha256:acf320641a3c8165491b3b022d088ce7170820dbcaf31789db9b9b8a55568594
sha256:9c29e387846f1413e91046c9c194c9556ee4a66d993aa56a7ad7ecbe78304dbd

Now let’s minimize the number of RUN instructions; we will have a single RUN instruction containing multiple install commands:

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.5-230
# call RUN only once
RUN microdnf install fontconfig && \ 
    microdnf install libXtst

The new image (having the id starting with d73) will have 3 layers:

docker inspect --format '{{join .RootFS.Layers "\n "}}' d73
sha256:44f62afd0479b4c2059f2a01b61a33a6e47b0a903b17a9fd65a8df8d4cfe806c
sha256:87cd41b1f9f880f62765bc510b9f241c5532cb919182ba453d87a28783b24d5b
sha256:5ad98570b3807cbd9dd51fd981e2c15d2fc7793061441ea395d3f332b722af35

5. Use the squash flag of docker/podman build

The squash flag is a flag of the docker build command which is still experimental that will squash newly built layers into a single new layer.

Podman build command also have a similar flag; Podman also have a squash-all flag that will squash all of the new image’s layers (including those inherited from a base image) into a single new layer.

6. Use .dockerignore to filter the content of Docker build context

The .dockerignore file is used to filter the content that will be used by the Docker build context to create an image.

The goal of this feature is just to have a faster build process ( because less files will be present in the build context) but it can help also in the case when accidentally the Dockerfile defines more files than needed.

7. Use external tools

I have to admit using external tools to obtain a slimmer image should not be the default or preferred solution especially after docker and podman implemented the squash flags. But if is not possible to use the existing solutions then here are some free tools that you could try:

  • jwilder/docker-squash
    • docker-squash is a utility to squash multiple docker layers into one in order to create an image with fewer and smaller layers.
    • it looks very similar to the docker build and podman build squash flags
    • project looks not active anymore
  • goldmann/docker-squash
    • can squash last n layers from an image
    • can squash from a selected layer to the end
    • project looks still active
  • docker-slim/docker-slim
    • docker-slim try to figure it out what files are useful from the target image by running a container of the target image.
    • docker-slim is capable to run static or dynamic analysis; it also capable to probe the running container using http requests.
    • docker-slim contains also a linter for Dockerfiles; Running the linter on the “No dnf Clean” Dockerfile from the hint nr3 (Deactivate the package manager cache) give the following results:
docker-slim lint

Conclusion

As you could see there are a few ways to create lighter images; some of the hints are “low hanging fruits” and can be applied systematically, like the usage of the squash flag (hint nr. 5) and the minimization of RUN commands (hint nr. 4); some others demand a little bit of thinking and try and error, like the usage of the right base image (hint nr. 1) or the usage of multi-stage builds (hint nr. 2).

Book Review: Container Security

This is the review of the Container Security book.

(My) Conclusion

I have mixed feelings about this book; to a scale of 1 to 10 I would give it a 7.

What I appreciated about it:

  • You can have a free (digital) copy of the book from here Aqua Container Security.
  • All the Linux security mechanisms that are used under the hood by containers are very well explained with multiple (valuable) examples; namespaces, cgroups, capabilities, system calls, AppArmor, SecComp. At the end of the day, container security is just a subset of Linux security.
  • No hidden (or un-hidden) publicity to any commercial tools, despite the fact that the author is working for AquaSecurity company.
  • A lot of references towards Internet accessible resources; unfortunately, the author is using url shortening so I wish you good luck to copy them into a browser if you have the paper version of the book.
  • Clear and concise writing style.

What I think could have been done better:

  • Even if the book is about security of/in containers, there is no general introduction of the container notion or the actual container landscape.
  • A lot of forward references in different chapters; usually in technical books you find backward references because (very often) the knowledge is build on top of the knowledge of previous chapters.
  • There are a few chapters which are very thin, especially toward the end; the last chapter (chapter 14) for example is just 2 pages long.
  • There is a companion website (https://containersecurity.tech/) but it contains just a single page.

1. Container Security Threats

This chapter defines different attack vectors for the containers and the infrastructure that they are running on. This attack vectors specifically linked to containers are:

  • Application code vulnerabilities
  • Badly configured images
  • Badly configured containers
  • Build Image attack
  • Supply chain attack
  • Vulnerable hosts
  • Exposed secrets
  • Insecure networking
  • Container runtime vulnerabilities
Containers attack vectors

The containers very often are deployed on cloud infrastructures very often using a multi-tenant model which brings new threats and new attack vectors on top of previous ones.

After presenting and explaining the problems that usage of containers will bring the author is focusing on (security) general guidelines that should be used when implementing different mitigations controls:

  • least privilege
    • each container should have a minimum set of permissions to fulfill it’s function.
  • defense in depth
  • reducing the attack surface
    • split the monolithic application in smaller, simpler microservices that would imply a less complex architecture that would reduce the attack surface.
  • limiting the blast radius
    • if one container is compromised some controls should be put in place to not affect the others software components
  • segregation of duties
    • permissions and credentials can be passed only into the containers that need them

2. Linux System Calls, Permissions and Capabilities

This chapter it presents the basics of Linux System calls, the Linux file permissions (an extensive explanation is done on the usage of of setuid and getuid) and the Linux Capabilities. For each of this Linux features some examples are given and the author emphasizes that this capabilities are heavily used by the containers and the containers run-times because at the end of the day, a container is just a Linux process running on a host.

3. Control Groups

This chapter is very similar with the previous one in the sense that it does not speak about containers but about a Linux security feature that is heavily used by the containers. This chapter is dedicated to Linux control groups (a.k.a cgroups) which have as goal to limit the resources, such as memory, CPU, network input/output, that a process or a group of processes can use.

Containers runtimes are using cgroups behind the scene to limit resources used by containers, so cgroups provides protection against a class of attacks that attempt to disrupt running applications by consuming excessive resources, thereby starving legitimate applications.

4. Container Isolation

This chapter treats another Linux feature that is cornerstone for container security: Linux namespaces.

Linux namespaces are a feature of the Linux kernel that partitions kernel resources such that one or more processes sees one set of resources while another set of processes sees a different set of resources. If cgroups control the resources that a process can use, namespaces control what it can see.

For each of the existing namespaces (Unix Timesharing System, Process IDs, Mount Points, Network, Users and Group Ids, Inter-Process Communications) the author shows how can be created from command line. For some namespaces a comparison is done between the isolation implemented by a container runtime and the isolation offered just using the tools offered out of the box by Linux.

5. Virtual Machines

This chapter is an introduction to virtual machines. It is explained different types of hypervisors (a.k.a VMM – Virtual Machine Monitor):

  • Type1 – the hypervisor is installed directly on top of the hardware with no operating system underneath (ex: Hyper-V, Xen)
  • Type2 – the hypervisor is installed on top of a Host Os (ex: VirtualBox, Parallels, QEMU)
  • Kernel Based Virtual Machines – this is a kind of hybrid type because it consists in a hypervisor running within the kernel of the hos Os (ex: Linux KVM).
Different types of hypervisors

After describing the types of hypervisors the author explained how the hypervisors are achieving the virtualization via a mechanism called “trap and emulate“. When an OS is running as a virtual machine in a hypervisor, some of its instructions may conflict with the host operation system. So the hypervisor will emulates the effect of that specific instruction or action without carrying it out. In this way, the host OS is not effected by the guest’s actions.

The chapter is concluded with the advantages of hypervisors for process isolation compared with the kernel processes (which are the cornerstone of containers) and the main drawbacks of hypervisors.

From the process isolation point of view the hypervisors are offering a greater isolation and the difference is that hypervisors have a simpler job to fulfill comparing with OS kernels. In a kernel, user space processes are allowed some visibility of each other, but there is no sharing of memory or sharing of processes in the case of hypervisors.

On the drawback side, the VMs have start-up times that are several orders of magnitude greater than a container, containers give developers a convenient ability to “build once, run anywhere” quickly and efficiently, each virtual machine has the overhead of running a whole kernel compared with containers that are sharing a kernel so containers can be very efficient in both resource use and performance.

6. Container Images

This chapter is focusing on the images; it starts by explaining the OCI standards covering the image specification. In this chapter you will be able to see how different topics from previous chapters (namespaces, capabilities, control groups, root file system) are fitting together so the end user can define, build and execute a container.

The second par of the chapter is focusing on different attack vectors on an image:

Image Attack Vectors

Some of this attack vectors are not really linked to container technology (tamper source code, vulnerable dependencies, attack deployment via build machine) but others are container specific attack vectors (tamper the docker file, usage of vulnerable base images, modify images during build).

7. Software Vulnerabilities in Images

The chapter is dedicated to vulnerabilities managements in general and also in the context of containers. For the general/generic part, the author explains what is the workflow when a vulnerability is discovered:

after the discovery the person the new issue will get a unique identifier that begins with “CVE” (Common Vulnerabilities and Exposures) , followed by the year and an unique id.

  • A responsible security disclosure is agreed between the entity that found the vulnerability and the entity that “owns” the software. Both parties agree on a timeframe after which the researcher can publish their findings.
  • The entity that “owns” the software is fixing the vulnerability and delivers a patch.
  • Once the vulnerability can be disclosed, it receive a unique identifier that begins with “CVE,” which stands for Common Vulnerabilities and Exposures.

Strangely enough, the author does not mention the usage of CVSS (Common Vulnerability Scoring System) score of a vulnerability. Usually CVSS score is used to judge the impact of the vulnerability.

The second part of the chapter is focusing on ways to handle the vulnerability management in the context of containers. A few interesting and valuable ideas:

  • (always) use immutable containers :
    • If containers are downloading code at runtime, different instances of the container could be running different versions of that code, but it would be difficult to know which instance is running what version.
    • It’s harder to control and ensure the provenance of the software running in each container if it could be downloaded at any time and from anywhere.
    • Building a container image and storing it in a registry is very simple to automate in a CI/CD pipeline.
  • regular scan of images.
    • Regularly re-scanning container images allows the scanning tool to check the contents against its most up-to-date knowledge about vulnerabilities. A very common approach is to re-scan all deployed images every 24 hours, in addition to scanning new images as they are built, as part of an automated CI/CD pipeline.
  • use a tool that can do more than scanning for vulnerabilities (if possible). A (non-exhaustive) list of extra features that the scanner could have:
    • Known malware within the image
    • Executables with the setuid bit
    • Images configured to run as root
    • Secret credentials such as tokens or passwords
    • Sensitive data in the form of credit card or Social Security numbers or something similar

8. Strengthening Container Isolation

This chapter is an extension of the Chapter 4 (Container Isolation); it presents other ways to extend the container isolation using mechanisms and framework beyond the Linux kernel features.

The first part of the chapter presents mechanisms already present in Linux ecosystem that can be used in other contexts than containers, namely:

  • Seccomp
    • Seccomp is a mechanism for restricting the set of system calls that an application is allowed to make.
    • The Docker default seccomp profile blocks more than 40 of the 300+ syscalls (including all the examples just listed) without ill effects on the vast majority of containerized applications. Unless you have a reason not to do so, it’s a good default profile to use.
  • AppArmor
    • In AppArmor, a profile can be associated with an executable file, determining what that file is allowed to do in terms of capabilities and file access permissions.
    • AppArmor implement mandatory access controls. A mandatory access control is set by a central administrator, and once set, other users do not have any ability to modify the control or pass it on to another user.
    • There is a default Docker AppArmor profile
  • SELinux
    • SElinux lets you constrain what a process is allowed to do in terms of its interactions with files and other processes. Each process runs under an SELinux domain and every file has a type.
    • Every file on the machine has to be labeled with its SELinux information before you can enforce policies. These policies can dictate what access a process of a particular domain has to files of a particular type.

In the second part of the chapter the author presents container specific technologies that could be used to enforce the containers isolation:

  • gVisor
    • gVisor provides a virtualized environment in order to sandbox containers. The system interfaces normally implemented by the host kernel are moved into a distinct, per-sandbox application kernel in order to minimize the risk of a container escape exploit.
    • To do this, a component of gVisor called the Sentry intercepts syscalls from the application. Sentry is heavily sandboxed using seccomp, such that it is unable to access filesystem resources itself. When it needs to make systemcalls related to file access, it off-loads them to an entirely separate process called the Gofer. Even those system calls that are unrelated to filesystem access are not passed through to the host kernel directly but instead are reimplemented within the Sentry. Essentially it’s a guest kernel, operating in user space.
  • Kata Containers
    • The idea with Kata Containers is to run containers within a separate virtual machine. This approach gives the ability to run applications from regular OCI format container images, with all the isolation of a virtual machine.
    • Kata uses a proxy between the container runtime and a separate target host where the application code runs. The runtime proxy creates a separate virtual machine using QEMU to run the container on its behalf.
  • Firecracker
    • Is a virtual machine offering the benefits of secure isolation through a hypervisor and no shared kernel, but with startup times around 100ms.
    • Firecracker designers have stripped out functionality that is generally included in a kernel but that isn’t required in a container like enumerating devices. The main saving comes from a minimal device model that strips out all but the essential devices.

9. Breaking Container Isolation

After explaining in previous chapters what can be done to enhance the container isolation, this chapter is focusing on how a container could be misconfigured so this isolation is broken.

The following misconfigurations are explained:

Run containers using the default (root) user.

Unless your container image specifies a non-root user or you specify a non default user when you run a container, by default the container will run as root.

The best option is to define a custom user inside the container but if this option is not available then a few other options are presented:

  • override the user id; this is possible in Docker using the –user flag of the docker run command.
  • use user namespaces (covered in chapter 2) within the container, so that root inside the container is not the same as root on the host. You can enable the use of user namespaces in Docker, but it’s not turned on by default. If you’re interested about how to do it please take a look to Isolate containers with a user namespace

The use of —priviledged flag

The usage of priviledged flag give extended (Linux) capabilities to the process representing the running container. Docker introduced the –privileged flag to enable DinD (Docker in Docker) which can be used by build tools(very often in the CI/CD context) running as containers, which need access to the Docker daemon in order to use Docker to build container images.

Mounting sensitive directories

Mounting inside the containers the root file system or specific host folders is not a very good idea. List of folders to avoid mounting:

  • Mounting /etc would permit modifying the host’s /etc/passwd file from within the container.
  • Mounting /bin, /usr/bin or /usr/sbin would allow the container to write executables into the host directory.
  • Mounting host log directories into a container could enable an attacker to modify or erase the logs.

Mounting the Docker Socket

In a Docker environment, there is a Docker daemon process that essentially does all the work. When you run the docker command-line utility, this sends instructions to the daemon over the Docker socket that lives at /var/run/docker.sock . Any entity that can write to that socket can also send instructions to the Docker daemon. The daemon runs as root and will happily build and run any software of your choosing on your behalf.

Accessing the Docker Daemon via REST API with no authentication

This in not really mentioned in the book (even that I think that it should) but it’s very similar with the previous paragraph. The docker daemon can be also accessed via a REST API; by default the API is accessible with no authentication.

Sharing namespaces between the container and the host

Containerized processes are all visible from the host; thus, sharing the process namespace to a container lets that container see the other containerized processes.

10. Container Network Security

The chapter starts with an introduction to ISO/OCI networking model and this model is used during the chapter to explain different topics related to network security. The author is focusing on explaining the networking model for containers running under Kubernetes orchestrator but even if you’re not interested on K8s it is still possible to find some technology agnostic best practices:

  • Default Deny Ingress: define a network policy that denies ingress traffic by default and then add policies to permit traffic only where you expect it
  • Default Deny Egress: Same as the Ingress part.
  • Restricts ports: Restrict traffic so that it is accepted only to specific ports for each application.

11. Securely Connecting Components with TLS

Most of the chapter content have noting to do with containers (this is highlighted even by the author itself) and is treating the history of SSL/TLS protocol and the basics of PKI : Public/Private Key, X509 certificates, Certificate Signing Requests, Certificate Revocation and Certificate Authorities.

The only piece of information linked to containers that I found important is the that rather than writing your own code to set up secure connections, you can choose to use a service mesh to do it for you.

12. Passing Secrets to Containers

The chapter starts by enumerating properties that a secret must have:

  • it should be stored in encrypted form so that it’s not accessible to every user or entity.
  • it should never be written to disk unencrypted (and even better just held it in memory and never write it on disk).
  • it should be revocable (make them invalid in the event that the secret should no longer be trusted).
  • it should be able to rotate it.
  • it should be independent of the lifecycle of the consumers.
  • only software components that need the access to it should be able to read the secret

Next paragraph enumerates different ways of injecting information (secrets included) into containers:

  • store the information into the image
    • obviously this is not a very good idea for secrets because can be accessed by anyone having the image and it cannot be changed unless the images are rebuild.
  • use environment variables as part of the configuration that goes along with the image
    • same problems as the hard-coded secrets
  • pass the secret over the network
    • the running container will make the appropriate network calls to retrieve or receive the information.
    • in this case the date(secret) in transit should be encrypted, most probably using a service mesh
    • the principal drawback of this approach is how the container will be able to authenticate to this service offering the secret; the author does not offer any solution
  • pass the secrets at runtime using the environment variables.
    • environment variables defined for the container can be seen using different commands like docker inspect.
  • pass the secrets through files.
    • This option consists in write the secrets into files that the container can access through a mounted volume.
    • Combining this with a secure secrets store ensures that secrets are never stored “at rest” unencrypted.

I found this chapter rather strange because it explains how to not pass secrets to containers instead of presenting the good practices. Speaking about good practices, this are very briefly mentioned like the usage of a third-party (commercial) solution for secret storage. I would have preferred to have more insights on how this tools are working.

13. Container Runtime Protection

This chapter treats the controls to put in place in order to assure the protection of the running containers.

The first idea is to compute a container profile. This profile should be computed prior to the deployment of the container in live and should contains the normal behavior of the container. Once this profile is known, then at runtime a (container security)tool would be able to compare the profile with the real behavior of the container and detect any discrepancy.

This container profile could contains the following information:

  • network traffic – the other containers and or hosts that the container normally communicates with.
  • executable – what kind of commands the normal cunning container is executing. In this case the author suggests to use eBPF (which stands for extended Berkeley Packet Filter) technology.
  • file access – what files from the container file system are usually accessed.
  • user IDs – as a general rule, if the container is doing one job, it probably needs to operate under only one user identity.
  • (Linux) capabilities – the (minimal) list of capabilities the container needs in order to execute properly; any attempt to use a capability not present in the list should raise a red flag.

The second idea presented is the drift prevention. It’s considered best practice to treat containers as immutable. The container is instantiated from its image, and then the contents of the container should not change. In the case of drift prevention the (container security) tool will be able to make the difference between the software that came from the image, and the software that is running in the workload so it gives the ability to immediately stop any software that doesn’t belong to the (original) image.

14. Containers and the OWASP Top 10

This sounds a very interesting topic but unfortunately the author it expedite it very fast. In some of the cases the author is even confessing that the type of risk is not linked to containers and could be applied to non containers world also.

Same OWASP Top 10 (2017) have direct applicability in the container word:

  • Broken Authentication.
    • This can be linked with the usage of secrets in container word. These secrets need to be stored with care and passed into containers at runtime, as discussed in Chapter 12.
    • The containerized applications that must communicate between them would need to identify each other using certificates, and communicate using secure connections. This can be handled directly by containers, or you can use a service mesh
  • Broken Access Control
    • Some container-specific approaches to mitigate least privilege the abuse of privileges that may be granted unnecessarily to users or components:
      • Don’t run containers as root.
      • Limit the capabilities granted to each container.
      • Use seccomp, AppArmor, or SELinux.
      • Use immutable containers
  • Insufficient Logging and Monitoring
    • Following container events should be logged:
      • Container start/stop events
      • Access to secrets
      • Any modification of privileges
      • Modification of container payload
      • Inbound and outbound network connections
      • Volume mounts
      • Failed actions such as attempts to open network connections, write to files, or change user permissions.
  • Failed actions such as attempts to open network connections, write
  • to files