Book Review: API Security in Action

This is my review of the book API Security in Action.

(My) Conclusion

This book does a very good job of covering the different mechanisms that can be used to build secure (RESTful) APIs. For each security control, the author explains what kinds of attacks the control is able to mitigate.

The reader should be comfortable with Java and Maven, because most of the code examples in the book (and there are a lot of them) are implemented in Java.

A diagram of all the security mechanisms presented in the book:

Part 1: Foundations

The goal of the first part is to learn the basics of securing an API. The author starts by explaining what an API is from the user’s and the developer’s point of view, and what security properties any software component (APIs included) should fulfill:

  • Confidentiality – Ensuring information can only be read by its intended audience
  • Integrity – Preventing unauthorized creation, modification, or destruction of information
  • Availability – Ensuring that the legitimate users of an API can access it when they need to and are not prevented from doing so.

Even if these security properties look very theoretical, the author explains how applying specific security controls fulfills them. The following security controls are proposed:

  • Encryption of data in transit and at rest – Encryption prevents data being read or modified in transit or at rest
  • Authentication – Authentication is the process of verifying whether a user is who they say they are.
  • Authorization/Access Control – Authorization controls who has access to what and what actions they are allowed to perform
  • Audit logging – An audit log is a record of every operation performed using an API. The purpose of an audit log is to ensure accountability
  • Rate limiting – Preserves the availability in the face of malicious or accidental DoS attacks.

These different controls should be applied in a specific order, as shown in the following figure:

Different security controls that could/should be applied for any API

To illustrate each control implementation, an example API called Natter API is used. The Natter API is written in Java 11 using the Spark Java framework. To make the examples as clear as possible to non-Java developers, they are written in a simple style, avoiding too many Java-specific idioms. Maven is used to build the code examples, and an H2 in-memory database is used for data storage.

The same API is also used to present different types of vulnerabilities (SQL injection, XSS) and their mitigations.
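To make the SQL injection mitigation concrete, here is a minimal sketch in the spirit of the book’s Java/H2 setup (the table and data are simplified stand-ins, not the actual Natter schema, and the H2 driver is assumed to be on the classpath). The parameterized query treats the hostile input strictly as data:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PreparedStatementExample {
    public static void main(String[] args) throws Exception {
        // H2 in-memory database, as used by the book's examples
        Connection conn = DriverManager.getConnection("jdbc:h2:mem:natter", "sa", "");
        conn.createStatement().execute(
            "CREATE TABLE spaces (space_id INT PRIMARY KEY, name VARCHAR(255))");
        conn.createStatement().execute("INSERT INTO spaces VALUES (1, 'general')");

        String userInput = "general'; DROP TABLE spaces; --"; // hostile input
        // The placeholder (?) ensures the input is bound as data, never parsed as SQL
        PreparedStatement stmt = conn.prepareStatement(
            "SELECT space_id FROM spaces WHERE name = ?");
        stmt.setString(1, userInput);
        try (ResultSet rs = stmt.executeQuery()) {
            System.out.println("Match found: " + rs.next()); // false, table intact
        }
    }
}
```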

Part 2: Token-based Authentication

This part presents different techniques and approaches for token-based authentication.

Session cookie authentication

The first authentication technique presented is the “classical” HTTP Basic Authentication. HTTP Basic Authentication has a few drawbacks: there is no obvious way for the user to ask the browser to forget the password, and the dialog box presented by browsers for HTTP Basic Authentication cannot be customized.

But the most important drawback is that the user’s password is sent on every API call, increasing the chance of it accidentally being exposed by a bug in one of those operations. This is not very practical, which is why a better approach is to let the user log in once and then be trusted for a specific period of time. This is basically the definition of token-based authentication:

Token-based authentication

The first example of token-based authentication presented uses HTTP Basic Authentication for a dedicated login endpoint (step 1 in the previous figure) and session cookies for moving the generated token between the client and the API server.

The author takes the opportunity to explain how session cookies work and what their different attributes are, but above all he presents the attacks that are possible when session cookies are used. The session fixation attack and the Cross-Site Request Forgery (CSRF) attack are presented in detail, with different options to avoid or mitigate them.
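As a small illustration of those mitigations (a sketch, not the book’s actual Natter code), the usual session-cookie hardening is visible directly in the Set-Cookie attributes, and issuing a fresh session ID after login is what defeats session fixation:

```java
import java.security.SecureRandom;
import java.util.Base64;

public class SessionCookieExample {
    public static void main(String[] args) {
        // Generate a fresh, unguessable session ID *after* login, which defeats
        // session fixation (any attacker-planted session ID is discarded)
        byte[] id = new byte[20];
        new SecureRandom().nextBytes(id);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(id);

        // Secure   -> only sent over HTTPS
        // HttpOnly -> not readable from JavaScript (limits XSS impact)
        // SameSite -> not attached to cross-site requests (mitigates CSRF)
        String header = "JSESSIONID=" + token
                + "; Secure; HttpOnly; SameSite=strict; Path=/";
        System.out.println("Set-Cookie: " + header);
    }
}
```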

Tokens without cookies

The usage of session cookies is tightly linked to a specific domain and/or its sub-domains. If you want to make cross-domain requests, the CORS (Cross-Origin Resource Sharing) mechanism can be used. The last part of the chapter on session cookies contains a detailed explanation of the CORS mechanism.

Using session cookies as a mechanism to store authentication tokens has a few drawbacks, like the difficulty of sharing cookies between distinct domains, or API clients that do not understand web standards (mobile clients, IoT clients).

Another option presented is tokens without cookies. On the client side the tokens are stored using the WebStorage API; on the server side the tokens are stored in a “classical” relational database. For the authentication scheme, Bearer authentication is used (although the Bearer authentication scheme was created in the context of the OAuth 2.0 authorization framework, it is rather popular in other contexts as well).

With this solution the least secure component is the storage of the authentication tokens in the DB. To mitigate the risk of the tokens being leaked, different hardening solutions are proposed (see the sketch after this list):

  • store in the DB only a hash of each token
  • store in the DB an HMAC of each token; the (API) client then sends the bearer token together with the HMAC tag of the token
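A minimal sketch of both hardening options using only JDK crypto (the token value and key are made up, and real code would also use a constant-time comparison when validating):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class TokenHardeningExample {
    // Option 1: store only the SHA-256 hash of the token in the DB;
    // a leaked DB row cannot be turned back into a usable token
    static String hash(String token) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(token.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
    }

    // Option 2: append an HMAC tag to the token handed to the client;
    // the tag is computed with a key the DB never sees, so tokens
    // copied straight out of the DB are unusable without the tag
    static String hmac(String token, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] tag = mac.doFinal(token.getBytes(StandardCharsets.UTF_8));
        return token + "." + Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
    }

    public static void main(String[] args) throws Exception {
        String token = "raw-token-id"; // hypothetical token ID
        byte[] key = "change-me-to-a-strong-secret-key".getBytes(StandardCharsets.UTF_8);
        System.out.println("Stored in DB : " + hash(token));
        System.out.println("Sent to user : " + hmac(token, key));
    }
}
```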

This authentication scheme is not vulnerable to session fixation or CSRF attacks (as the previous scheme was), but an XSS vulnerability on the client side using the WebStorage API would defeat any mitigation control put in place.

Self-contained tokens and JWTs

The last chapter of this (second) part of the book treats self-contained, or stateless, tokens. Rather than storing the token state in the database as in the previous cases, you can instead encode that state directly into the token ID and send it to the client.

The most widely used client-side tokens are JSON Web Tokens (JWT). The main features of a JWT are:

  • A standard header format that contains metadata about the JWT, such as which MAC or encryption algorithm was used.
  • A set of standard claims that can be used in the JSON content of the JWT, with defined meanings, such as exp to indicate the expiry time and sub for the subject.
  • A wide range of algorithms for authentication and encryption, as well as digital signatures and public key encryption.

A JWT token can have three parts:

  • Header – indicates which algorithm was used to produce the JWT and either the key used to authenticate it or an ID of that key. Some of the header values:
    • alg: Identifies which algorithm is used to generate the signature
    • kid: Key ID; as the key ID is just a string identifier, it can be safely looked up in a server-side set of keys.
    • jwk: The full key. This is not a safe header to use; trusting the sender to give you the key to verify a message loses all security properties.
    • jku: A URL from which to retrieve the full key. This is not a safe header to use. The intention of this header is that the recipient can retrieve the key from an HTTPS endpoint, rather than including it directly in the message, to save space.
  • Payload/Claims – pieces of information asserted about a subject. The list of standard claims:
    • iss (issuer): Issuer of the JWT
    • sub (subject): Subject of the JWT (the user)
    • aud (audience): Recipient for which the JWT is intended
    • exp (expiration time): Time after which the JWT expires
    • nbf (not before time): Time before which the JWT must not be accepted for processing
    • iat (issued at time): Time at which the JWT was issued; can be used to determine age of the JWT
    • jti (JWT ID): Unique identifier; can be used to prevent the JWT from being replayed (allows a token to be used only once)
  • Signature – Securely validates the token. The signature is calculated by encoding the header and payload using Base64url encoding, concatenating the two with a period separator, and running that string through the cryptographic algorithm specified in the header.
Example of JWT token
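To make the three-part structure concrete, here is a minimal HS256 JWT assembled with only the JDK (the claims and key are made up; production code should use a vetted JOSE library like the ones covered in the book):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class JwtExample {
    public static void main(String[] args) throws Exception {
        Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();

        // Part 1: header (alg = HMAC-SHA256), Part 2: claims (exp = 2025-01-01T00:00:00Z)
        String header  = b64.encodeToString(
            "{\"typ\":\"JWT\",\"alg\":\"HS256\"}".getBytes(StandardCharsets.UTF_8));
        String payload = b64.encodeToString(
            "{\"sub\":\"demo\",\"exp\":1735689600}".getBytes(StandardCharsets.UTF_8));

        // Part 3: signature = HMAC-SHA256 over "header.payload"
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(
            "change-me-to-a-strong-256-bit-key".getBytes(StandardCharsets.UTF_8),
            "HmacSHA256"));
        String signature = b64.encodeToString(
            mac.doFinal((header + "." + payload).getBytes(StandardCharsets.UTF_8)));

        System.out.println(header + "." + payload + "." + signature);
    }
}
```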

Even if a JWT could be made fully self-contained by adding the algorithm and the signing key to its header, this is a very bad idea from the security point of view, because you should never trust a token signed by an external entity. A better solution is to store the algorithm as metadata associated with a key on the server.

Storing the algorithm and the signing key on the server side also helps to implement token revocation. For example, changing the signing key revokes all the tokens signed with that key. Another way to revoke tokens more selectively is to add some token metadata (like the token creation date) to the DB and use this metadata as revocation criteria.

Part 3: Authorization

OAuth2 and OpenID Connect

A way to implement authorization using JWT tokens is with scoped tokens. A scoped token limits the operations that can be performed with it; the set of allowed operations is known as the scope of the token. The scope is specified by one or more string labels stored as an attribute of the token; because there may be more than one label, they are often referred to collectively as the token’s scopes.

Scopes allow a user to delegate part of their authority to a third-party app, restricting how much access they grant using scopes. This type of control is called discretionary access control (DAC) because users can delegate some of their permissions to other users.

Another type of control is mandatory access control (MAC), where user permissions are set and enforced by a central authority and cannot be granted by users themselves.

OAuth2 is a standard for implementing DAC. OAuth2 uses the following specific terms:

  • The authorization server (AS) authenticates the user and issues tokens to clients.
  • The user is also known as the resource owner (RO), because it’s typically their resources that the third-party app is trying to access.
  • The third-party app or service is known as the client.
  • The API that hosts the user’s resources is known as the resource server (RS).

To access an API using OAuth2, an app must first obtain an access token from the Authorization Server (AS). The app tells the AS what scope of access it requires. The AS verifies that the user consents to this access and issues an access token to the app. The app can then use the access token to access the API on the user’s behalf.
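The last step of that flow boils down to a plain HTTP POST against the AS token endpoint. Here is a sketch using Java 11’s HttpClient, with a hypothetical AS URL, client credentials, and authorization code:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TokenRequestExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical AS, client, and code values, for illustration only
        String body = "grant_type=authorization_code"
                + "&code=SplxlOBeZQQYbYS6WxSbIA"
                + "&redirect_uri=https%3A%2F%2Fclient.example.net%2Fcallback"
                + "&client_id=test-client&client_secret=test-secret";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://as.example.com/oauth2/token"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The JSON response contains access_token, token_type, expires_in, scope
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```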

One of the advantages of OAuth2 is the ability to centralize authentication of users at the AS, providing a single sign-on (SSO) experience. When the user’s client needs to access an API, it redirects the user to the AS authorization endpoint to get an access token. At this point the AS authenticates the user and asks for consent for the client to be allowed access.

OAuth2 can provide basic SSO functionality, but its primary focus is delegated third-party access to APIs rather than user identity or session management. The OpenID Connect (OIDC) suite of standards extends OAuth2 with several features:

  • A standard way to retrieve identity information about a user, such as their name, email address, postal address, and telephone number.
  • A way for the client to request that the user is authenticated even if they have an existing session, and to ask for them to be authenticated in a particular way, such as with two-factor authentication.
  • Extensions for session management and logout, allowing clients to be notified when a user logs out of their session at the AS, enabling the user to log out of all clients at once.

Identity-based access control

In this chapter the author introduces the notions of users, groups, RBAC (Role-Based Access Control), and ABAC (Attribute-Based Access Control). For each type of access control the author proposes an ad-hoc implementation (no specific framework is used) for the Natter API (the API used throughout the book to present the different security controls).
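To give a flavor of such an ad-hoc implementation, here is a minimal RBAC check (the roles and permissions are invented for illustration and are not the Natter API’s actual ones; real code would load the mapping from the database):

```java
import java.util.Map;
import java.util.Set;

public class RbacExample {
    // Static role-to-permission mapping; the role itself would be looked up
    // per user (and per space) in the DB before each request is authorized
    static final Map<String, Set<String>> ROLE_PERMISSIONS = Map.of(
            "owner",     Set.of("read", "write", "delete"),
            "moderator", Set.of("read", "delete"),
            "member",    Set.of("read", "write"),
            "observer",  Set.of("read"));

    static boolean isAllowed(String role, String permission) {
        return ROLE_PERMISSIONS.getOrDefault(role, Set.of()).contains(permission);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("member", "write"));    // true
        System.out.println(isAllowed("observer", "delete")); // false
    }
}
```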

Capability-based security and macaroons

A capability is an unforgeable reference to an object or resource, together with a set of permissions to access that resource. Compared with the more dominant identity-based access control techniques like RBAC and ABAC, capabilities have several differences:

  • Access to resources is via unforgeable references to those objects that also grant authority to access that resource. In an identity-based system, anybody can attempt to access a resource, but they might be denied access depending on who they are. In a capability-based system, it is impossible to send a request to a resource if you do not have a capability to access it.
  • Capabilities provide fine-grained access to individual resources.
  • The ability to easily share capabilities can make it harder to determine who has access to which resources via your API.
  • Some capability-based systems do not support revoking capabilities after they have been granted. When revocation is supported, revoking a widely shared capability may deny access to more people than was intended.

The way to use capability-based security in the context of a REST API is via capability URIs. A capability URI (or capability URL) is a URI that both identifies a resource and conveys a set of permissions to access that resource. Typically, a capability URI encodes an unguessable token into some part of the URI structure. To create a capability URI, you can combine a normal URI with a security token.

The author adds capability URIs to the Natter API, encoding the token into a query parameter because this is simple to implement. To mitigate any threat from tokens leaking in log files, short-lived tokens are used.
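A sketch of how such a capability URI could be minted (the host, path, and parameter name are illustrative):

```java
import java.security.SecureRandom;
import java.util.Base64;

public class CapabilityUriExample {
    public static void main(String[] args) {
        // The unguessable token is the whole security of the capability; the
        // server stores it alongside the resource and permissions it grants
        byte[] raw = new byte[20];
        new SecureRandom().nextBytes(raw);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);

        // Capability URI = resource identifier + access token
        String capabilityUri =
            "https://api.example.com/spaces/42/messages?access_token=" + token;
        System.out.println(capabilityUri);
    }
}
```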

But putting the token representing the capability in the URI path or query parameters is less than ideal because these can leak in audit logs, Referer headers, and through the browser history. These risks are limited when capability URIs are used in an API but can be a real problem when these URIs are directly exposed to users in a web browser client.

One approach to this problem is to put the token in a part of the URI, the fragment component, that is not usually sent to the server or included in Referer headers.

Capability URIs can also be mixed with identity for handling authentication and authorization. There are a few ways to communicate identity in a capability-based system:

  • Associate a username and other identity claims with each capability token. The permissions in the token are still what grants access, but the token additionally authenticates identity claims about the user that can be used for audit logging or additional access checks. The major downside of this approach is that sharing a capability URI lets the recipient impersonate you whenever they make calls to the API using that capability.
  • Use a traditional authentication mechanism, such as a session cookie, to identify the user in addition to requiring a capability token. The cookie would no longer be used to authorize API calls but would instead be used to identify the user for audit logging or for additional checks. Because the cookie is no longer used for access control, it is less sensitive and so can be a long-lived persistent cookie, reducing the need for the user to log in frequently.

The last part of the chapter is about macaroons, a technology invented at Google (https://research.google/pubs/pub41892/). Macaroons extend capability-based security by adding more granularity.

A macaroon is a type of cryptographic token that can be used to represent capabilities and other authorization grants. Anybody can append new caveats to a macaroon that restrict how it can be used.

For example, it is possible to append a restriction that allows only read access to messages created after a specific date. These appended restrictions are called caveats.
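The core construction behind macaroons is an HMAC chain: the initial tag authenticates the macaroon identifier, and each appended caveat produces a new tag computed over the old one, so anybody can add restrictions but nobody can remove them. A minimal sketch of that chaining (not a full macaroon implementation):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class MacaroonSketch {
    static byte[] hmac(byte[] key, String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        byte[] rootKey = "server-side-root-key".getBytes(StandardCharsets.UTF_8);

        // Initial tag authenticates the macaroon identifier
        byte[] tag = hmac(rootKey, "capability:read-messages");

        // Appending a caveat: anyone can do this without knowing the root key,
        // but removing a caveat would require inverting the HMAC
        tag = hmac(tag, "time < 2024-01-01T00:00:00Z");
        tag = hmac(tag, "method = GET");

        System.out.println("Tag: " +
            Base64.getUrlEncoder().withoutPadding().encodeToString(tag));
    }
}
```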

Part 4: Microservice APIs in Kubernetes

Microservice APIs in K8S

This chapter is an introduction to the Kubernetes orchestrator. The introduction is very basic, but if you are interested in something more complete then Kubernetes in Action, Second Edition is the best option. The author deploys on K8s an (H2) database, the Natter API (used as the demo throughout the book), and a new API called the Link-Preview service; Minikube is used as the K8s “cluster”.

Having an application with multiple components helps him show how to secure the communication between these components and how to secure incoming (external) requests. The presented solution for securing the communication is based on the service mesh idea and on K8s network policies.

A service mesh works by installing lightweight proxies as sidecar containers into every pod in your network. These proxies intercept all network requests coming into the pod (acting as a reverse proxy) and all requests going out of the pod.

Securing service-to-service APIs

The goal of this chapter is to apply the authentication and authorization techniques presented in previous chapters in the context of service-to-service APIs. For authentication, API keys and JWTs are presented; to complement the authentication scheme, mutual TLS authentication is also used.

For authorization, OAuth2 is presented. A more flexible alternative is to create and use service accounts, which act like regular user accounts but are intended for use by services. Service accounts should be protected with strong authentication mechanisms because they often have elevated privileges compared to normal accounts.

The last part of the chapter is about managing service credentials in the context of K8s. Kubernetes includes a simple method for distributing credentials to services, but it is not very secure (the secrets are only Base64 encoded and can be leaked by a cluster administrator).

Secret vaults and key management services provide better security but need an initial credential to access them. Using secret vaults has the following benefits:

  • The storage of the secrets is encrypted by default, providing better protection of secret data at rest.
  • The secret management service can automatically generate and update secrets regularly (secret rotation).
  • Fine-grained access controls can be applied, ensuring that services only have access to the credentials they need.
  • The access to secrets can be logged, leaving an audit trail.

Part 5: APIs for the Internet of Things

Securing IoT communications

This chapter treats how different IoT devices can communicate securely with an API running on a classical system. Compared with classical computer systems, IoT devices have a few constraints:

  • An IoT device has significantly reduced CPU power, memory, connectivity, or energy availability compared to a server or traditional API client machine.
  • For efficiency, devices often use compact binary formats and low-level networking based on UDP rather than high-level TCP-based protocols such as HTTP and TLS.
  • Some commonly used cryptographic algorithms are difficult to implement securely or efficiently on devices due to hardware constraints or threats from physical attackers.

In order to cope with these constraints, new protocols have been created based on existing protocols and standards:

  • Datagram Transport Layer Security (DTLS). DTLS is a version of TLS designed to work with connectionless UDP-based protocols rather than TCP based ones. It provides the same protections as TLS, except that packets may be reordered or replayed without detection.
  • JOSE (JSON Object Signing and Encryption) standards. For IoT applications, JSON is often replaced by more efficient binary encodings that make better use of constrained memory and network bandwidth and that have compact software implementations.
  • COSE (CBOR Object Signing and Encryption) provides encryption and digital signature capabilities for CBOR and is loosely based on JOSE.

When devices need to use public key cryptography, key distribution becomes a complex problem. This problem can be solved by generating random keys during the manufacturing of the IoT device (device-specific keys are derived from a master key and some device-specific information) or through the use of key distribution servers.
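The first option can be sketched as a simple HMAC-based key derivation, where the master key stays with the manufacturer and each device is provisioned only with its derived key (a simplified illustration, not a full standard KDF such as HKDF):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DeviceKeyDerivation {
    // Derive a per-device key from the manufacturer's master key and the
    // device's unique information (e.g., its serial number)
    static byte[] deriveDeviceKey(byte[] masterKey, String deviceId) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(masterKey, "HmacSHA256"));
        return mac.doFinal(deviceId.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        byte[] masterKey = "factory-master-key".getBytes(StandardCharsets.UTF_8);
        // The server can re-derive the same key on demand from the device ID,
        // so it does not have to store one key per device
        byte[] deviceKey = deriveDeviceKey(masterKey, "device-serial-000042");
        System.out.println(Base64.getEncoder().encodeToString(deviceKey));
    }
}
```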

Securing IoT APIs

The last chapter of the book focuses on how to secure access to APIs in Internet of Things (IoT) environments, meaning APIs provided by the devices themselves or cloud APIs consumed by the devices.

For the authentication part, the IoT devices could be identified using credentials associated with a device profile. These credentials could be an encrypted pre-shared key or a certificate containing a public key for the device.

For the authorization part, IoT devices can use OAuth2 for IoT, a new specification that adapts the OAuth2 specification for constrained environments.

Book Review: Clean Architecture

This is my review of the book Clean Architecture (A Craftsman’s Guide to Software Structure and Design).

(My) Conclusion

I personally have mixed feelings about this book; the first four parts, which present the paradigms and different design principles, are quite good (for me they contain all the theory you need in order to tackle IT architectural problems). You start reading from the first chapter and gradually build knowledge on top of the previous chapters.

On the other hand, parts 5 and 6 of the book (which represent its backbone) have a different cognitive structure; the chapters are not really linked together, and you cannot read and build on top of the previous chapters because there is no coherence between them (some of the chapters are extended versions of blog posts from https://8thlight.com/blog/).

The book explains very well the rules and patterns to apply in order to build an application that is easy to extend and test, but subjects like scalability, availability, and security, qualities that every application should have, are not treated at all.

Part I Introduction

The author tries to express the fact that good software design and (good) software architecture are intimately linked, and that it is very important to invest time and resources in a good software design even if it makes the project look like it advances more slowly.

The quality of the (software) design will influence the overall quality of the software product, and to prove this the author comes up with some figures/numbers (unfortunately, there are no references to the source of these figures).

Part II Starting with the bricks: Programming Paradigms

The following programming paradigms are explained: Structured Programming, Object-Oriented Programming, and Functional Programming.

For each paradigm a brief history is given, and the author explains how each paradigm’s characteristics can help and impact the software architecture:

  • the immutability characteristic of Functional Programming can help to simplify the design with respect to concurrency issues.
  • the polymorphism characteristic of Object-Oriented Programming can help the design not to care about the implementation details of the components it uses.
  • Structured Programming taught us to decompose a (big) problem into smaller problems that can then be handled independently.

Part III Design Principles

This part is about the SOLID design principles; each of these design principles is clearly explained, sometimes using UML diagrams. The SOLID design principles are (usually) applied by software developers to write clean(er) code, but the author also explains how these principles can be applied at the architecture level:

  • SRP (Single Responsibility Principle) for a software developer is “A class should have only one reason to change.”, but for an architect it becomes “A module should be responsible to one, and only one, actor”.
  • OCP (Open-Closed Principle) is translated into architectural terms by replacing classes with high-level components, the goal being to arrange those components into a hierarchy that protects higher-level components from changes in lower-level components.
  • LSP (Liskov Substitution Principle) is translated into architectural terms by extending the interface concept from a programming language structure to the gateways that different system components use to communicate. Violating the substitutability of these gateways (interfaces) pollutes the system architecture.
  • ISP (Interface Segregation Principle) is translated into architectural terms by stating that it is generally harmful for your system to depend on frameworks that have more features than you need.
  • DIP (Dependency Inversion Principle) is used to create architectural boundaries between different system components.

Part IV Component Principles

The component principles are categorized into two types: (component) cohesion principles and coupling principles.

The component cohesion principles are:

  • (REP) The Reuse/Release Equivalence Principle: This principle states that “the unit of reuse is the unit of release”. Classes and modules that are formed into a component must belong to a cohesive group and should be released together.
  • (CCP) The Common Closure Principle: This principle is actually the Single Responsibility Principle for components. It states that you should gather into the same component those classes that change for the same reasons at the same times.
  • (CRP) The Common Reuse Principle: This principle states that you should not depend on things you don’t need. It rather tells which classes should not be put together in the same module; classes that are not tightly bound to each other should not be in the same component.

These principles are linked together, and applying them can be contradictory. The following diagram expresses this tension; each edge represents the cost that must be paid to abandon the principle at the opposite vertex.

The component coupling principles are:

  • (ADP) The Acyclic Dependencies Principle: The principle states that there should be no cycles in the component dependency graph; the dependency graph should be a DAG (Directed Acyclic Graph). Solutions to eliminate dependency cycles are: apply the Dependency Inversion Principle (DIP), or create a new component that contains the classes that the other components depend on.
  • (SDP) The Stable Dependencies Principle: This principle states that modules that are intended to be easy to change should not be depended on by modules that are harder to change. The component stability metric, called I (for instability), is computed in the following way: I = Outgoing dependencies / (Incoming dependencies + Outgoing dependencies). So the SDP can be restated as: the I metric of a component should be larger than the I metrics of the components that it depends on; that is, a component should depend only on more stable components.
  • (SAP) The Stable Abstractions Principle: For this principle, the author introduces a new metric called abstractness, defined as follows: A = Number of abstract classes and interfaces in the component / Total number of classes in the component. A value of 0 implies that the component has no abstract classes; a value of 1 implies that the component contains only abstract classes. The SAP sets up a relationship between stability (I) and abstractness (A) that takes the form of a graph (see the worked example below):
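A quick worked example of the two metrics (the numbers are invented): a component with 3 incoming and 1 outgoing dependency, containing 10 classes of which 7 are abstract:

```java
public class ComponentMetrics {
    public static void main(String[] args) {
        // Instability: I = outgoing / (incoming + outgoing)
        double incoming = 3, outgoing = 1;
        double instability = outgoing / (incoming + outgoing); // 0.25 -> fairly stable

        // Abstractness: A = abstract classes and interfaces / total classes
        double abstractClasses = 7, totalClasses = 10;
        double abstractness = abstractClasses / totalClasses;  // 0.7

        // SAP: stable components (low I) should also be abstract (high A),
        // i.e., components should sit near the "main sequence" A + I = 1
        double distance = Math.abs(abstractness + instability - 1); // 0.05, close
        System.out.printf("I=%.2f A=%.2f D=%.2f%n", instability, abstractness, distance);
    }
}
```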

Part V Architecture

This part of the book consists of 14 chapters (almost 120 pages) and treats different aspects of a good architecture: how to define appropriate boundaries and layers (the “Boundary Anatomy”, “Partial Boundaries”, “Layers and Boundaries”, and “The Test Boundary” chapters), how to make a system that is easy to understand, develop (the “The Clean Architecture” and “Presenters and Humble Objects” chapters), maintain, and deploy, and how to organize components and services (the “Screaming Architecture” chapter).

It would be very difficult to summarize 120 pages in a few sentences, but the most important takeaway is the list of characteristics of a system produced by a good architecture:

  • independent of any frameworks – the (technical) frameworks must be seen as tools, and the architecture should not depend on them (the “Screaming Architecture” chapter develops and argues this topic further).
  • testable – the business rules of the system should be testable without any external element.
  • independent of the UI – the UI can change without affecting the use cases of the system.
  • independent of the database – the business rules/ use cases should not be bounded to any database.

Clean Architecture

The golden rule for a clean architecture is: source code dependencies must point only inward, toward higher-level policies; an item in an inner circle should know nothing about the items in the outer circles (see the following image).
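A minimal sketch of the dependency rule in code (all names invented): the inner circle defines the interface it needs and the outer circle implements it, so the source-code dependency points inward even though control flows outward at runtime:

```java
// Inner circle: business rules know nothing about the database
interface MessageRepository {                 // defined by the use-case layer
    String findMessage(long id);
}

class ReadMessageUseCase {                    // higher-level policy
    private final MessageRepository repository;
    ReadMessageUseCase(MessageRepository repository) { this.repository = repository; }
    String read(long id) { return repository.findMessage(id); }
}

// Outer circle: the database detail implements the inner interface
class InMemoryMessageRepository implements MessageRepository {
    public String findMessage(long id) { return "message #" + id; }
}

public class CleanArchitectureSketch {
    public static void main(String[] args) {
        // The concrete detail is injected at the outermost layer ("main")
        ReadMessageUseCase useCase =
            new ReadMessageUseCase(new InMemoryMessageRepository());
        System.out.println(useCase.read(42));
    }
}
```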

For more information on the early concept of Clean Architecture, you can check Uncle Bob’s initial blog post: The Clean Architecture.

Part VI Details

The last part of the book tries to explain why some of the (technological) items used in IT projects, like the database, the UI technology, or the (technical) frameworks, should not influence/contaminate the system architecture and should always be positioned in the outer circle (see the previous image). This part also has a case study in which some of the rules and thoughts about architecture are put together and applied.

 

5 (software) security books that every (software) developer should read

I must admit that the title is a little bit catchy; a better title would have been “5 software security books that every developer should be aware of”. Depending on your interest, you might want to read these books entirely, or you could just know that they exist. There must be tons of software security books on the market, but this is my short list of the books that I think every developer interested in software security should be aware of.

Hacking – The Art of Exploitation This book explains the basics of different hacking techniques, especially non-web hacking techniques: how to find (and defend against) vulnerabilities like (stack-based) buffer overflows, how to write shellcode, and some basic concepts of cryptography and the attacks linked to it, like the man-in-the-middle attack on an SSL connection. The author tries to make the text accessible to non-technical people, but some programming experience (ideally C/C++) is required to get the best out of this book. You can see my full review of the book here.

Iron-Clad Java: Building Secure Web Applications This book presents hacking techniques and countermeasures for web applications; you can see this book as complementary to the previous one: the first covers non-web hacking techniques, this one covers (only) web hacking techniques: XSS, CSRF, how to protect data at rest, SQL injection, and other types of injection attacks. To get the most out of the book, some Java knowledge is required. You can see my full review of the book here.

Software Security: Building Security In This book explains how to introduce security into the SDLC: how to introduce abuse cases and security requirements in the requirements phase, and how to introduce risk analysis (also known as Threat Modeling) in the design and software qualification phases. I really think each software developer should at least read the first chapter of the book, where the authors explain why the old way of securing applications (seeing software applications as “black boxes” that can be protected using firewalls and IDS/IPS) cannot work anymore in today’s software landscape. You can see my full review of the book here: Part 1, Part 2 and Part 3.

The Tangled Web: A Guide to Securing Modern Web Applications This is another technical book about security in which you will not see a single line of code (Software Security: Building Security In is another one), but it is highly instructive, especially if you are a web developer. The book presents all the “bricks” of today’s Internet: HTTP, WWW, HTML, cookies, scripting languages, how these bricks are implemented in different browsers, and especially how the browsers implement security mechanisms against rogue applications. You can see my full review of the book here.

Threat Modeling: Designing for Security Threat modeling techniques (also known as Architectural Risk Analysis) have been around for some time, but what has changed in recent years is the accessibility of these techniques for software developers. This book is one of the reasons why threat modeling is now accessible to developers. The book is very dense, but it assumes no prior knowledge of the subject. If you are interested in the threat modeling topic, you can check this post: threat modeling for mere mortals.

Book Review: The Tangled Web: A Guide to Securing Modern Web Applications

This is a review of the book The Tangled Web: A Guide to Securing Modern Web Applications.

(My) Conclusion

This book does a great job explaining how the “bricks” of the Internet (HTTP, HTML, WWW, cookies, scripting languages) work (or don’t) from the security point of view. It also gives a very systematic coverage of browser (in)security, even if some of the information is starting to become outdated. The book’s audience is web developers who are interested in the inner workings of browsers in order to write more secure code.

Chapter 1 Security in the world of web applications

The goal of this chapter is to set the scene for the rest of the book. The main ideas revolve around the fact that security is a non-algorithmic problem and that the best ways to tackle security problems are very empirical: learning from mistakes, developing tools to detect and correct problems, and planning to have everything compromised.

Another part of the chapter is dedicated to the history of the web, because for the author it is very important to understand the history behind the well-known “bricks” of the Internet (HTTP, HTML, WWW) in order to understand why they are completely broken from the security point of view. For a long time the evolution of Internet standards was dominated by vendors or stakeholders who did not care much about the long-term prospects of the technology; see the Wikipedia Browser Wars page for a few examples.

Part I Anatomy of the web (Chapters 2 to 8)

The first part of the book is about the building blocks of the web: the HTTP protocol, the HTML language, CSS, the scripting languages (JavaScript, VBScript), and the external browser plug-ins (Flash, Silverlight). For each of these building blocks, the author presents how they are implemented, how they work (or don’t) in different browsers, what standards are supposed to drive their development, and how these standards are very often incomplete or oblivious to security requirements.

In this part of the book the author speaks only briefly about the security features, since the second part of the book is supposed to focus on security.

Part II Browser security features (Chapters 9 to 15)

The first security feature presented is the SOP (Same-Origin Policy), which is also the most important mechanism protecting against hostile applications. The SOP behaviour is presented for DOM documents, for XMLHttpRequest, and for WebStorage, along with how the security policies for cookies can also impact the SOP.

A less-known topic that is treated is SOP inheritance: how the SOP is applied to pseudo-URLs like about:, javascript: and data:. The conclusion is that each browser treats SOP inheritance in a different (and potentially incompatible) way, and that it is preferable to create new frames or windows by pointing them to a server-supplied blank page with a definite origin.

Other less-known browser features (that can affect security) are explained in depth: the way browsers recognize the content of a response (a.k.a. content sniffing), navigation to sensitive URI schemes like “javascript:”, “vbscript:”, “file:”, “about:” and “res:”, and the way browsers protect themselves against rogue scripts (in the case of rogue-script protection, the author points out the inefficiency of the protections).

The last part is about the different mechanisms browsers use to give special privileges to some specific web sites; the mechanisms explained are the form-based password managers, the hard-coded domain names, and the Internet Explorer zone model.

Part III Glimpse of things to come (Chapters 16 to 17)

This part is about the developments being made by the industry to enhance the security of browsers.

For the author there are two ways browser security could evolve: extend the existing frameworks, or restrict them by creating new boundaries on top of the existing browser security model.

For the first alternative, the following solutions are presented: the W3C Cross-Origin Resource Sharing specification, the Microsoft response to CORS called XDomainRequest (which, by the way, was deprecated by Microsoft), and the W3C Uniform Messaging Policy.

For the second alternative, the following solutions are presented: the W3C (formerly Mozilla) Content Security Policy, (WebKit) sandboxed frames, and Strict Transport Security.

The last part is about how newly planned APIs and features could impact browser and application security. Briefly explained are “binary HTTP”, WebSocket (which was not yet a standard when the book was written), JavaScript offline applications, and P2P networking.

Chapter 18 Common web vulnerabilities

The last chapter is a nomenclature of different known vulnerabilities, grouped by the place where they can happen (server side, client side). For each item a brief definition is given, and links are provided to the previous chapters where the item was discussed.

Book Review: Practical Anonymity: Hiding in Plain Sight Online

This is a review of the book Practical Anonymity: Hiding in Plain Sight Online.

Conclusion

This is not a technical book about the inner workings of Tor or Tails, and I think a better title would be “How to use Tor and Tails for dummies”. Almost all the information present in the book can be found in the official documentation; the only positive point is that all the needed information is gathered in a single place.

Chapter 1. Anonymity and Censorship Circumvention

This first chapter is an introduction to what online anonymity is, why it is important for some people, and how it can be achieved using Tor. The chapter also contains the fundamentals of how Tor works, what it can do for online anonymity, and some advice about how to use it safely.

Chapter 2. Using the Tor Browser Bundle

The chapter presents the TBB (Tor Browser Bundle) in detail. The TBB is composed of three components: Vidalia, which is the control panel for Tor; the Tor software itself; and the Mozilla Firefox browser. Each of these three components is described from the user’s point of view, and the possible configuration options are presented in detail.

Chapter 3. Using Tails

Tails is a Linux distribution that includes Tor and other software to provide an operating system that enhances privacy. The Tails network stack has been modified so that all Internet connectivity is routed through the Tor network.

In order to enhance privacy, Tails is delivered with the following packages:

  • Firefox
  • Pidgin
  • GNU Privacy Guard
  • Metadata Anonymization Toolkit
  • Unsafe Web Browser

Detailed instructions are given on how to create a bootable DVD and a bootable USB stick, and how to run and configure the operating system. The persistent storage feature of Tails is presented in detail so that the reader can understand its benefits and drawbacks.

Chapter 4. Tor Relays, Bridges and Obfsproxy

The chapter is about how Tor’s adversaries can disrupt the network and how the Tor developers try to find new techniques to work around these disruptions.

One way to block a user’s access to the Tor network is to filter the (nine) Tor directory authorities, the servers that distribute information about active Tor entry points. One way to avoid this restriction is to use Tor bridge relays. A bridge relay is like any other Tor transit relay; the only difference is that it is not publicly listed, and it is used only for entering the Tor network from places where the public Tor relays are blocked. There are different mechanisms to retrieve the list of bridge relays, like a web page on the Tor website or a request by email.

Another way to disrupt the Tor network is to filter the Tor traffic, knowing that Tor protocol packets have a distinguishable signature. One way to avoid this packet filtering is to conceal the Tor packets inside other kinds of packets. The framework that can be used to implement this kind of functionality is called Obfsproxy (obfuscated proxy). Some of the plug-ins that use Obfsproxy: StegoTorus, Dust, SkypeMorph.

Chapter 5. Sharing Tor Resources

This chapter describes how a user can share his bandwidth by becoming a Tor bridge relay, a Tor transit relay, or a Tor exit relay. Detailed settings descriptions are given for each type of relay, along with the risks incurred by the user.

Chapter 6. Tor Hidden Services

A (Tor) hidden service is a server that can be accessed by other clients within the anonymity network while the actual location (IP address) of the server remains anonymous. The hidden service protocol is briefly presented, followed by how to set up a hidden service. For setting up a hidden service, the main takeaways are:

  • install Tor and the service that you need on a VM.
  • run the VM on a VPS (virtual private server) hosted in a country with privacy-friendly legislation in place.
  • the VM can/should be encrypted, be able to be power-cycled, and have no way to know the IP address or domain name of the computer on which it is running.

Chapter 7. Email Security and Anonymity Practices

This last chapter is about email anonymity in general and how the use of Tor can improve it. The main takeaways:

  • choose an email provider that does not require another email address or a mobile phone number.
  • choose an email provider that supports HTTPS.
  • encrypt the content of your emails.
  • ALWAYS register and connect to the mailbox using Tor.