Java Concurrency in Practice by Joshua Bloch

February 23, 2015

20 min read

Summary of chapters 1-5
Chapter 1 Introduction
Chapter 2 Thread safety
Chapter 3 Sharing objects
Chapter 4 Composing objects
Chapter 5 Building blocks
Chapter 6 Task Execution
- 6.1 Executing tasks in threads
- 6.2 Executor Framework

Summary of chapters 1-5

Less mutable state, easier it is ensure thread safety. Make fields final unless they need to be mutable
Immutable objects are thread-safe. They can be shared freely without locking or copying
Guard each mutable variable with a lock
Guard all variables in an invariant with the same lock
Hold locks for the duration of compound actions
A program that accesses a mutable variable from multiple threads without synchronization is a broken program.
Don't rely on clever reasoning on why you don't need to synchronize
Document your sychnronization policy.

Chapter 1 Introduction

Operating systems provide a way to run multiple programs as isolated processes. The OS provides memory, file handlers, security credentials. Processes communicate with each other using semaphores, shared memory, signals, files and sockets. Having the OS do this is a good idea because

better resource consumption. Other programs can run when one is blocked on IO
Its fair to other users who want to execute something
its more convenient to have multiple programs that specialise instead of having one program that does everything.

Threads are the basic unit of scheduling. They execute simultaneously and asynchronously with respect to each other. The threads of one program can access the same variables and allocate objects to the same heap. Without explicit synchronization, unpredictable things will happen.

Threads allow the user to take advantage of multiple processors
It is easier to model a system, because it can be made of multiple parts, each of which is synchronous.
simplified handling of asynchronous events. One-thread-per-client makes things simple.
better GUI handling

BUT

Safety hazards. The compiler, hardware and runtime are allowed leeway to reorder instructions, cache variables in registers or processor-local caches where they are invisible to other threads. These tricks improve performance, but introduce safety hazards. Race conditions occur becaue we don't know what order instructions may be executed in.
Liveness hazards. Deadlock, livelock and starvation. Covered in chapter 10
Performance hazards

Chapter 2 Thread safety

An object's state is stored in its instance and static variables. These need to be protected from concurrent access. 3 methods for doing so

not sharing state variables across threads (how?)
making them immutable
make the methods that access them synchronized When designing classes, 1. Encapsulation, 2. Clear specification of invariants and 3. Immutability are important.

2.1 Thread safety

A class is thread safe when it behaves correctly when accessed by multiple threads, regardless of scheduling or interleaving of those threads by the runtime and with no additional sychronization/coordination on the part of the calling code. Correctness of a program broadly means that its invariants and post conditions are not violated.

It is possible to have a thread safe program that is made of thread unsafe classes and vice versa. Thread safety is a term applied to code but it is about state. It can be applied to a body of code that encapsulates its state, whether that's a class or a program.

A class that does not have state is thread safe. A method that does not access the class' state and only allocates variables on the thread's stack is thread safe.

2.2 Atomicity

A data race is not the same as a race condition. The existence of one does not necessarily mean the existence of the other. Many race conditions are due to data races and many data races lead to race conditions A data race occurs when 2 threads access the same memory location at the same time without synchronization, one of which does a write operation. This is a data race and can be detected automatically.

Example from http://blog.regehr.org/archives/490. This is pseudocode.

Data race & Race condition

transfer1(amount, account_from, account_to) {
	if (account_to.balance &lt; amount) return NOPE
	account_from += amount
	account_to -= amount
	return YES
}

Race condition

A key invariant (conservation of money) is broken

transfer2(amount, account_from, account_to) {
	atomic {
		if (account_to.balance &lt; amount) return NOPE
	}
	atomic {
		account_from += amount
	}
	atomic {
		account_to -= amount
	}
	return YES
}

Neither

transfer3(amount, account_from, account_to) {
	atomic {
		if (account_to.balance &lt; amount) return NOPE
		account_from += amount
		account_to -= amount
		return YES
	}
}

Data Race

transfer4(amount, account_from, account_to) {
	account_from.activity = true;
	account_to.activity = true;
	atomic {
		if (account_to.balance &lt; amount) return NOPE
		account_from += amount
		account_to -= amount
		return YES
	}
}

Important to note that examples 2 and 4 don't contain data races but are still awful.

2.3 Locking

If your object's state is expressed in terms of threadsafe objects like AtomicLong, it is easier to reason about and verify thread safety. To preserve state consistency, update related state variables in a single atomic operation.

A synchronized block has 2 parts - a reference to an object that will serve as the lock and the block of code that it guards. A thread that requests a lock that it already owns is automatically granted it (reentrancy)

Instrinsic locks - every object can act as a lock for the purpose of synchronization.

2.4 Guarding state with locks

The combination of 2 synchronized methods is not necessarily atomic.

Locks are necessary for all accesses, not just writes.

Every shared, mutable variable must be guarded by exactly one lock. For each invariant that depends on more than one variable, all the variables must be guarded by the same lock.

Too much synchronization leads to performance issues. A balance is needed between simplicity (synchronizing an entire method) and performance (synchronizing select lines). Simplicity should not be sacrificed early on.

Avoid holding locks while doing operations that might take a long while.

Synchronization is about more than atomicity or simply demarcating critical areas ("mutual exclusion")

Memory visibility - when one thread mutates the state of an object, the other threads should be able to see the changes that were made. This is the key to ensure that objects can be shared and published in such a way that multiple threads can access them.

Without adequate synchronization, the thread reading the state might receive stale data.

Data that has not been set will default to some value that was placed there by some thread. This safety is called out-of-thin-air safety. non-volatile double and long don't have this safety, but everything else does.

3.1 Volatile

Variables that are declared volatile are treated differently by the compiler and runtime. They are not put in registers or caches (and thus invisible from other threads) so a read always returns the most recent value written to it. Also, operations on it are not reordered. Threads don't block on it though, so its a weaker form of synchronization.

Writing to a volatile variable is like exiting a synchronized block, and reading from it is like entering one. Using this property to write concurrent programs is not recommended if the code requires subtle reasoning to verify correctness. Use it when its straightforward. The following can be done with synchronized also but volatile makes more sense.

volatile boolean asleep;
while (!alseep) {
	doSomething();
}

volatile provides memory visibility. synchronized provides atomicity and memory visibility. You can use volatile in the following cases

Writing the variable does not depend on its current state
It does not participate in any other invariants
Locking is not required for any other reason while the variable is being accessed.

3.2 Publication

Allowing variables to be accessed from outside is said to be publishing. If its published prematurely or incorrectly, its said to be "escape".

A way for the this variable to escape is by attaching it to an EventSource. If it escapes before its constructor is finished, it is an incompletely constructed object. Another way is if a thread is started in the constructor. The thread might get the reference explicitly (through the constructor) or implicitly (because its an inner class) and starting it means that it would be alive and have a reference to this object that is not fully constructed. It should be started in an init method, not in the constructor.

3.3 Thread confinement

Limit the number of threads to 1. Voila, no synchronization issues. Example - the visual components and data object models in Swing are not thread safe, but they can't be accessed directly. Swing has one thread that can access them all and you can schedule Runnables on it.
Thread pools of thread-unsafe objects (say database Connection objects) become thread safe if they're allocated to only one thread at a time.
Stack confinement - local variables in a method are allocated on that thread's stack. Its not possible for multiple threads to access local primitive variables. References to local objects might leak.
ThreadLocal - makes a separate copy for each thread. Useful when each thread needs its own buffer or copy of a db connection.

They also mentioned "ad-hoc confinement"

3.4 Immutability

Immutable objects are thread safe - they can be used by any thread without additional synchronization, even when no synchronization is used to publish them. An object is immutable if all its fields are final, they are all intialised in the constructor and a reference to this does not escape during construction.

final is similar to C++ const. Declaring it as final guarantees initialisation safety that allows it to be shared and passed around. Variables should be private and final by default unless needed otherwise.

3.5 Safe publishing

An object can be published safely if

Its initialized in a static manner. (Static initializers are run by the JVM at class initialization time)
Its stored in an AtomicReference or volatile variable
or in a final field of a properly constructed object
or in a field properly guarded by a lock

Point 4 means that storing the object in a thread safe collection will take care of synchronization. If thread A stores an object in the collection and B retrieves it, B is guaranteed to see it as A left it. Examples :

Placing a key or value in Hashtable, SynchronizedMap, concurrentMap will allow any thread to retrieve it directly or via an iterator
vector, CopyOnWriteArrayList, CopyOnWriteArraySet, sychronizedList, synchronizedSet
BlockingQueue, ConcurrentLinkedQueue

When sharing a mutable object, its state must be guarded by a lock or it must be thread safe.

To recap, sharing between threads can be done if the object fulfills one of

Its state is thread confined
Its immutable or effectively immutable (shared read-only)
It performs synchronization internally (shared threadsafe)
Its guarded

Chapter 4 Composing objects

4.1 Designing a thread safe class

Identify the variables that define the object's state
Identify the invariants that constrain the variables
Establish a policy to allow access to the variables such that the invariants remain unchanged

To ensure thread safety, first understand the invariants and post conditions. The state variables may have only certain valid states or state transitions. This can create atomicity and encapsulation requirements when changing the variables.

Interesting - in some cases a thread needs to wait to perform an action - until a certain state is triggered. Another thread could change the variable needed for this. How to wait efficiently?

Who owns the object? It can be owned by one object or ownership can be shared (like in some collections classes)

4.2 Instance confinement

Encapsulating data within an object is done so that the data can only be accessed via the object's methods and in the right way - with appropriate locks etc. This makes it easier to check the (now few) code paths that access the state

Monitor pattern - all of the object's state is protected by the object's intrinsic lock.

It is possible to use a private lock (new Object()). The client code cannot acquire the lock this way. Clients can't participate in the synchronization policy - for better or worse

4.3 Delegating thread safety

Does an object composed of thread safe object require additional synchronization? It depends.

If an object is composed of thread safe objects and its invariants do not impose additional constraints on the state (ie, invalid states or state transitions), then no additional synchronization is needed and thread safety can be delegated to those objects.

4.4 Adding functionality to a thread safe class

4.5 Documenting synchronization policies

Document a module/object's thread safety guarantees for its users and its synchronization policy for its maintainers. Write it down before you forget :P. If you want to limit client side locking, say so. The client does not need to know implementation details, but future maintainers do

Chapter 5 Building blocks

5.1 Synchronized collections

These collections are thread safe but you might need extra locking to guard compound actions. Iteration, navigation and put-if-absent are all compound operations. In these cases, the collections object itself is what we lock on.

public static Object getLast(Vector list) {
    synchronized (list) {
        int lastIndex = list.size() - 1;
        return list.get(lastIndex);
    }
}
public static void deleteLast(Vector list) {
    synchronized (list) {
        int lastIndex = list.size() - 1;
        list.remove(lastIndex);
    }
}

This block might raise an ArrayIndexOutOfBoundsException if the vector has grown smaller. This does not mean the vector isn't thread safe, it is. The state of the vector is valid and its behaviour is in conformance with the spec.

for (int i = 0; i &lt; vector.size(); i++)
    doSomething(vector.get(i));

When you iterate over a collection and it is modified from another thread, it will raise a ConcurrentModificationException. This is because the iterator from for-each is fail fast, ie, it will just throw this if it detects the collection has been modified.

We cannot lock the whole collection while we are iterating over it. This will have big performance implications. A related solution is to make a copy of the collection when we want to iterate over it.

We must take care to use sychronization wherever we iterate the collection. In some cases the iterator is hidden. For example, logging a collection will call its toString() method which iterates over the collection and this may raise the dreaded exception.

5.2 Concurrent collections

Synchronized collections achieve their thread safety by serialising all access to the collection's state. This is poor concurrency. When multiple threads contend for the collection wide lock, performance suffers.

Replacing synchronized collections with their concurrent brethren can offer significant scalability improvements.

ConcurrentHashMap replaces Collections.synchronizedMap(). supports put􏰁if􏰁absent, replace, and conditional remove.

CopyOnWriteArrayList replaces Collections.synchronizedList(). Useful in cases where traversal is the dominant operation

ConcurrentSkipListMap and ConcurrentSkipListSet are concurrent replacements for a synchronized SortedMap or SortedSet (such as Tree Mapor Tree SetwrappedwithsynchronizedMap).

map.get() or list.contains() operations might be extremely complex. Traversing a hash bucket to find a particular object and then calling equals() is expensive. If the list is long, then this might take a long time. It is not feasible for one thread to hog access to the collection while this happens.

ConcurrentHashMap uses a different locking strategy that offers better concurrency and stability. It uses "lock striping" (see 11.4). Arbitrarily many threads can read and a limited number of writers can modify concurrently. Iterators on this do not throw ConcurrentModificationException. They are weakly consistent instead of fail-fast. It will continue iterating upon modification and may or may not reflect the changes that occurred since the construction of the iterator. size() and isEmpty() operate on the entire map, so their results are only an estimates. That's not a big deal. The important operations like put(), get(), containsKey() and delete() are optimised. You can no longer acquire a lock on the entire map (unlike Hashtable or synchronizedMap) so you cannot use client-side locking to create new atomic operations.

5.3 Blocking queues and producer consumer pattern

PC pattern is basically Go channels

LinkedBlockingQueue and ArrayBlockingQueue are FIFO queues like LinkedList and ArrayList but better concurrent performance than a synchronized list

SynchronousQueue isn't really a queue, its a collection of threads. When a consumer is ready, the producer hands over the value to it directly, lowering latency.

Deque ("deck") is used to implement a work-stealing algorithm. Each consumer has its own deque that it reads from. In case a consumer is done with its deque, it reads from the tail of some other consumer. There is minimal contention in this case.

5.4 Blocking and interruptable methods

When a thread blocks it is placed in BLOCKED, WAITING or TIMED_WAITING states. When the event it is waiting on occurs, its is placed back in RUNNABLE state and becomes eligible for scheduling.

When a method throws a checked InterruptedException, it means that it contains a blocking operation.

What to do with that exception? If you're a library, propagate it. If you're a Runnable and therefore have no control of the call stack, swallow it.

5.5 Synchronizers

Blocking queues are unique amongst collections in that they hold objects AND they coordinate the control flow of threads since put() and take() are blocking. Thus they are also synchronizers.

Synchronizers are objects that coordinate the control flow of threads based on their state. Examples - latch, semaphore, barrier

A latch is like a gate that all threads block on until some condition is met/event occurs.

A semaphore is like a pool of permits that are handed out to those that call acquire(). In case the pool is empty, that call blocks until a thread holding a permit calls release().

A barrier is like a latch, except that the event it is waiting for is that the specified number of threads reach block on the barrier.

FutureTask also acts as a latch. It is implemented with a Callable, the result bearing counterpart of Runnable. It can be in one of three states - waiting to run, running, or finished. If it is finished, get() return immediately. Else it blocks.

5.6 Building a scalable, efficient result cache.

Let's start with this:

public interface Computable&lt;A, V&gt; {
    V compute(A arg) throws InterruptedException;
}
public class ExpensiveFunction implements Computable&lt;String, BigInteger&gt; {
    public BigInteger compute(String arg) {
        // after deep thought...
        return new BigInteger(arg);
	} 
}
public class Memorizer1&lt;A, V&gt; implements Computable&lt;A, V&gt; {
    @GuardedBy(&quot;this&quot;)
    private final Map&lt;A, V&gt; cache = new HashMap&lt;A, V&gt;();
    private final Computable&lt;A, V&gt; c;
    public Memorizer1(Computable&lt;A, V&gt; c) {
        this.c = c;
	}
    public synchronized V compute(A arg) throws InterruptedException {
        V result = cache.get(arg);
        if (result == null) {
            result = c.compute(arg);
            cache.put(arg, result);
        }
        return result;
    }
}

Problem: Only one thread can call compute at any given time. That's not scalable at all.

Solution: Change compute() in Memorizer1 to not being synchronized. Instead, use a thread-safe ConcurrentHashMap.

Next problem: Multiple threads might start an expensive computation if one is already in progress but not in cache

Solution: Change the cache from Map<A, V> to Map<A, FutureTask> so that the second thread that asks will block on the get() of the Future in the cache until it is complete.

Next problem: Cache pollution

Solution: In the catch blocks of compute(), clear the cache of that FutureTask.

public class Memorizer&lt;A, V&gt; implements Computable&lt;A, V&gt; {
    private final ConcurrentMap&lt;A, Future&lt;V&gt;&gt; cache = new ConcurrentHashMap&lt;A, Future&lt;V&gt;&gt;();
    private final Computable&lt;A, V&gt; c;
    public Memorizer(Computable&lt;A, V&gt; c) { this.c = c; }
    
    public V compute(final A arg) throws InterruptedException {
        while (true) {
            Future&lt;V&gt; f = cache.get(arg);
            if (f == null) {
                Callable&lt;V&gt; eval = new Callable&lt;V&gt;() {
                    public V call() throws InterruptedException {
                        return c.compute(arg);
                    }
                };
                FutureTask&lt;V&gt; ft = new FutureTask&lt;V&gt;(eval);
                f = cache.putIfAbsent(arg, ft);
                if (f == null) { f = ft; ft.run(); }
			  } 
			  try {
                return f.get();
            } catch (CancellationException e) {
                cache.remove(arg, f);
            } catch (ExecutionException e) {
                throw launderThrowable(e.getCause());
            }
		} 
	}
}

Next problem: Cache expiry

Solution: left as an exercise for the reader :P

The cool thing here is that the cache implements the same interface Computable just like class doing the computation, so its transparent to the user of this class. Its used like:

@ThreadSafe
public class Factorizer implements Servlet {
    private final Computable&lt;BigInteger, BigInteger[]&gt; c =
        new Computable&lt;BigInteger, BigInteger[]&gt;() {
            public BigInteger[] compute(BigInteger arg) {
                return factor(arg);
		 } 
	 };
    private final Computable&lt;BigInteger, BigInteger[]&gt; cache = new Memorizer&lt;BigInteger, BigInteger[]&gt;(c);
    
    public void service(ServletRequest req, ServletResponse resp) {
        try {
            BigInteger i = extractFromRequest(req);
            encodeIntoResponse(resp, cache.compute(i));
        } catch (InterruptedException e) {
            encodeError(resp, &quot;factorization interrupted&quot;);
		 } 
	 }
}

Chapter 6 Task Execution

Your program should be broken into independent tasks. If the task boundaries are chosen well, then tasks will represent a small fraction of your application's processing capability and it will give greater flexibility in scheduling. This + a sensible task execution policy is required. Example of a task boundary - each user request.

6.1 Executing tasks in threads

class SingleThreadWebServer {
	public static void main(String[] args) throws IOException {
		ServerSocket socket = new ServerSocket(80);
		while (true) {
			Socket connection = socket.accept();
			handleRequest(connection);
		}
	} 
}

This performs poorly because the single thread can block on reading from or writing to the connection. When blocked it cannot service new requests and the CPU remains idle as well.

As an improvement, a new thread is created for each incoming request. handleRequest(connection) becomes

Runnable r = new Runnable() {
	@Override
	public void run() {
		handleRequest(connection);
	}
};
new Thread(r).start();

This leads to poor resource management. The server has no idea what the optimal number of threads is. Creating too many would use up too much memory, putting pressure on the gc. If there are enough threads to keep all CPUs busy, there might not be a benefit to creating more, only harm.

6.2 Executor Framework

This is the primary abstraction for task execution, not threads. It supports a variety of execution policies. The implementations of Executor also provide lifecycle support and hooks for adding statistics gathering, application management and monitoring.

Producers - Activities that submit tasks. Consumers - Threads which execute the tasks. Executor is the easiest way to implement the PC pattern.

public interface Executor {
	void execute(Runnable task);
}

The executor can be configured. If necessary, these questions can be answered based on hardware available. Separating this from the code that submits tasks is a good idea.

In what thread will the task be executed?
In what order will the tasks be executed?
How many tasks can run concurrently?
How many tasks can be queued?
If a task has to be discarded from the queue, which one is it?
What should be done before and after execution?

Thread pools - its a good idea because the cost of setup and teardown is amortized over several threads. The latency to get a thread to execute your task is lower. Number of threads should be high enough to keep all CPUs busy, but not so much that the program runs out of memory or thrashes due to competition between threads for resources. Examples of Executors based on thread pools

newFixedThreadPool - create threads as necessary till a limit is reached, and tries to stay at that limit.
newCachedThreadPool - kills threads that are idle and adds threads when necessary. No fixed limit.
newSingleThreadExecutor - single thread that executes tasks based on LIFO, FIFO or priority

ExecutorService extend Executor. A service can be running, shutting down and terminated. This allows for graceful shutdown of programs.

public interface ExecutorService extends Executor {
	void shutdown();
	List&lt;Runnable&gt; shutdownNow();
	boolean isTerminated;
	boolean isShutdown;
	boolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;
	// convenience methods for submission of tasks
}

Don't use Timer. Just don't.