Effective Java Programming Language Guide - Bloch J..pdf

Effective Java: Programming Language Guide

Chapter 9. Threads

Threads allow multiple activities to proceed concurrently in the same program. Multithreaded programming is more difficult than single-threaded programming, so the advice of Item 30 is particularly applicable here: If there is a library class that can save you from doing low-level multithreaded programming, by all means use it. The java.util.Timer class is one example, and Doug Lea's util.concurrent package [Lea01] is a whole collection of high-level threading utilities. Even if you use such libraries where applicable, you'll still have to write or maintain multithreaded code from time to time. This chapter contains advice to help you write clear, correct, well-documented multithreaded programs.
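As a rough illustration of leaning on a library class instead of hand-written thread code, the sketch below schedules a one-shot task with java.util.Timer. The class name TimerDemo and the helper runOnce are hypothetical, and the CountDownLatch used to observe the task comes from java.util.concurrent, which postdates the util.concurrent package cited above; treat this as one possible shape, not a prescribed one.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; only Timer and TimerTask are the library pieces under discussion.
class TimerDemo {
    // Schedules a one-shot task and reports whether it ran within the timeout.
    static boolean runOnce(long delayMillis, long timeoutMillis) {
        Timer timer = new Timer(true); // daemon timer thread, so it cannot keep the JVM alive
        CountDownLatch ran = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                ran.countDown(); // signal the waiting caller
            }
        }, delayMillis);
        try {
            return ran.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // release the timer thread
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran: " + runOnce(50, 2000));
    }
}
```

The timer owns the background thread and the scheduling loop, so the calling code contains no explicit thread management of its own.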

Item 48: Synchronize access to shared mutable data

The synchronized keyword ensures that only a single thread will execute a statement or block at a time. Many programmers think of synchronization solely as a means of mutual exclusion, to prevent an object from being observed in an inconsistent state while it is being modified by another thread. In this view, an object is created in a consistent state (Item 13) and locked by the methods that access it. These methods observe the state and optionally cause a state transition, transforming the object from one consistent state to another. Proper use of synchronization guarantees that no method will ever observe the object in an inconsistent state.

This view is correct, but it doesn't tell the whole story. Not only does synchronization prevent a thread from observing an object in an inconsistent state, but it also ensures that objects progress from consistent state to consistent state by an orderly sequence of state transitions that appear to execute sequentially. Every thread entering a synchronized method or block sees the effects of all previous state transitions controlled by the same lock. After a thread exits the synchronized region, any thread that enters a region synchronized by the same lock sees the state transition caused by that thread, if any.
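To make both roles of synchronization concrete, here is a minimal sketch built around a hypothetical Interval class whose invariant is that hi - lo stays constant. Every name in it is an assumption made for illustration. Because both methods lock the same object, a reader should never see a half-applied shift, and each reader sees the effects of all earlier shifts.

```java
// Hypothetical class: a range whose width (hi - lo) must never change.
class Interval {
    private int lo;
    private int hi;

    Interval(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // State transition: both fields change while the lock is held,
    // so no other synchronized method can run between the two updates.
    synchronized void shift(int delta) {
        lo += delta;
        hi += delta;
    }

    // Observation: takes the same lock, so it sees a consistent pair of fields.
    synchronized int width() {
        return hi - lo;
    }
}

class IntervalDemo {
    // Shifts the interval in one thread while the caller samples width();
    // returns the number of samples that violated the invariant.
    static int countInconsistentReads(int iterations) {
        Interval interval = new Interval(0, 10);
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                interval.shift(1);
            }
        });
        writer.start();
        int bad = 0;
        for (int i = 0; i < iterations; i++) {
            if (interval.width() != 10) {
                bad++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return bad;
    }
}
```

If the synchronized modifiers were removed, nothing would prevent width() from running between the two assignments in shift, and the count of inconsistent reads could be nonzero.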

The language guarantees that reading or writing a single variable is atomic unless the variable is of type long or double. In other words, reading a variable other than a long or double is guaranteed to return a value that was stored into that variable by some thread, even if multiple threads modify the variable concurrently without synchronization.

You may hear it said that to improve performance, you should avoid the use of synchronization when reading or writing atomic data. This advice is dangerously wrong. While the atomicity guarantee ensures that a thread will not see a random value when reading atomic data, it does not guarantee that a value written by one thread will be visible to another: Synchronization is required for reliable communication between threads as well as for mutual exclusion. This is a consequence of a fairly technical aspect of the Java programming language known as the memory model [JLS, 17]. While the memory model is likely to undergo substantial revision in an upcoming release [Pugh01a], it is a near certainty that this fact will not change.

The consequences of failing to synchronize access to a shared variable can be dire even if the variable is atomically readable and writable. Consider the following serial number generation facility:


// Broken - requires synchronization!
private static int nextSerialNumber = 0;

public static int generateSerialNumber() {
    return nextSerialNumber++;
}

The intent of this facility is to guarantee that every invocation of generateSerialNumber returns a different serial number, as long as there are no more than 2^32 invocations. Synchronization is not required to protect the invariants of the serial number generator because it has none; its state consists of a single atomically writable field (nextSerialNumber), and all possible values of this field are legal. However, the method does not work without synchronization. The increment operator (++) both reads and writes the nextSerialNumber field so it is not atomic. The read and write are independent operations, performed in sequence. Multiple concurrent threads can thus observe the nextSerialNumber field with the same value and return the same serial number.
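The failure is easier to see when the increment is written out as its separate read and write steps. The sketch below is a single-threaded replay of one possible bad interleaving, not real concurrent code. The class LostUpdateReplay and the "thread A" and "thread B" labels are hypothetical and exist only to show how two callers can end up with the same serial number.

```java
// Hypothetical single-threaded replay of one bad interleaving of nextSerialNumber++.
class LostUpdateReplay {
    static int nextSerialNumber = 0;

    // Returns the serial numbers handed to "thread A" and "thread B", in that order.
    static int[] replay() {
        nextSerialNumber = 0;
        int readByA = nextSerialNumber;   // A performs the read half of ++
        int readByB = nextSerialNumber;   // B reads before A has written
        nextSerialNumber = readByA + 1;   // A performs the write half of ++
        nextSerialNumber = readByB + 1;   // B overwrites with the same value
        return new int[] { readByA, readByB };
    }
}
```

Both callers receive the same serial number, and the field advances by one rather than two, so one of the increments is lost.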

More surprisingly, it is possible for one thread to call generateSerialNumber repeatedly, obtaining a sequence of serial numbers from zero to n, after which another thread calls generateSerialNumber and obtains a serial number of zero. Without synchronization, the second thread might see none of the updates made by the first. This is a result of the aforementioned memory model issue.

Fixing the generateSerialNumber method is as simple as adding the synchronized modifier to its declaration. This ensures that multiple invocations won't be interleaved and that each invocation will see the effects of all previous invocations. To bulletproof the method, it might also be wise to use long instead of int or to throw an exception if nextSerialNumber were about to wrap.
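A minimal sketch of the repaired generator follows. It adds the synchronized modifier and, as one of the two hardening options mentioned above, switches the field to long. The enclosing class SerialNumbers and the countDistinct helper are hypothetical names added only so that the method can be exercised from several threads.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder class for the synchronized generator.
class SerialNumbers {
    private static long nextSerialNumber = 0;

    // synchronized makes the read-increment-write sequence indivisible and
    // ensures each call sees the effects of all previous calls.
    static synchronized long generateSerialNumber() {
        return nextSerialNumber++;
    }

    // Calls the generator from several threads and returns how many
    // distinct serial numbers were handed out.
    static int countDistinct(int threads, int callsPerThread) {
        Set<Long> seen = Collections.synchronizedSet(new HashSet<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    seen.add(generateSerialNumber());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }
}
```

With the modifier in place, the number of distinct values should equal the total number of calls; without it, duplicates are possible and the count may fall short.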

Next, consider the process of stopping a thread. While the platform provides methods for involuntarily stopping a thread, these methods are deprecated because they are inherently unsafe—their use can result in object corruption. The recommended method of stopping a thread is simply to have the thread poll some field whose value can be changed to indicate that the thread is to stop itself. The field is typically a boolean or an object reference. Because reading and writing such a field is atomic, some programmers are tempted to dispense with synchronization when accessing the field. Thus it is not uncommon to see code that looks like this:

// Broken - requires synchronization!
public class StoppableThread extends Thread {
    private boolean stopRequested = false;

    public void run() {
        boolean done = false;

        while (!stopRequested && !done) {
            ... // do what needs to be done.
        }
    }

    public void requestStop() {
        stopRequested = true;
    }
}


The problem with this code is that in the absence of synchronization, there is no guarantee as to when, if ever, the stoppable thread will “see” a change in the value of stopRequested that was made by another thread. As a result, the requestStop method might be completely ineffective. Unless you are running on a multiprocessor, you are unlikely to observe the problematic behavior in practice, but there are no guarantees. The straightforward way to fix the problem is simply to synchronize all access to the stopRequested field:

// Properly synchronized cooperative thread termination
public class StoppableThread extends Thread {
    private boolean stopRequested = false;

    public void run() {
        boolean done = false;

        while (!stopRequested() && !done) {
            ... // do what needs to be done.
        }
    }

    public synchronized void requestStop() {
        stopRequested = true;
    }

    private synchronized boolean stopRequested() {
        return stopRequested;
    }
}

Note that the actions of each of the synchronized methods are atomic: The synchronization is being used solely for its communication effects, not for mutual exclusion. It is clear that the revised code works, and the cost of synchronizing on each iteration of the loop is unlikely to be noticeable. That said, there is a correct alternative that is slightly less verbose and whose performance may be slightly better. The synchronization may be omitted if stopRequested is declared volatile. The volatile modifier guarantees that any thread that reads a field will see the most recently written value.
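The volatile alternative described above might look like the following sketch. The class is named VolatileStoppableThread here only to keep it distinct from the earlier listings, the Thread.yield() call is a stand-in for the real work, and the StopDemo helper is a hypothetical harness rather than part of the idiom.

```java
// Cooperative thread termination with a volatile field (hypothetical class name).
class VolatileStoppableThread extends Thread {
    // volatile: a write by one thread is visible to subsequent reads by others.
    private volatile boolean stopRequested = false;

    @Override
    public void run() {
        while (!stopRequested) {
            Thread.yield(); // placeholder for "do what needs to be done"
        }
    }

    // No synchronized modifier is needed; the field itself is volatile.
    public void requestStop() {
        stopRequested = true;
    }
}

// Hypothetical harness: starts the thread, asks it to stop, and reports whether it exited.
class StopDemo {
    static boolean stopsWithin(long timeoutMillis) {
        VolatileStoppableThread t = new VolatileStoppableThread();
        t.start();
        t.requestStop();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

Compared with the synchronized version, the private stopRequested() accessor disappears and the loop reads the field directly.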

The penalty for failing to synchronize access to stopRequested in the previous example is comparatively minor; the effect of the requestStop method may be delayed indefinitely. The penalty for failing to synchronize access to mutable shared data can be much more severe. Consider the double-check idiom for lazy initialization:

// The double-check idiom for lazy initialization - broken!
private static Foo foo = null;

public static Foo getFoo() {
    if (foo == null) {
        synchronized (Foo.class) {
            if (foo == null)
                foo = new Foo();
        }
    }
    return foo;
}


The idea behind this idiom is that you can avoid the cost of synchronization in the common case of accessing the field (foo) after it has been initialized. Synchronization is used only to prevent multiple threads from initializing the field. The idiom does guarantee that the field will be initialized at most once and that all threads invoking getFoo will get the correct value for the object reference. Unfortunately, the object reference is not guaranteed to work properly. If a thread reads the reference without synchronization and then invokes a method on the referenced object, the method may observe the object in a partially initialized state and fail catastrophically.

That a thread can observe the lazily constructed object in a partially initialized state is wildly counterintuitive. The object is fully constructed before the reference is “published” in the field from which it is read by other threads (foo). But in the absence of synchronization, reading a “published” object reference does not guarantee that a thread will see all of the data that were stored in memory prior to the publication of the object reference. In particular, reading a published object reference does not guarantee that the reading thread will see the most recent values of the data that constitute the internals of the referenced object. In general, the double-check idiom does not work, although it does work if the shared variable contains a primitive value rather than an object reference [Pugh01b].

There are several ways to fix the problem. The easiest way is to dispense with lazy initialization entirely:

// Normal static initialization (not lazy)
private static final Foo foo = new Foo();

public static Foo getFoo() {
    return foo;
}

This clearly works, and the getFoo method is as fast as it could possibly be. It does no synchronization and no computation either. As discussed in Item 37, you should write simple, clear, correct programs, leaving optimization till last, and you should optimize only if measurement shows that it is necessary. Therefore dispensing with lazy initialization is generally the best approach. If you dispense with lazy initialization, measure the cost, and find that it is prohibitive, the next best thing is to use a properly synchronized method to perform lazy initialization:

// Properly synchronized lazy initialization
private static Foo foo = null;

public static synchronized Foo getFoo() {
    if (foo == null)
        foo = new Foo();
    return foo;
}

This method is guaranteed to work, but it incurs the cost of synchronization on every invocation. On modern JVM implementations, this cost is relatively small. However, if you've determined by measuring the performance of your system that you can afford neither the cost of normal initialization nor the cost of synchronizing every access, there is another option. The initialize-on-demand holder class idiom is appropriate for use when a static field is
