
Threads (Thread Safety of Collections)

Collections and Thread Safety

Have you noticed that the basic collection classes (ArrayList, LinkedList, HashMap, HashSet, TreeMap, TreeSet, etc.) are not synchronized? In fact, all collection classes in the java.util package except Vector and Hashtable are not thread-safe. Only these two legacy collections are thread-safe.
Most collection classes leave out synchronization because locking is expensive; Vector and Hashtable are exceptions that date back to the early design of the library.

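It's always wise to use thread-safe collections instead of writing synchronization code manually.
To make a collection thread-safe you can use synchronized by hand, but using a thread-safe collection is the recommended approach.
There are factory methods for creating such collections: Collections.synchronizedXXX(collection)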
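As a minimal sketch of the factory-method approach, the example below wraps a plain ArrayList with Collections.synchronizedList and lets two threads add to it. The class and method names (SynchronizedWrapperDemo, fillFromTwoThreads) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedWrapperDemo {

    // Two threads each add 1,000 elements; returns the final size of the list.
    static int fillFromTwoThreads() {
        // Wrap a plain ArrayList so that every individual method call is synchronized
        List<Integer> safeList = Collections.synchronizedList(new ArrayList<>());

        Runnable writer = () -> {
            for (int i = 0; i < 1_000; i++) {
                safeList.add(i);
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return safeList.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads()); // 2000: no add() is lost
    }
}
```

With an unwrapped ArrayList the same program may lose elements or throw, because add() is not atomic there.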

Note, however, that when using the iterator of a synchronized collection we should guard the iteration code with a synchronized block, because the iterator itself is not thread-safe. For example:

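synchronized (safeList) {
    // Obtain the iterator inside the lock so the whole traversal is guarded
    Iterator<String> iterator = safeList.iterator();
    while (iterator.hasNext()) {
        String next = iterator.next();
        System.out.println(next);
    }
}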

Also note that the iterators of the synchronized collections are fail-fast.

There are several kinds of iterators; a later section describes each of them.

The collections above are called synchronized wrappers, but they still have a drawback: their synchronization mechanism uses the collection object itself as the lock object. That means that while one thread is iterating over the elements of a collection, all of the collection's other methods block, and other threads have to wait.
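The drawback can be observed directly. The sketch below holds the wrapper's own monitor (as a guarded iteration would) and checks whether a second thread calling add() ends up in the BLOCKED state. The class and method names are hypothetical, and the two-second polling window is an arbitrary choice for the demo.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperLockDemo {

    // Returns true if the writer thread was seen BLOCKED while this thread held the list's lock.
    static boolean writerBlocksWhileLockHeld() {
        List<String> safeList = Collections.synchronizedList(new ArrayList<>());
        Thread writer = new Thread(() -> safeList.add("late"));
        boolean sawBlocked = false;
        try {
            synchronized (safeList) {        // the same lock a guarded iteration holds
                writer.start();
                long deadline = System.currentTimeMillis() + 2_000;
                while (System.currentTimeMillis() < deadline) {
                    if (writer.getState() == Thread.State.BLOCKED) {
                        sawBlocked = true;   // add() is waiting for our lock
                        break;
                    }
                    Thread.sleep(5);         // sleeping does not release the monitor
                }
            }
            writer.join();                   // lock released, add() can now finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sawBlocked && safeList.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(writerBlocksWhileLockHeld());
    }
}
```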

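This led to a better alternative: the Concurrent Collections.
They can be divided into three categories:
1. Copy-on-write collections
2. Compare-And-Swap (CAS) collections (explained in detail in a later section)
3. Collections that use a special lock object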

Notes:
1. Copy-on-write collections
1) They store values in an immutable array
2) Every add operation takes a lock
3) They have snapshot iterators, which do not throw ConcurrentModificationException
2. CAS collections have weakly consistent iterators, which reflect some but not necessarily all of the changes that have been made to their backing collection since they were created. Weakly consistent iterators do not throw ConcurrentModificationException.
3. Collections that use a special lock object also have the drawback of weakly consistent reads; they apply the idea of segmented locks (lock striping).
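To make the three categories concrete, here is a small sketch that instantiates one class commonly cited for each family: CopyOnWriteArrayList (copy-on-write), ConcurrentLinkedQueue (CAS-based) and LinkedBlockingQueue (internal lock objects). The choice of representatives is an assumption for illustration; the notes above do not name specific classes.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class ConcurrentFamiliesDemo {

    // Adds one element to a representative of each family and returns the combined size.
    static int totalSize() {
        List<String> copyOnWrite = new CopyOnWriteArrayList<>();     // 1. copy-on-write
        Queue<String> casBased = new ConcurrentLinkedQueue<>();      // 2. compare-and-swap
        BlockingQueue<String> locked = new LinkedBlockingQueue<>();  // 3. special lock object

        copyOnWrite.add("a");
        casBased.offer("b");
        locked.offer("c");
        return copyOnWrite.size() + casBased.size() + locked.size();
    }

    public static void main(String[] args) {
        System.out.println(totalSize()); // 3
    }
}
```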

iterator

Iterator role in Multithreading

Iterators are used to traverse a collection of objects and provide references to them. It is important to understand how an iterator behaves when another thread modifies the collection (a concurrent modification) while this thread is iterating over it. Such concurrent modification can lead to effects like dirty reads and phantom reads. Depending on whether a collection is concurrent, synchronized or unsynchronized, it comes with a different kind of iterator.

Fail-fast iterators
Collection iterators are used to traverse the elements of a collection. A fail-fast iterator throws ConcurrentModificationException if another thread modifies the collection while it is being iterated. However, this does not protect you from arbitrary behavior of the collection, because, as the Oracle docs put it, fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Writing a program that depends on ConcurrentModificationException for its correctness is not recommended.

The following test program mimics a situation that throws ConcurrentModificationException:

import java.util.*;

/**
 * This test program illustrates how a collection's iterator fails fast
 * and throws ConcurrentModificationException
 * @author www.codejava.net
 *
 */
public class IteratorFailFastTest {

    private List<Integer> list = new ArrayList<>();

    public IteratorFailFastTest() {
        for (int i = 0; i < 10_000; i++) {
            list.add(i);
        }
    }

    public void runUpdateThread() {
        Thread thread1 = new Thread(new Runnable() {

            public void run() {
                for (int i = 10_000; i < 20_000; i++) {
                    list.add(i);
                }
            }
        });

        thread1.start();
    }


    public void runIteratorThread() {
        Thread thread2 = new Thread(new Runnable() {

            public void run() {
                ListIterator<Integer> iterator = list.listIterator();
                while (iterator.hasNext()) {
                    Integer number = iterator.next();
                    System.out.println(number);
                }
            }
        });

        thread2.start();
    }

    public static void main(String[] args) {
        IteratorFailFastTest tester = new IteratorFailFastTest();

        tester.runIteratorThread();
        tester.runUpdateThread();
    }
}
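Because the two-thread program above depends on timing, a single-threaded sketch may be easier to reproduce: modifying an ArrayList directly inside a for-each loop trips the same fail-fast check, while removing through the iterator itself does not. The class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastSingleThreadDemo {

    // Removes an element through the list while iterating: the iterator fails fast.
    static boolean removeViaListThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer n : list) {
                if (n % 2 == 0) {
                    list.remove(n);   // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes elements through the iterator: the supported way.
    static List<Integer> removeViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();          // keeps the iterator's bookkeeping consistent
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeViaListThrows()); // true
        System.out.println(removeViaIterator());   // [1, 3]
    }
}
```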

Snapshot iterators
A snapshot iterator iterates over a snapshot of the internal data structure taken when the iterator was created. Structural modifications made to the collection afterwards go into a fresh copy, so the snapshot the iterator sees remains unchanged. Hence a snapshot iterator never throws ConcurrentModificationException. Snapshot iterators are fail-safe iterators. Copy-on-write collections use snapshot iterators to iterate over their elements.
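A minimal sketch of snapshot behavior, using CopyOnWriteArrayList: an element added after the iterator was created is not visible to that iterator, and no exception is thrown. The class and method names are hypothetical.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteratorDemo {

    // Counts the elements seen by an iterator that was created before a later add().
    static int elementsSeenBySnapshot() {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        Iterator<String> it = list.iterator(); // snapshot of [a, b]
        list.add("c");                         // written into a fresh copy of the array

        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenBySnapshot()); // 2
    }
}
```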
Weakly-consistent iterators
Weakly-consistent iterators reflect some but not necessarily all of the changes that have been made to their backing collection since they were created. For example, if elements in the collection are modified or removed before the iterator reaches them, the iterator will reflect these changes, but no such guarantee is made for insertions. Collections that rely on CAS (compare-and-swap) have weakly consistent iterators.
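The sketch below modifies a ConcurrentLinkedQueue (a CAS-based collection) in the middle of a for-each loop. It only checks that no ConcurrentModificationException is thrown and that the queue ends up in the expected state; it deliberately makes no claim about which elements the iterator itself visits. The names are made up for illustration.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class WeaklyConsistentIteratorDemo {

    // Adds and removes elements during iteration; returns true if that worked without an exception.
    static boolean modifyWhileIterating() {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>(Arrays.asList(1, 2, 3));
        boolean modified = false;
        try {
            for (Integer ignored : queue) {
                if (!modified) {                        // modify only once so the loop terminates
                    queue.add(4);
                    queue.remove(Integer.valueOf(3));
                    modified = true;
                }
            }
        } catch (ConcurrentModificationException e) {
            return false;                               // would indicate a fail-fast iterator
        }
        return queue.size() == 3 && queue.contains(4) && !queue.contains(3);
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // true
    }
}
```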
Undefined iterators
The results of modifying a collection during iteration are undefined and may result in inconsistencies. The examples here are the legacy collections Vector and Hashtable and their methods that return Enumeration, including Vector.elements, Hashtable.elements, and Hashtable.keys.
Fail-safe iterator is a myth
The Java specification does not use the term “fail-safe” anywhere. However, snapshot iterators and weakly consistent iterators can be considered fail-safe iterators.

The CAS Algorithm

compare and swap
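Preface: locking strategies are generally divided into optimistic locking and pessimistic locking, and CAS is a form of optimistic locking. The familiar synchronized keyword is a pessimistic lock.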
One of the best additions in Java 5 was the atomic operations supported by classes such as AtomicInteger and AtomicLong. These classes minimize the need for complex (and unnecessary) multi-threading code for basic operations such as incrementing or decrementing a value that is shared among multiple threads. Internally they rely on an algorithm named CAS (compare and swap). In this article, I am going to discuss this concept in detail.
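As a small illustration of the point about shared counters, the sketch below increments one AtomicInteger from two threads without any synchronized block. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {

    // Two threads each increment the shared counter 10,000 times; returns the final value.
    static int countWithTwoThreads() {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.incrementAndGet(); // atomic read-modify-write, no explicit lock
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads()); // 20000
    }
}
```

With a plain int field and `count++` the result would typically be lower, because the increment is not a single atomic step.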

Traditional locking mechanisms, e.g. the synchronized keyword in Java, are a pessimistic technique of locking or multi-threading. They ask you to first guarantee that no other thread will interfere with a certain operation (i.e. to lock the object), and only then allow you access to the instance or method.
It’s much like saying “please close the door first; otherwise some crook will come in and rearrange your stuff”.

Although the above approach is safe and does work, it puts a significant performance penalty on your application. The reason is simple: waiting threads cannot do anything until they get their turn to perform the guarded operation.

There is another approach that performs better and is optimistic in nature. In this approach, you proceed with an update, hoping that you can complete it without interference. It relies on collision detection to determine whether other parties interfered during the update, in which case the operation fails and can be retried (or not).

The optimistic approach is like the old saying, “It is easier to obtain forgiveness than permission”, where “easier” here means “more efficient”.

Compare and Swap is a good example of such an optimistic approach, and it is what we discuss next.

Compare and Swap Algorithm
This algorithm compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. This is done as a single atomic operation.
The atomicity guarantees that the new value is calculated based on up-to-date information; if the value has been updated by another thread in the meantime, the write fails.
The result of the operation must indicate whether it performed the substitution; this can be done either with a simple Boolean response (this variant is often called compare-and-set), or by returning the value read from the memory location (not the value written to it).
A CAS operation takes three parameters:
1. A memory location V whose value is to be replaced
2. The old value A that the thread read last time
3. The new value B that should be written to V
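In Java, the Boolean (compare-and-set) variant is available as AtomicInteger.compareAndSet(expected, newValue). In the sketch below the AtomicInteger plays the role of V, and the two arguments play the roles of A and B; the class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CompareAndSetDemo {

    // Performs one successful and one failing CAS and reports both results plus the final value.
    static String demo() {
        AtomicInteger v = new AtomicInteger(10);    // V holds 10

        boolean first = v.compareAndSet(10, 11);    // A = 10 matches V, so V becomes B = 11
        boolean second = v.compareAndSet(10, 11);   // A = 10 is stale (V is 11), so nothing is written

        return first + " " + second + " " + v.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true false 11
    }
}
```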

Let’s understand the whole process with an example. Assume V is a memory location where the value 10 is stored. Multiple threads want to increment this value and use the incremented value for other operations, a very practical scenario. Let’s break the whole CAS operation into steps:

1) Threads 1 and 2 both want to increment it; they both read the value 10 and compute the incremented value 11.
V = 10, A = 10, B = 11 (in each thread)
2) Thread 1 gets there first and, before updating, compares V with the value it last read (10):
V = 10, A = 10, B = 11

if A == V
    V = B
else
    operation failed
return V

Clearly V equals A, so the value of V is overwritten with 11, i.e. the operation is successful.
3) Thread 2 then tries the same operation as thread 1, but V is now 11, which no longer equals the 10 it read earlier:
V = 11, A = 10, B = 11

if A == V
    V = B
else
    operation failed
return V

4) In this case V is not equal to A, so the value is not replaced and the current value of V, i.e. 11, is returned. Thread 2 now retries the operation with the freshly read value:
V = 11, A = 11, B = 12

This time the condition is met, V is updated, and the incremented value 12 is returned to thread 2.
In summary, when multiple threads attempt to update the same variable simultaneously using CAS, one wins and updates the variable’s value, and the rest lose. But the losers are not punished by being suspended. They are free to retry the operation or simply do nothing.
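The walk-through can be sketched as an explicit retry loop on top of AtomicInteger.compareAndSet. This is an illustration of the idea, not a claim about how the JDK implements its own increment methods; the names casIncrement and runTwoThreads are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasRetryDemo {

    // Increments v with a read / compute / compare-and-set loop and returns the value written.
    static int casIncrement(AtomicInteger v) {
        while (true) {
            int a = v.get();              // A: the value this thread read last
            int b = a + 1;                // B: the new value it wants to write
            if (v.compareAndSet(a, b)) {  // succeeds only if V still equals A
                return b;
            }
            // Another thread changed V in the meantime: loop and retry with a fresh read
        }
    }

    // Starts from V = 10 and lets two threads increment once each; returns the final value.
    static int runTwoThreads() {
        AtomicInteger v = new AtomicInteger(10);
        Thread t1 = new Thread(() -> casIncrement(v));
        Thread t2 = new Thread(() -> casIncrement(v));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return v.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads()); // 12
    }
}
```

Whichever thread loses the race simply goes around the loop again; it is never suspended waiting for a lock.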
That’s all for this simple but important concept behind the atomic operations supported in Java.