Concurrency and Multithreading Basics: Sharing Data Between Threads

阿新 • Published: 2019-01-16

1. What problems does sharing data cause?

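A. Race conditions: In concurrency, a race condition is any situation where the outcome depends on the relative order in which two or more threads execute, each racing to finish its own work. Most of the time the race is benign: whichever order occurs, the result is acceptable. For example, if two threads add tasks to a processing queue at the same time, it usually does not matter which goes first, because the invariants of the system are maintained. A race becomes a problem only when an invariant is broken, as when another thread observes a doubly linked list halfway through a node removal, with only one of the two neighboring links updated. In discussions of concurrency, "race condition" usually means this problematic kind.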

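B. Avoiding problematic race conditions: There are several ways to deal with problematic race conditions. The simplest is to wrap the data structure in a protection mechanism that ensures only the thread actually performing a modification can see the intermediate states where the invariants are broken. From the point of view of every other thread accessing the data, the modification either has not started yet or has already completed.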

2. Protecting shared data with a mutex

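A. In C++ you create a mutex by constructing an instance of std::mutex, lock it by calling the member function lock(), and unlock it by calling unlock(). Calling these member functions directly is not recommended in practice, though, because you then have to remember to call unlock() on every path out of the function, including paths taken because of an exception. The C++ standard library provides the RAII class template std::lock_guard instead: it locks the supplied mutex on construction and unlocks it on destruction, so a locked mutex is always unlocked correctly.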

#include <list>
#include <mutex>
#include <algorithm>
std::list<int> some_list; // 1
std::mutex some_mutex; // 2
void add_to_list(int new_value)
{
    std::lock_guard<std::mutex> guard(some_mutex); // 3
    some_list.push_back(new_value);
}
bool list_contains(int value_to_find)
{
    std::lock_guard<std::mutex> guard(some_mutex); // 4
    return std::find(some_list.begin(),some_list.end(),value_to_find) !=
                    some_list.end();
}

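There is a single global variable ①, protected by a global mutex ②. Using std::lock_guard<std::mutex> in add_to_list() ③ and list_contains() ④ makes the accesses in these two functions mutually exclusive: list_contains() can never see the list partway through a modification by add_to_list().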
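
The point about exceptions can be checked with a small experiment. The following is a minimal sketch, not part of the original listing: demo_mutex, guarded_throw() and mutex_released_after_exception() are hypothetical names chosen for illustration. A function throws while holding a std::lock_guard, and the caller then verifies that the mutex is free again.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex demo_mutex;  // hypothetical name, for illustration only

// Throws while the mutex is held through std::lock_guard.
void guarded_throw()
{
    std::lock_guard<std::mutex> guard(demo_mutex);
    throw std::runtime_error("failure inside the critical section");
}   // guard's destructor unlocks demo_mutex during stack unwinding

// Returns true if the mutex can be acquired again after the exception.
bool mutex_released_after_exception()
{
    try {
        guarded_throw();
    } catch (const std::runtime_error&) {
        // guard was destroyed while the exception propagated out of guarded_throw().
    }
    // try_lock() is allowed to fail spuriously, so retry a few times.
    for (int attempt = 0; attempt < 100; ++attempt) {
        if (demo_mutex.try_lock()) {
            demo_mutex.unlock();
            return true;
        }
    }
    return false;   // the lock leaked
}
```

Calling mutex_released_after_exception() from main() should return true. With a bare lock() and no unlock() on the exceptional path, the mutex would stay locked instead.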

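B. Protecting data with a mutex is not just a matter of adding a std::lock_guard object to every member function; one stray pointer or reference to the protected data makes the protection meaningless.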

#include <mutex>
#include <string>

class some_data
{
	int a;
	std::string b;
	public:
	void do_something();
};
class data_wrapper
{
private:
	some_data data;
	std::mutex m;
public:
	template<typename Function>
	void process_data(Function func)
	{
		std::lock_guard<std::mutex> l(m);
		func(data); // 1 pass the "protected" data to the user-supplied function
	}
};
some_data* unprotected;
void malicious_function(some_data& protected_data)
{
	unprotected=&protected_data;
}
data_wrapper x;
void foo()
{
	x.process_data(malicious_function); // 2 pass in a malicious function
	unprotected->do_something(); // 3 access the protected data without protection
}

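In this example process_data looks harmless: std::lock_guard protects the data well. But because it calls the user-supplied function func ①, foo can bypass the protection by passing in malicious_function ② and then call do_something() without the mutex locked ③, which can corrupt the data we meant to protect.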

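C. Spotting race conditions inherent in an interface: The following example implements a stack on top of a vector. Two threads take turns popping elements from it.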

#include <iostream>  
#include <thread>  
#include <mutex>  
#include <string>  
#include <vector>  
  
std::mutex myMutex;  
  
class Stack  
{  
public:  
    Stack() {};  
    ~Stack() {};  
    void pop();  
    int top() { return data.back(); }  
    void push(int);  
    void print();  
    int getSize() { return data.size(); }  
private:  
    std::vector<int> data;  
};  
  
void Stack::pop()  
{  
    std::lock_guard<std::mutex> guard(myMutex);  
    data.erase(data.end()-1);  
}  
  
void Stack::push(int n)  
{  
    std::lock_guard<std::mutex> guard(myMutex);  
    data.push_back(n);  
}  
  
void Stack::print()  
{  
    std::cout << "initial Stack : " ;  
    for(int item : data)  
        std::cout << item << " ";  
    std::cout << std::endl;  
}  
  
void process(int val, std::string s)  
{  
    std::lock_guard<std::mutex> guard(myMutex);  
    std::cout << s << " : " << val << std::endl;  
}  
  
void thread_function(Stack& st, std::string s)  
{  
    int val = st.top();  
    st.pop();  
    process(val, s);  
}  
  
int main()  
{  
    Stack st;  
    for (int i = 0; i < 10; i++)    
        st.push(i);  
  
    st.print();  
  
    while(true) {  
        if(st.getSize() > 0) {  
            std::thread t1(&thread_function, std::ref(st), std::string("thread1"));  
            t1.join();  
        }  
        else  
            break;  
        if(st.getSize() > 0) {  
            std::thread t2(&thread_function, std::ref(st), std::string("thread2"));  
            t2.join();  
        }  
        else  
            break;  
    }  
  
    return 0;  
}

One possible output:
initial Stack : 0 1 2 3 4 5 6 7 8 9
thread1 : 9
thread2 : 8
thread1 : 7
thread2 : 6
thread1 : 5
thread2 : 4
thread1 : 3
thread2 : 2
thread1 : 1
thread2 : 0

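The code looks thread-safe, but it is not. A race condition remains, and whether it shows up depends on the order of execution. pop() and push() lock the mutex, but top() and getSize() do not, and even if they did, a thread still needs two separate calls, top() followed by pop(), to take one element; another thread can run between them. (This particular main() joins each thread before starting the next, so the two threads never overlap; the race appears once they run concurrently.) For example, with 6 on top of the stack:

thread1 calls top() and reads 6
thread2 calls top() and also reads 6
thread1 calls pop() and removes 6
thread2 calls pop() and removes 5

Element 6 is processed twice and element 5 is skipped.
So although the output above is correct, the code still contains a potential race condition; in other words, it is not thread-safe.
One fix is to merge top() and pop() into a single function under one mutex lock: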

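int Stack::pop()   // the declaration in the class changes to: int pop();
{
    std::lock_guard<std::mutex> guard(myMutex);
    int val = data.back();        // read the top element...
    data.erase(data.end()-1);     // ...and remove it under the same lock
    return val;
}

void thread_function(Stack& st, std::string s)
{
    int val = st.pop();
    process(val, s);
}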
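
To see a merged pop() hold up when threads really do overlap, here is a minimal sketch. IntStack and every_value_popped_once() are hypothetical names, not from the original code, and the sketch assumes the stack is never popped while empty. Several threads pop concurrently, and the helper checks that every value came out exactly once.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Stack class, with top() and pop() merged.
class IntStack
{
public:
    void push(int n)
    {
        std::lock_guard<std::mutex> guard(m_);
        data_.push_back(n);
    }
    // Reads and removes the top element under one lock.
    // Precondition (assumed): the stack is not empty.
    int pop()
    {
        std::lock_guard<std::mutex> guard(m_);
        int val = data_.back();
        data_.pop_back();
        return val;
    }
private:
    std::mutex m_;
    std::vector<int> data_;
};

// Pushes workers * per_worker values, pops them from `workers` threads,
// and returns true if every value was popped exactly once.
bool every_value_popped_once(int workers, int per_worker)
{
    const int total = workers * per_worker;
    IntStack st;
    for (int i = 0; i < total; ++i)
        st.push(i);

    std::vector<std::vector<int>> popped(workers);  // one result list per thread
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&st, &popped, w, per_worker] {
            for (int k = 0; k < per_worker; ++k)
                popped[w].push_back(st.pop());  // each thread writes only its own list
        });
    }
    for (auto& t : threads)
        t.join();

    std::vector<int> seen(total, 0);
    for (const auto& list : popped)
        for (int v : list)
            ++seen[v];
    for (int count : seen)
        if (count != 1)
            return false;   // a value was duplicated or skipped
    return true;
}
```

For instance, every_value_popped_once(4, 250) should return true, because no thread can slip in between reading the top element and removing it.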

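Cutting the interface down gives the most safety, even at the cost of restricting some stack operations. The stack cannot be assigned, because the assignment operator is deleted, and there is no swap() function. It can be copied, provided the elements it holds can be copied. pop() throws an empty_stack exception when the stack is empty, so everything still works even if another thread modifies the stack after a call to empty(). Returning the popped value through a std::shared_ptr avoids having to manage the memory allocation yourself and avoids repeated calls to new and delete. Of the stack's five operations, only three remain: push(), pop(), and empty() (and even empty() is somewhat redundant). A simpler interface gives better control over the data, because the mutex can be held for the whole of each operation. The code below shows a simple implementation: a thread-safe stack that wraps std::stack<>.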

#include <exception>
#include <memory>
#include <mutex>
#include <stack>
struct empty_stack: std::exception
{
	const char* what() const noexcept override
	{
		return "empty stack!";
	}
};
template<typename T>
class threadsafe_stack
{
private:
	std::stack<T> data;
	mutable std::mutex m;
public:
	threadsafe_stack()
		: data(std::stack<T>()){}
	threadsafe_stack(const threadsafe_stack& other)
	{
		std::lock_guard<std::mutex> lock(other.m);
		data = other.data; // 1 copy performed in the constructor body
	}
	threadsafe_stack& operator=(const threadsafe_stack&) = delete;

	void push(T new_value)
	{
		std::lock_guard<std::mutex> lock(m);
		data.push(new_value);
	}
	std::shared_ptr<T> pop()
	{
		std::lock_guard<std::mutex> lock(m);
		
		if(data.empty()) throw empty_stack(); // check for an empty stack before popping
		
		std::shared_ptr<T> const res(std::make_shared<T>(data.top())); // allocate the return value before modifying the stack
		data.pop();
		return res;
	}
	void pop(T& value)
	{
		std::lock_guard<std::mutex> lock(m);
		if(data.empty()) throw empty_stack();
		value=data.top();
		data.pop();
	}
	bool empty() const
	{
		std::lock_guard<std::mutex> lock(m);
		return data.empty();
	}
};

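The stack can be copied: the copy constructor locks the source object's mutex and then copies the internal stack. The copy is done in the constructor body ① rather than in the member initializer list so that the mutex is held for the whole copy.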
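
A short usage sketch may help show why the empty_stack exception makes a prior empty() check unnecessary. The class below is a condensed copy of the listing above (only push() and the shared_ptr overload of pop() are kept, and push() moves its argument, a small change from the listing), and drain_and_sum() is a hypothetical driver added for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <exception>
#include <memory>
#include <mutex>
#include <stack>
#include <thread>
#include <utility>
#include <vector>

struct empty_stack : std::exception
{
    const char* what() const noexcept override { return "empty stack!"; }
};

// Condensed copy of threadsafe_stack: push() and the shared_ptr pop() only.
template<typename T>
class threadsafe_stack
{
    std::stack<T> data;
    mutable std::mutex m;
public:
    void push(T new_value)
    {
        std::lock_guard<std::mutex> lock(m);
        data.push(std::move(new_value));
    }
    std::shared_ptr<T> pop()
    {
        std::lock_guard<std::mutex> lock(m);
        if (data.empty()) throw empty_stack();
        std::shared_ptr<T> const res(std::make_shared<T>(data.top()));
        data.pop();
        return res;
    }
};

// Hypothetical driver: `workers` threads drain the stack and stop on empty_stack.
// Returns the sum of all popped values.
long drain_and_sum(int item_count, int workers)
{
    threadsafe_stack<int> st;
    for (int i = 1; i <= item_count; ++i)
        st.push(i);

    std::atomic<long> sum{0};
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&st, &sum] {
            try {
                for (;;)
                    sum += *st.pop();   // no empty() check beforehand
            } catch (const empty_stack&) {
                // Stack drained: this is the normal way out of the loop.
            }
        });
    }
    for (auto& t : threads)
        t.join();
    return sum.load();
}
```

With the values 1 to 100 on the stack, drain_and_sum(100, 4) should return 5050 however the four threads interleave, since each element is handed to exactly one thread.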

3. Alternative facilities for protecting shared data

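A. Protecting shared data during initialization: Suppose a shared resource is expensive to construct, perhaps because it opens a database connection or allocates a lot of memory. Lazy initialization is a common optimization here: each operation that needs the resource first checks whether it has been initialized and initializes it before use if not.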

The usual approach:

std::shared_ptr<some_resource> resource_ptr;
std::mutex resource_mutex;
void foo()
{
    std::unique_lock<std::mutex> lk(resource_mutex); // all threads are serialized here
    if(!resource_ptr)
    {
        resource_ptr.reset(new some_resource); // only the initialization needs protection
    }
    lk.unlock();
    resource_ptr->do_something();
}

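Code like this is common, and it shows the unnecessary serialization clearly: every thread must wait on the mutex just to check whether the resource has already been initialized. Many people have tried to come up with better ways to do this, including the infamous double-checked locking pattern: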

void undefined_behaviour_with_double_checked_locking()
{
    if(!resource_ptr) // 1
    {
        std::lock_guard<std::mutex> lk(resource_mutex);
        if(!resource_ptr) // 2
        {
            resource_ptr.reset(new some_resource); // 3
        }
    }
    resource_ptr->do_something(); // 4
}

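The pointer is first read without acquiring the lock ①, and the lock is acquired only if the pointer is NULL. After the lock is acquired the pointer is checked again ② (this is the double-checked part), in case another thread performed the initialization between the first check and this thread acquiring the lock.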

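Why is this pattern infamous? Because it has a potential race condition: the read outside the lock ① is not synchronized with the write that another thread performs inside the lock ③. The race covers not only the pointer itself but also the object it points to: even if a thread sees the pointer written by another thread, it might not see the newly created some_resource instance, so the call to do_something() ④ may operate on incorrect values. This is a typical example of the kind of race condition the C++ standard calls a data race, which it specifies as undefined behavior. It is a race that should definitely be avoided.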

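The C++ standards committee also considered this scenario important, so the C++ standard library provides std::once_flag and std::call_once to handle it. Instead of locking a mutex and explicitly checking the pointer, every thread simply calls std::call_once, and by the time std::call_once returns it is safe to assume the pointer has been initialized, by this thread or by another.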
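
Applied to the resource_ptr example, std::call_once might look like the following minimal sketch. The some_resource body, the construction_count counter and the constructions_after_concurrent_calls() helper are assumptions added only so that the behavior can be observed; they are not part of the original code.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

std::atomic<int> construction_count{0};  // instrumentation for this demo only

// Hypothetical stand-in for the expensive resource in the text.
struct some_resource
{
    some_resource() { ++construction_count; }
    void do_something() {}
};

std::shared_ptr<some_resource> resource_ptr;
std::once_flag resource_flag;

void init_resource()
{
    resource_ptr.reset(new some_resource);
}

void foo()
{
    std::call_once(resource_flag, init_resource);  // init_resource runs exactly once
    resource_ptr->do_something();                   // initialization has completed here
}

// Calls foo() from several threads and reports how many times
// some_resource was constructed.
int constructions_after_concurrent_calls(int thread_count)
{
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i)
        threads.emplace_back(foo);
    for (auto& t : threads)
        t.join();
    return construction_count.load();
}
```

However many threads call foo(), constructions_after_concurrent_calls() should report a single construction, and no thread reaches do_something() before the initialization has finished.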

B. Using std::call_once for thread-safe lazy initialization of a class member

class X
{
private:
	connection_info connection_details;
	connection_handle connection;
	std::once_flag connection_init_flag;
	void open_connection()
	{
	   connection=connection_manager.open(connection_details);
	}
public:
	X(connection_info const& connection_details_):
	connection_details(connection_details_)
	{}
	void send_data(data_packet const& data) // 1
	{
		std::call_once(connection_init_flag,&X::open_connection,this);// 2
		connection.send_data(data);
	}
	data_packet receive_data() // 3
	{
		std::call_once(connection_init_flag,&X::open_connection,this);// 2
		return connection.receive_data();
	}
};

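In this example the initialization is performed by whichever thread first calls send_data() ① or receive_data() ③. Because the initialization uses the member function open_connection(), the this pointer must be passed as well. Like other standard library facilities that accept callable objects, such as the std::thread constructor and std::bind(), this is done by passing an additional argument to std::call_once() ②.
Note that instances of std::mutex and std::once_flag can be neither copied nor moved, so if you use them as class members and need those operations, you have to define the special member functions explicitly.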
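
As an illustration of that last point, here is a minimal sketch of a class that holds a std::mutex member and defines its own copy constructor. guarded_counter and copied_value_after_two_increments() are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// std::mutex and std::once_flag are neither copyable nor movable.
static_assert(!std::is_copy_constructible<std::mutex>::value, "mutex is not copyable");
static_assert(!std::is_move_constructible<std::mutex>::value, "mutex is not movable");
static_assert(!std::is_copy_constructible<std::once_flag>::value, "once_flag is not copyable");

// Hypothetical class: a counter guarded by its own mutex.
class guarded_counter
{
    mutable std::mutex m;
    int value = 0;
public:
    guarded_counter() = default;

    // The implicit copy constructor is deleted because of the mutex member,
    // so define one: lock the source, copy the data, and let the new object
    // keep its own freshly constructed mutex.
    guarded_counter(const guarded_counter& other)
    {
        std::lock_guard<std::mutex> lock(other.m);
        value = other.value;
    }
    guarded_counter& operator=(const guarded_counter&) = delete;

    void increment()
    {
        std::lock_guard<std::mutex> lock(m);
        ++value;
    }
    int get() const
    {
        std::lock_guard<std::mutex> lock(m);
        return value;
    }
};

// Copies a counter after two increments and returns the copy's value.
int copied_value_after_two_increments()
{
    guarded_counter a;
    a.increment();
    a.increment();
    guarded_counter b(a);   // uses the user-defined copy constructor
    a.increment();          // does not affect the copy
    return b.get();
}
```

Here copied_value_after_two_increments() returns 2: the copy takes a snapshot of the data under the source's lock, and the mutex itself is never copied.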
