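What happens when multiple threads call send on the same socket at the same time

1. Overview

1.1 Motivation

In a project of my own, I start a dedicated thread that sends heartbeat packets to keep the connection alive. When that thread calls send, it may collide with a send in the main thread, so I wanted to find out whether the socket API is thread-safe. There are many claims online, but most are speculation, so I combined the man pages, StackOverflow, and several blog posts with a simple experiment to let the facts speak.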

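1.2 Main questions and a preview of the conclusions

The questions below focus on TCP on Linux, and all conclusions apply to Linux. Material and links from StackOverflow questions on the same topic for UDP, Windows C++, the C++ Boost library, and Java are also included.

When two threads or processes send (write) on the same socket at the same time, does anything go wrong? For example, can the two threads' data interleave, and do TCP and UDP behave the same way?
- With TCP there is a problem: when one thread sends a large amount of data (32KB in a single call in my test), the data of the two threads was interleaved; when the amount was small (4B in my test), nothing was interleaved. (I did not pay much attention to the exact sizes. The point is that interleaving can happen, so avoid this usage where possible.)
- UDP was not tested, but all the material I collected says the problem does not exist with UDP.
Are send and write thread-safe?
- Thread safety: judging from the experiment, they are not thread-safe.

Some people do consider sockets thread-safe. What they mean is that one thread can send while another receives at the same time. I am aware of that, but it is not the case I want to discuss, so let me pin down what socket thread safety means here: safety when two threads send at the same time.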

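1.3 Organization

First I describe the experiment: the approach, the code (fairly simple), and the results.
Based on those results, I look into why the problem occurs.
Then I discuss atomicity and a few other viewpoints, platforms, and programming languages.
Finally, I give some suggestions and alternatives.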

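2. Experiment

Test environment: VMware Workstation 15, Ubuntu Server 18.04.4 virtual machine
Test code: https://github.com/whuwzp/linuxc/tree/master/bug/twothread_send

2.1 Approach

The approach is simple: one side starts two threads that send data in a loop at the same time (one sends a run of aaaa, the other a run of bbbb), and the other side receives and prints it to see whether the output is interleaved.

By interleaving I do not mean arrival order. I only care whether one large message gets another message inserted into the middle of it. Concretely: take two threads Ta and Tb and one socket S, with Ta and Tb sending to S at the same time. Ta's data is large (it has to be split) and consists of two parts, Pa1 and Pa2; Tb's data is small, just Pb1. Then if the peer receives: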

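Pa1 Pa2 Pb1 or Pb1 Pa1 Pa2: not interleaved (the data of Ta and of Tb each arrives contiguously; only the order is unknown)
Pa1 Pb1 Pa2: interleaved (Ta's data has been cut in two and is no longer contiguous)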
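
The definition above can be written down as a small check. This is only a sketch: Record and is_interleaved are hypothetical names that are not part of the test repository, and the check assumes, as in the experiment, that every message numbers its records from 0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record layout, mirroring the (letter, index) pairs of the test.
struct Record {
    char tag;    // 'a' or 'b': which thread produced the record
    int  index;  // position of the record inside its own message, from 0
};

// Returns true if some message is cut in two by records of another sender.
// Every message starts at index 0, so a record with index > 0 must directly
// follow a record that carries the same tag.
bool is_interleaved(const std::vector<Record>& received) {
    for (std::size_t i = 1; i < received.size(); ++i) {
        if (received[i].index > 0 && received[i].tag != received[i - 1].tag) {
            return true;
        }
    }
    return false;
}
```

With Pa1 = (a, 0), Pa2 = (a, 1), and Pb1 = (b, 0), the first two orderings are reported as not interleaved and the third as interleaved.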

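Note: in the test, one thread must send a large enough message, otherwise no interleaving shows up. My first attempt sent too little, which misled me into thinking interleaving could not happen. The reason is covered in the analysis in Section 3.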

2.2 Test code

Test code: https://github.com/whuwzp/linuxc/tree/master/bug/twothread_send

server.cpp, key parts only:

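// A struct is sent rather than a bare char so that index can record
// which 'a' or 'b' this is.
typedef struct {
    char buf;
    int  index;
} myint;

// fd_client is the accepted client socket, set up elsewhere in server.cpp (not shown).

// Ta
void *athread(void *) {
    static myint buf[4 * 1000];
    for (int i = 0; i < 4 * 1000; i++) {
        buf[i].buf = 'a';
        buf[i].index = i;  // this is the i-th 'a'
    }
    // send in a loop, 4000 structs per call
    while (true) {
        ssize_t ret = write(fd_client, buf, 1000 * 4 * sizeof(myint));
        (void)ret;  // the return value is not checked in this test
    }
}

// Tb
void *bthread(void *) {
    static myint buf[100];
    for (int i = 0; i < 100; i++) {
        buf[i].buf = 'b';
        buf[i].index = i;  // this is the i-th 'b'
    }
    while (true) {
        ssize_t ret = 0;
        for (int i = 0; i < 10; ++i) {  // send the 100 structs 10 times in a row
            ret += write(fd_client, buf, 100 * sizeof(myint));
        }
        sleep(1);
    }
}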

client.cpp

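// fd_client is the connected socket and buf an array of 4000 myint (setup not shown).
while (true) {
    memset(buf, 0, 4 * 1000 * sizeof(myint));
    int ret = (int)read(fd_client, buf, 4 * 1000 * sizeof(myint));
    if (ret <= 0) break;  // peer closed the connection or read failed
    int n = ret / (int)sizeof(myint);
    for (int i = 0; i < n; i++) {
        // 4000 'a' records are a lot to print, so dump the buffer only when
        // it contains a 'b' or when the read came back short
        if (buf[i].buf == 'b' || ret != (int)(4 * 1000 * sizeof(myint))) {
            printf("ret=%d\n", ret);
            for (int j = 0; j < n; j++) {
                printf("%c, %d\n", buf[j].buf, buf[j].index);  // print the index
            }
            break;  // this buffer has been dumped once; go read the next one
        }
    }
}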
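
One caveat when reading a TCP stream this way: read() may return fewer bytes than requested, and the count is not guaranteed to be a multiple of sizeof(myint), so a record can be split across two reads. A minimal sketch of a helper that keeps reading until a full buffer has arrived is shown here, assuming a blocking descriptor. The name read_full is hypothetical and does not come from the test repository.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Reads exactly len bytes unless end-of-file or an error occurs.
// Returns the number of bytes read (smaller than len only at end-of-file),
// or -1 on error.
ssize_t read_full(int fd, void* out, std::size_t len) {
    char* p = static_cast<char*>(out);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0) break;                 // peer closed the connection
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        got += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(got);
}
```

Reading whole buffers like this keeps the record boundaries aligned, which makes it easier to see where one thread's data ends and the other's begins.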

Run:

./server
./client > result.txt

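2.3 Results

Output in result.txt:

# earlier part of a0~a3391 omitted
a, 3389
a, 3390
a, 3391
# the key point: this is where the interleaving starts
b, 0
b, 1
# rest of b0~b99 omitted, repeated for the 10 loop iterations
b, 98
b, 99
a, 3392
a, 3393
# a3394~a3999 and the b records that follow are omitted

The results show:

First, records 0~3391 sent by Ta were received
Then the 10 rounds of records 0~99 sent by Tb were received
Finally, Ta's remaining records 3392~3999 were received

In other words, the data of Ta and Tb was interleaved.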

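3. Root cause and analysis of the results

3.1 Atomicity and thread safety

These two concepts are different (see the StackOverflow discussions):

Atomicity: all the bytes from one operation that started out together end up together, without interleaving from other I/O operations (the IBM article takes this definition from the IEEE Std 1003.1-2001 System Interfaces volume). Others define it as: the operation either succeeds 100% or, on failure, rolls the state back to what it was before; or simply as being indivisible (Baidu Baike, entry on atomic operations).
Thread safety: multiple threads can perform an operation at the same time without affecting each other. Atomic operations are one way to achieve thread safety (from the StackOverflow discussions).

3.2 Whether the socket API is thread-safe has nothing to do with the C/C++ language itself

Sockets are not part of the C/C++ language standards, so their thread safety depends on the implementation. Systems differ: some protect their internal data structures with locks when implementing sockets and some may not, so you have to look at the specific implementation.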

Sockets are not part of C++ Standard so it depends on implementation. Generally they are not thread safe since send is not an atomic operation. Check this discussion for additional information.
EDIT: OS could have or couldn't have internal lock for protecting internal structures. It depends on implementation. So you should not count on it.
Source: https://stackoverflow.com/questions/2354417/c-socket-api-is-thread-safe

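3.3 Analysis of the possible cause

The following is based on https://quark.tistory.com/m/235 (this is not the original article, whose link is dead, but a repost on a Korean site). If it loads slowly, see my own repost: https://www.cnblogs.com/whuwzp/articles/thread_safe.html.

A brief explanation:

send eventually calls the kernel function tcp_sendmsg()
tcp_sendmsg() takes a lock through lock_sock(), then loops, allocating buffer space and copying the data to be sent into it, ready for transmission
If the data is very large (which is why the test sends something as big as 32KB) and an allocation fails partway through the loop (after part of the data has already gone through), it ends up calling sk_stream_wait_memory() to wait for space; that function calls sk_wait_event(), which releases the lock and then blocks
Once this thread has released the lock, send calls in other threads can win it through lock_sock() and send their data, and that is how the two threads' data streams get interleaved.

This explanation fits the experimental results very well, but it is only one possibility. All we can say for certain is that the lock was released and another thread acquired it. Is a failed memory allocation the only way that can happen? I am not sure, and I hope readers will go through Section 4 and think it over with me.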
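
The mechanism described in this section can be mimicked with a single-threaded toy model. This is a sketch under heavy assumptions, not kernel code: ToySocket, simulate, and the fixed-capacity buffer are all hypothetical, and the scheduling (B always wins the lock while A sleeps) is hard-coded to illustrate the one scenario the analysis describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Toy model only (hypothetical, not kernel code): a socket whose send
// buffer holds at most `capacity` bytes.
struct ToySocket {
    std::size_t capacity;
    std::size_t used;   // bytes queued and not yet drained by the peer
    std::string wire;   // every byte accepted so far, in stream order

    explicit ToySocket(std::size_t cap) : capacity(cap), used(0) {}

    std::size_t free_space() const { return capacity - used; }

    // Copy as much of data[offset..] as currently fits; returns bytes copied.
    std::size_t copy_in(const std::string& data, std::size_t offset) {
        std::size_t n = std::min(free_space(), data.size() - offset);
        wire.append(data, offset, n);
        used += n;
        return n;
    }

    // The peer consumes n bytes, which frees buffer space.
    void drain(std::size_t n) { used -= std::min(n, used); }
};

// Sender A calls send(a) first; sender B calls send(b) while A is in progress.
// `drained` is how much space frees up while A sleeps without the lock.
std::string simulate(std::size_t capacity, const std::string& a,
                     const std::string& b, std::size_t drained) {
    assert(capacity > 0);
    ToySocket s(capacity);
    bool b_sent = false;

    std::size_t a_done = s.copy_in(a, 0);   // A holds the lock and copies
    if (a_done < a.size()) {                // buffer full: A drops the lock and waits
        s.drain(drained);
        if (!b.empty() && b.size() <= s.free_space()) {
            s.copy_in(b, 0);                // B's small send fits, so it gets in first
            b_sent = true;
        }
        while (a_done < a.size()) {         // A wakes up and finishes its send
            if (s.free_space() == 0) s.drain(capacity);
            a_done += s.copy_in(a, a_done);
        }
    }
    if (!b_sent) {                          // otherwise B simply goes after A
        std::size_t b_done = 0;
        while (b_done < b.size()) {
            if (s.free_space() == 0) s.drain(capacity);
            b_done += s.copy_in(b, b_done);
        }
    }
    return s.wire;
}
```

In this model, simulate(4, "aaaaaa", "b", 2) yields "aaaabaa" (B lands inside A's data), while simulate(16, "aaaaaa", "b", 2) yields "aaaaaab" because A's send fits in the buffer and never gives up the lock. That matches the observation that only a large send gets interleaved.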

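4. Other viewpoints and analysis

4.1 The question of atomicity

Even now I cannot say for certain whether send and write are atomic, so for the moment this section only shares some viewpoints. I lean slightly towards believing they are not atomic, and corrections from more experienced readers are welcome:

The IBM article's experiment and my own both show that the bytes did not "end up together": part of the data was sent first, and the rest only at the very end.
The view that they are atomic comes with no substantive evidence.

4.1.1 The view that it is not atomic under POSIX, and the evidence

The following is quoted from the IBM article: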

the POSIX/SUSv3 standard developers indicate that atomicity for socket I/O is "unspecified".

Why not atomic?

Using Linux kernel 2.6.11 as a reference, because it's the latest kernel processed for easy web cross-referencing at LXR

When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects itself by acquiring a lock at invocation by calling lock_sock(). tcp_sendmsg() then loops over the buffers in the iovec, allocating associated sk_buff's and cache pages for use in the actual send. As it does so, it pushes the data out to tcp for actual transmission. However, if one of those allocation fails (because a large number of large sends is being processed, for example), it must wait for memory to become available. It does so by jumping to wait_for_sndbuf or wait_for_memory, both of which eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory() contains a code path that calls sk_wait_event(). Finally, sk_wait_event() contains the call to release_sock().

At this point, any one of the threads that were heretofore serialized at the initial call to lock_sock() in tcp_sendmsg() can proceed. Memory may either become available, or a small enough send may not require enough memory to block and may proceed immediately, thus intermixing data from one call to send() with another.

but in the definition of read() in the IEEE Std 1003.1-2001 System Interfaces volume, it gives the following rationale: "The standard developers considered adding atomicity requirements..."

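4.1.2 The view that it is atomic under POSIX, and the evidence

The answer below holds that send is a system call and therefore an atomic operation with no races inside the kernel. That contradicts the IBM article, whose analysis implies there is contention when multiple threads send at the same time. Although the author considers send atomic, he also says that concurrent senders still need synchronization to keep their data from interleaving.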

A lock is not required; send() is a syscall, it is an atomic operation with no race conditions in the kernel.For stream sockets (TCP) too, the send() function is atomic; but there is no concept of distinct messages or packets, the data treated as a single stream of bytes. So even though send() is thread-safe, synchronisation is required to ensure that the bytes from different send calls are merged into the byte stream in a predictable manner.

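Source: https://stackoverflow.com/questions/1981372/are-parallel-calls-to-send-recv-on-the-same-socket-valid/61246416#61246416

The next answer also claims it is atomic, saying that a call to send always completely succeeds or completely fails. In practice, however, send can return a value smaller than requested (that is, smaller than the length of the data to send).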

Sorry, you're wrong. Send is 100% an atomic operation. If the data is too large and won't fit in the output buffer, then the call will block. If the call to send is interrupted, then no data is sent and an error is returned. Calls to send always completely succeed, or completely fail.

Source: https://stackoverflow.com/questions/3235424/c-sockets-send-thread-safety
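
Because send can accept only part of the data, callers normally wrap it in a loop. A minimal sketch for a blocking socket on Linux is shown below. The helper name send_all is hypothetical, and MSG_NOSIGNAL is used on the assumption that the caller prefers an error return over SIGPIPE.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Keeps calling send() until all len bytes have been accepted by the kernel.
// Returns true on success, false on error.
// Note: this does not make concurrent senders safe. Between two iterations
// another thread may send on the same socket, so messages can still interleave.
bool send_all(int fd, const void* data, std::size_t len) {
    const char* p = static_cast<const char*>(data);
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, p + sent, len - sent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return false;
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

The loop shows why "completely succeed or completely fail" cannot be relied on: the caller has to be ready to continue from wherever the previous call stopped.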

4.2 Are the Linux man pages really wrong?

The following is quoted from the Linux man pages: http://man7.org/linux/man-pages/man2/send.2.html

If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does not have O_NONBLOCK set, send() shall block until space is available.

When the message does not fit into the send buffer of the socket, send() normally blocks, unless the socket has been placed in nonblocking I/O mode. In nonblocking mode it would fail with the error EAGAIN or EWOULDBLOCK in this case. The select(2) call may be used to determine when it is possible to send more data.

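That is, if the socket buffer cannot hold the data to be sent, send blocks until space is available and then sends.

Based on its experimental results, the IBM article quoted in Section 4.1 concluded that send did not block, and hence that this statement in the man pages is wrong.

My view is the one I gave at the end of Section 3: this is only one possibility. All we can say for certain is that the lock was released and another thread acquired it. Is a failed memory allocation the only way that can happen? I am not sure.

4.3 Does UDP have the same problem as TCP?

According to all the material I collected: no.

UDP datagrams are limited in size (UDP cannot segment data itself, or rather, the peer cannot reassemble it correctly after segmentation), so unlike TCP nobody sends that much data in one call. Each operation goes out quickly (it does not block waiting for memory as in the analysis above), so UDP operations "look" atomic.

Material supporting this view: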

Note that datagram sockets (UDP) are connection-less, so the the datagram packets may be delivered in a sequence that is different from the sequence in which they were sent (even if all the send calls were from within a single thread).

Source: https://stackoverflow.com/questions/1981372/are-parallel-calls-to-send-recv-on-the-same-socket-valid/61246416#61246416

Q: can I concurrently have two threads at the same time write (sendto) on a same file descriptor? How about readfrom and sendt on the same file descriptor at the same time?

A: Yes, but assume two threads A (writes a one message containing "aaaa")
and B which (writes two messages "b" and "c").

You can be assured that B's data won't show up inside A's message:
ie. never aabaa
but you can not assume anything about the order they will be received.

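Source: https://linux-newbie.vger.kernel.narkive.com/wdAJ3Kzt/is-sendto-and-recvfrom-thread-safe#post4

For UDP, multiple threads can read and write the same socket without locking, but a better approach is to give each thread its own socket to avoid contention; SO_REUSEPORT can be used for this.

For TCP, having multiple threads read and write the same socket is usually a design error, because short writes are possible. Suppose you add a lock and a short write occurs: do you hold the lock until the whole message has been sent (whether the I/O is blocking or non-blocking)? If so, the length of your critical section is determined by when the peer receives the data, and a single slow peer can bring your program to a halt.

Summary: for UDP, locking is redundant; for TCP, locking is wrong.

From Chen Shuo's answer on Zhihu: https://www.zhihu.com/question/56899596/answer/150926723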

4.4 Is fwrite thread-safe?

fwrite is implemented by the C/C++ library. It ultimately calls the write system call, with some wrapping on top. For FILE* and related topics, see my earlier post: https://www.cnblogs.com/whuwzp/p/stdin_stdout_fflush_file.html.
This article also demonstrates the thread safety of fwrite experimentally: https://cloud.tencent.com/developer/article/1412015 (my results for the thread safety of write differ from the author's, though: in my tests write was thread-safe as well, even without APPEND mode).

Within a single process, operations on the same FILE* (such as fwrite) are thread-safe. This only holds on POSIX-compliant systems, of course; on Windows, FILE* operations are not thread-safe.

http://gcc.gnu.org/onlinedocs/libstdc++/manual/using_concurrency.html
As an example, the POSIX standard requires that C stdio FILE* operations are atomic. POSIX-conforming C libraries (e.g, on Solaris and GNU/Linux) have an internal mutex to serialize operations on FILE*s. However, you still need to not do stupid things like calling fclose(fs) in one thread followed by an access of fs in another.

From egmkang wang's answer on Zhihu: https://www.zhihu.com/question/40472431/answer/87077691
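
The FILE* claim can be checked with a small experiment. The sketch below assumes a POSIX C library in which each fwrite call locks the stream; records_stay_whole and kRecordSize are hypothetical names. Several threads write fixed-size records of one repeated letter to the same temporary file, and the file is then scanned to see whether any record contains mixed letters.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

constexpr int kRecordSize = 64;  // bytes per record, all the same letter

// Starts `writers` threads (at most 26) that each fwrite `count` records to
// one shared FILE*, then checks that every record in the file is uniform.
bool records_stay_whole(int writers, int count) {
    std::FILE* fp = std::tmpfile();
    if (!fp) return false;

    std::vector<std::thread> pool;
    for (int w = 0; w < writers; ++w) {
        pool.emplace_back([fp, w, count] {
            char rec[kRecordSize];
            for (char& c : rec) c = static_cast<char>('a' + w);
            for (int i = 0; i < count; ++i) {
                std::fwrite(rec, 1, sizeof rec, fp);  // one locked stdio operation
            }
        });
    }
    for (auto& t : pool) t.join();

    std::rewind(fp);  // flushes pending output and seeks back to the start
    bool ok = true;
    long total = 0;
    char rec[kRecordSize];
    while (std::fread(rec, 1, sizeof rec, fp) == sizeof rec) {
        ++total;
        for (char c : rec) {
            if (c != rec[0]) ok = false;  // a record was cut by another writer
        }
    }
    std::fclose(fp);
    return ok && total == static_cast<long>(writers) * count;
}
```

A passing run does not prove thread safety in general; it only fails to find a counterexample on the system it runs on.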

It depends upon which primitives you're using to submit data to the socket.

If you're using write(2), send(2), sendto(2), or sendmsg(2) and the size of your message is small enough to fit entirely within the kernel buffers for the socket, then that entire write will be sent as a block without other data being interspersed.

Source: https://stackoverflow.com/questions/7942595/linux-c-c-socket-send-in-multi-thread-code

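4.5 Discussions on Windows, Java, and C++ Boost

Material saying that under Winsock UDP has no thread-safety problem while TCP does: https://stackoverflow.com/questions/13983398/is-winsock2-thread-safe
Sending from multiple threads on Windows; the conclusion is that neither the order nor freedom from interleaving is guaranteed: https://tangentsoft.net/wskfaq/intermediate.html#threadsafety
A Boost test; the conclusion is that Boost sockets are not thread-safe (multiple threads sending at the same time):
https://stackoverflow.com/questions/11581978/write-to-boostasio-socket-from-different-threads
In Java, data is very likely to interleave unless the two threads are ordered between themselves (never concurrent) or the write is an atomic operation: https://stackoverflow.com/questions/18910022/can-two-threads-use-the-same-socket-at-the-same-time-and-are-there-possible-pro

5. Suggestions and alternatives

Whatever the platform, do not send from multiple threads.

On Linux: switch to a message-queue model. Multiple threads push their data into a queue (kept thread-safe with a mutex or similar), and a single thread alone takes messages out of the queue and sends them to the peer. See libevent and this article: http://pl.atyp.us/content/tech/servers.html.
On Windows: asynchronous I/O or overlapped I/O (I am not very familiar with these).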
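
The message-queue model for Linux could look roughly like the sketch below. It is an illustration only, not code from libevent or from the article linked above: SendQueue, post, and the sink callback are hypothetical names. Producers only touch the queue under a mutex, and a single writer thread hands whole messages to the sink, one at a time.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

class SendQueue {
public:
    // `sink` is called only from the single writer thread, one message at a time.
    explicit SendQueue(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)), writer_([this] { run(); }) {}

    // Drains whatever is still queued, then stops the writer thread.
    ~SendQueue() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_one();
        writer_.join();
    }

    // May be called from any thread.
    void post(std::string msg) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push_back(std::move(msg));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
            if (q_.empty()) return;  // stop requested and nothing left to send
            std::string msg = std::move(q_.front());
            q_.pop_front();
            lk.unlock();             // do not hold the lock while sending
            sink_(msg);
            lk.lock();
        }
    }

    std::function<void(const std::string&)> sink_;
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::string> q_;
    bool stop_ = false;
    std::thread writer_;  // declared last so it starts after the other members exist
};
```

In a real program the sink would wrap a loop around send() on the socket. Because the mutex is released before the sink runs, a slow peer only delays the writer thread, not the threads that post messages, which addresses the objection to locking around send.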

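6. References

IBM article: the original is no longer available; repost: https://www.cnblogs.com/whuwzp/articles/thread_safe
Material saying UDP has no thread-safety problem: https://linux-newbie.vger.kernel.narkive.com/wdAJ3Kzt/is-sendto-and-recvfrom-thread-safe#post4
Material saying that under Winsock UDP has no thread-safety problem while TCP does: https://stackoverflow.com/questions/13983398/is-winsock2-thread-safe
Sending from multiple threads on Windows; the conclusion is that neither the order nor freedom from interleaving is guaranteed: https://tangentsoft.net/wskfaq/intermediate.html#threadsafety
A Boost test; the conclusion is that Boost sockets are not thread-safe (multiple threads sending at the same time):
https://stackoverflow.com/questions/11581978/write-to-boostasio-socket-from-different-threads
In Java, data is very likely to interleave unless the two threads are ordered between themselves (never concurrent) or the write is an atomic operation: https://stackoverflow.com/questions/18910022/can-two-threads-use-the-same-socket-at-the-same-time-and-are-there-possible-pro
The same question as mine: https://stackoverflow.com/questions/1457256/is-it-safe-to-issue-blocking-write-calls-on-the-same-tcp-socket-from-multiple
Discussion of atomicity and thread safety: https://softwareengineering.stackexchange.com/questions/178898/difference-between-atomic-operation-and-thread-safety
Discussion of atomicity and thread safety: https://stackoverflow.com/questions/12347236/which-is-threadsafe-atomic-or-non-atomic
Chen Shuo's assessment on Zhihu of solving this problem with a lock: https://www.zhihu.com/question/56899596/answer/150926723
egmkang wang's answer on Zhihu about the thread safety of fwrite: https://www.zhihu.com/question/40472431/answer/87077691

Getting to the Bottom of It series (3): an investigation and experimental test of the atomicity and thread safety of the socket API (multiple threads calling send and write at the same time)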