Reading the string Source Code in SGI STL (3)

阿新 • Published: 2019-01-25

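7. The replace function

replace is one of the most important functions in basic_string: many operations, including insert, erase, and assignment, are carried out directly or indirectly through it. basic_string provides several overloads of replace. Because replace calls other helper functions, the analysis below starts with those helpers.

In the descriptions that follow, "the original string" means the string being replaced into (that is, the string being modified).

1. The _M_mutate function

_M_mutate prepares the string for replacing the __len1 characters starting at __pos with __len2 characters, and decides whether new memory has to be allocated to do so.

void _M_mutate(size_type __pos, size_type __len1, size_type __len2)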

{

const size_type __old_size = this->size();

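//__new_size is the length of the string after the replacement

const size_type __new_size = __old_size + __len2 - __len1;

//__how_much is the length of the tail of the original string that is kept

const size_type __how_much = __old_size - __pos - __len1;

//this branch is taken when memory must be reallocated: the string is the empty representation, is shared, or is too small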

if(_M_rep() == &_S_empty_rep() || _M_rep()->_M_is_shared() || __new_size > capacity())

{

const allocator_type __a = get_allocator();

_Rep* __r = _Rep::_S_create(__new_size, capacity(), __a);

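//if __pos is nonzero, copy the prefix [0, __pos) of the original string into the new buffer

if(__pos)

traits_type::copy(__r->_M_refdata(), _M_data(), __pos);

//if __how_much is nonzero, copy the kept tail of the original string to the end of the new buffer

if(__how_much)

traits_type::copy(__r->_M_refdata() + __pos + __len2, _M_data() + __pos + __len1, __how_much);

//drop one reference to the old representation and switch to the new one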

_M_rep()->_M_dispose(__a);

_M_data(__r->_M_refdata());

}

else if (__how_much && __len1 != __len2)

{

//no reallocation is needed, but the tail has to be moved because the replaced length changes

traits_type::move(_M_data() + __pos + __len2, _M_data() + __pos + __len1, __how_much);

}

_M_rep()->_M_set_sharable();

_M_rep()->_M_length = __new_size;

//essential: write the terminating character

_M_data()[__new_size] = _Rep::_S_terminal; // grrr. (per 21.3.4)

}

So once _M_mutate returns, the string contains a gap of length __len2 starting at __pos, waiting to be filled.
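The gap-opening behaviour can be sketched outside the library. The following is a minimal illustration and not the library code: SimpleString, make_string, and open_gap are hypothetical names, reference counting is left out, and a plain heap array stands in for _Rep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical stand-in for the string representation (no reference counting).
struct SimpleString {
    std::unique_ptr<char[]> data;
    std::size_t length;
    std::size_t capacity;  // usable characters, not counting the terminator
};

// Illustrative helper: builds a SimpleString holding a copy of text.
SimpleString make_string(const char* text) {
    const std::size_t n = std::strlen(text);
    SimpleString s{std::unique_ptr<char[]>(new char[n + 1]), n, n};
    std::memcpy(s.data.get(), text, n + 1);
    return s;
}

// Replaces the len1 characters at pos with a gap of len2 characters.
// The gap contents are unspecified until the caller fills them.
void open_gap(SimpleString& s, std::size_t pos, std::size_t len1, std::size_t len2) {
    const std::size_t old_size = s.length;
    const std::size_t new_size = old_size + len2 - len1;
    const std::size_t how_much = old_size - pos - len1;  // tail that survives

    if (new_size > s.capacity) {
        // Too small: allocate, then copy the prefix and the tail around the gap.
        std::unique_ptr<char[]> fresh(new char[new_size + 1]);
        if (pos)
            std::memcpy(fresh.get(), s.data.get(), pos);
        if (how_much)
            std::memcpy(fresh.get() + pos + len2, s.data.get() + pos + len1, how_much);
        s.data = std::move(fresh);
        s.capacity = new_size;
    } else if (how_much && len1 != len2) {
        // Enough room: slide the tail in place (the ranges may overlap).
        std::memmove(s.data.get() + pos + len2, s.data.get() + pos + len1, how_much);
    }
    s.length = new_size;
    s.data[new_size] = '\0';
}
```

A caller would invoke open_gap and then write exactly len2 characters at pos, which is the division of labour between _M_mutate and the functions described next.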

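2. The _M_replace_safe function

This function fills the gap of length __n2 that _M_mutate opens at __pos1. The name reflects its precondition: the caller must make sure that __s remains valid across the call to _M_mutate.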

basic_string&

_M_replace_safe(size_type __pos1, size_type __n1, const _CharT* __s, size_type __n2)

{

_M_mutate(__pos1, __n1, __n2);

if (__n2 == 1)

_M_data()[__pos1] = *__s;

else if (__n2)

traits_type::copy(_M_data() + __pos1, __s, __n2);

return *this;

}

3. The replace function

With _M_replace_safe in place, replace itself is easy to write.

basic_string<_CharT, _Traits, _Alloc>&

replace(size_type __pos, size_type __n1, const _CharT* __s, size_type __n2)

{

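//check that the string __s and its length __n2 are valid

__glibcxx_requires_string_len(__s, __n2);

//check that __pos is a valid position in the original string

_M_check(__pos, "basic_string::replace");

//_M_limit(__pos, __n1) clamps the length so that __pos + __n1 does not run past the end of the original string

__n1 = _M_limit(__pos, __n1);

//the following check guards against the result exceeding max_size()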

if (this->max_size() - (this->size() - __n1) < __n2)

__throw_length_error(__N("basic_string::replace"));

bool __left;

//safe path: the representation is shared with other string objects, or __s lies entirely outside this string's buffer

if (_M_rep()->_M_is_shared() || less<const _CharT*>()(__s, _M_data())|| less<const _CharT*>()(_M_data() + this->size(), __s))

return _M_replace_safe(__pos, __n1, __s, __n2);

else if ((__left = __s + __n2 <= _M_data() + __pos) || _M_data() + __pos + __n1 <= __s)

{

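//__s points into this string, but [__s, __s + __n2) does not overlap the replaced range [__pos, __pos + __n1), so the work can be done in place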

const size_type __off = __s - _M_data();

_M_mutate(__pos, __n1, __n2);

if (__left)

traits_type::copy(_M_data() + __pos, _M_data() + __off, __n2);

else

traits_type::copy(_M_data() + __pos, _M_data() + __off + __n2 - __n1, __n2);

return *this;

}

else

{

//the source overlaps the replaced range, so copy it into a temporary object first

const basic_string __tmp(__s, __n2);

return _M_replace_safe(__pos, __n1, __tmp._M_data(), __n2);

}

}
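The three-way decision in replace can be isolated as a small classification routine. The sketch below is illustrative only: SourceKind and classify are hypothetical names, and the check for a shared representation is omitted. std::less is used for the containment test, as in the library code, because comparing pointers into different arrays with < is unspecified while std::less guarantees a total order.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical labels for the branches of replace.
enum class SourceKind { Disjoint, InPlaceLeft, InPlaceRight, Overlapping };

// data/size describe the string being modified, [pos, pos + n1) is the range
// being replaced, and [s, s + n2) is the source sequence.
SourceKind classify(const char* data, std::size_t size,
                    std::size_t pos, std::size_t n1,
                    const char* s, std::size_t n2) {
    std::less<const char*> lt;
    // Source starts before the buffer or after its end: no aliasing at all.
    if (lt(s, data) || lt(data + size, s))
        return SourceKind::Disjoint;
    // Source ends at or before the replaced range: it does not move.
    if (!lt(data + pos, s + n2))
        return SourceKind::InPlaceLeft;
    // Source starts at or after the replaced range: it shifts by n2 - n1.
    if (!lt(s, data + pos + n1))
        return SourceKind::InPlaceRight;
    // Source and replaced range overlap: a temporary copy is needed.
    return SourceKind::Overlapping;
}
```

InPlaceRight corresponds to the branch that copies from __off + __n2 - __n1: once the gap has been opened, everything behind the replaced range has moved by the difference between the new and old lengths.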

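4. The _M_replace_aux function

_M_replace_aux is very similar to _M_replace_safe. The one difference is that it fills the gap with __n2 copies of the character __c, whereas the other functions copy from a character sequence.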

basic_string&

_M_replace_aux(size_type __pos1, size_type __n1, size_type __n2, _CharT __c)

{

if (this->max_size() - (this->size() - __n1) < __n2)

__throw_length_error(__N("basic_string::_M_replace_aux"));

_M_mutate(__pos1, __n1, __n2);

if (__n2 == 1)

_M_data()[__pos1] = __c;

else if (__n2)

traits_type::assign(_M_data() + __pos1, __n2, __c);

return *this;

}
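The effect of the fill variant is visible through the public interface. A small sketch, where replace_with_fill is a hypothetical wrapper around the standard replace(pos, n1, n2, c) overload:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative wrapper: replaces n1 characters at pos with n2 copies of c.
std::string replace_with_fill(std::string s, std::size_t pos, std::size_t n1,
                              std::size_t n2, char c) {
    s.replace(pos, n1, n2, c);
    return s;
}
```

With n1 equal to 0 this is a pure fill-insertion, and with n2 equal to 0 it is a pure erasure, which is why one helper can serve several public functions.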

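5. Summary of replace

Of the other replace overloads in basic_string, twelve are implemented through the replace function above, and two through the _M_replace_aux function above.

8. The insert and erase functions

insert and erase are both built on the replace helpers, so they are fairly simple.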
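The relationship can be checked with the public std::string interface alone. In this sketch insert_via_replace and erase_via_replace are hypothetical helper names; the point is only that an insertion is a replacement of an empty range, and an erasure is a replacement by an empty sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Insertion expressed as "replace zero characters at pos with text".
std::string insert_via_replace(std::string s, std::size_t pos, const std::string& text) {
    s.replace(pos, 0, text);
    return s;
}

// Erasure expressed as "replace n characters at pos with nothing".
std::string erase_via_replace(std::string s, std::size_t pos, std::size_t n) {
    s.replace(pos, n, "");
    return s;
}
```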

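The insert functions:

insert has 8 overloads, which fall into 3 groups by return type. The most important group returns basic_string&.

1. insert overloads returning basic_string&

This overload inserts the __n characters starting at __s at position __pos.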

basic_string&

insert(size_type __pos, const _CharT* __s, size_type __n)

{

__glibcxx_requires_string_len(__s, __n);

_M_check(__pos, "basic_string::insert");

if (this->max_size() - this->size() < __n)

__throw_length_error(__N("basic_string::insert"));

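//same test as in replace: take the safe path if the representation is shared or __s lies outside this string's buffer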

if(_M_rep()->_M_is_shared() || less<const _CharT*>()(__s, _M_data())|| less<const _CharT*>()(_M_data() + this->size(), __s))

return _M_replace_safe(__pos, size_type(0), __s, __n);

else

{

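//__s points into this string. A comment in the original source explains why the temporary variable __off is introduced.

//Would you have thought of this the first time you wrote code like this?

//_M_mutate may reallocate, so the characters may end up at a different address. Because __s points into the buffer behind _M_data(), __s becomes invalid as soon as that buffer changes. The code therefore saves the offset of __s relative to _M_data() and recomputes __s after _M_mutate returns.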

const size_type __off = __s - _M_data();

_M_mutate(__pos, 0, __n);

__s = _M_data() + __off;

_CharT* __p = _M_data() + __pos;

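//the source ends at or before __p, so it was not moved: copy directly

if (__s + __n <= __p)

traits_type::copy(__p, __s, __n);

//the source starts at or after __p, so _M_mutate shifted it right by __n: copy from __s + __n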

else if (__s >= __p)

traits_type::copy(__p, __s + __n, __n);

else

{

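//the source straddles the insertion point, so care is needed not to read characters that have been moved

//the algorithm looks odd at first: the __n source characters are taken in two pieces. The first __nleft characters lie before __p and are copied from __s; the remaining __n - __nleft characters were shifted right by __n and are copied from __p + __n.

//(diagram in the original: positions __s and __p, with segment lengths __nleft and __n - __nleft)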

const size_type __nleft = __p - __s;

traits_type::copy(__p, __s, __nleft);

traits_type::copy(__p + __nleft, __p + __n, __n - __nleft);

}

return *this;

}

}

Five insert overloads return basic_string&. Four of them forward to the implementation above; the remaining one calls _M_replace_aux to insert __n copies of the character __c.
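The offset trick and the three copy cases can be reproduced on an ordinary buffer. The following is a sketch under simplifying assumptions, not the library code: insert_from_self is a hypothetical name, a std::vector<char> plays the role of the string buffer, and the source is identified by its offset src_off from the start, so nothing dangles when the vector reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Inserts the n characters found at offset src_off of text into text itself
// at position pos, and returns the result. Requires src_off + n <= text.size().
std::string insert_from_self(const std::string& text, std::size_t pos,
                             std::size_t src_off, std::size_t n) {
    if (n == 0)
        return text;
    std::vector<char> buf(text.begin(), text.end());

    // Open a gap of n characters at pos. This may reallocate, so raw pointers
    // taken earlier would be invalid; offsets stay meaningful.
    buf.insert(buf.begin() + static_cast<std::ptrdiff_t>(pos), n, '\0');

    char* p = buf.data() + pos;      // start of the gap
    char* s = buf.data() + src_off;  // source, re-derived from its offset

    if (src_off + n <= pos) {
        std::memcpy(p, s, n);        // source lies before the gap, unmoved
    } else if (src_off >= pos) {
        std::memcpy(p, s + n, n);    // source was shifted right by n
    } else {
        const std::size_t nleft = pos - src_off;
        std::memcpy(p, s, nleft);                     // part before the gap
        std::memcpy(p + nleft, p + n, n - nleft);     // part shifted past the gap
    }
    return std::string(buf.begin(), buf.end());
}
```

For example, inserting the characters "bcde" of "abcdef" at position 3 exercises the straddling case: "bc" comes from before the gap and "de" from just after it.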

2. insert overloads returning void

void

insert(iterator __p, size_type __n, _CharT __c)

{

this->replace(__p, __p, __n, __c);

}

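This calls replace. Given the arguments, the overload selected is replace(iterator __i1, iterator __i2, size_type __n, _CharT __c), which in turn ends up in the _M_replace_aux function described above. There are two insert overloads of this kind.

3. insert overloads returning iterator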

iterator

insert(iterator __p, _CharT __c)

{

_GLIBCXX_DEBUG_PEDASSERT(__p >= _M_ibegin() && __p <= _M_iend());

const size_type __pos = __p - _M_ibegin();

_M_replace_aux(__pos, size_type(0), size_type(1), __c);

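//I have to admit that I did not fully understand the purpose of this design.

//My guess is this: because the function returns an iterator, the string must not become shared with other string objects after the insert. Otherwise a write through the returned iterator would also affect the other strings sharing the buffer, and the iterator would be invalidated if the buffer were replaced.

//So this flag is set to mark the string object as one that cannot be shared.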

_M_rep()->_M_set_leaked();

return this->_M_ibegin() + __pos;

}

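There is only one overload of this kind. It inserts a single character and returns an iterator to the inserted position.

The erase functions:

1. erase returning basic_string&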

basic_string&

erase(size_type __pos = 0, size_type __n = npos)

{

return _M_replace_safe(_M_check(__pos, "basic_string::erase"), _M_limit(__pos, __n), NULL, size_type(0));

}

2. erase overloads returning iterator

iterator

erase(iterator __position)

{

_GLIBCXX_DEBUG_PEDASSERT(__position >= _M_ibegin()&& __position < _M_iend());

const size_type __pos = __position - _M_ibegin();

_M_replace_safe(__pos, size_type(1), NULL, size_type(0));

_M_rep()->_M_set_leaked();

return _M_ibegin() + __pos;

}

iterator

erase(iterator __first, iterator __last)

{

_GLIBCXX_DEBUG_PEDASSERT(__first >= _M_ibegin() && __first <= __last && __last <= _M_iend());

const size_type __pos = __first - _M_ibegin();

_M_replace_safe(__pos, __last - __first, NULL, size_type(0));

_M_rep()->_M_set_leaked();

return _M_ibegin() + __pos;

}

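_M_replace_safe was covered earlier, so erase needs no further explanation. It is worth noting that the two iterator-returning erase overloads also set the leaked flag on the string object after calling _M_replace_safe; my reading of this is the same as the guess given earlier.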
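The iterator returned by erase refers to the character that followed the erased one, which is what makes erasing inside a loop workable. A short usage sketch with the public interface, where remove_char is a hypothetical helper:

```cpp
#include <cassert>
#include <string>

// Removes every occurrence of c, continuing from the iterator erase returns.
std::string remove_char(std::string s, char c) {
    for (std::string::iterator it = s.begin(); it != s.end(); ) {
        if (*it == c)
            it = s.erase(it);  // the old iterator is invalid; use the returned one
        else
            ++it;
    }
    return s;
}
```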

9. The operator[] functions

const version:

const_reference

operator[] (size_type __pos) const

{

_GLIBCXX_DEBUG_ASSERT(__pos <= size());

return _M_data()[__pos];

}

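This is very simple: it returns the character directly, as a const_reference. That is a reference to const, so the character cannot be modified through it.

Non-const version: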

reference

operator[](size_type __pos)

{

_GLIBCXX_DEBUG_ASSERT(__pos < size());

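//_M_leak() first unshares the representation if necessary (which may reallocate) and then sets the leaked flag, i.e. it ends up executing _M_rep()->_M_set_leaked();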

_M_leak();

return _M_data()[__pos];

}

My reading of _M_rep()->_M_set_leaked() is the same as the guess given earlier.
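To make the guess concrete, here is a toy copy-on-write string. It is a deliberately simplified model and not how the library is written: CowString, Rep, and share_or_clone are hypothetical names, std::shared_ptr stands in for the hand-written reference count, and the leaked flag is never cleared. It shows what would go wrong without the flag: a reference handed out by the non-const operator[] could later write into a buffer that a newer copy also uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

class CowString {
    struct Rep {
        std::string chars;
        bool leaked;  // true once a mutable reference into chars has escaped
    };
    std::shared_ptr<Rep> rep_;

    // Copies share the representation unless it has been marked as leaked.
    static std::shared_ptr<Rep> share_or_clone(const std::shared_ptr<Rep>& r) {
        if (r->leaked)
            return std::make_shared<Rep>(Rep{r->chars, false});
        return r;
    }

public:
    explicit CowString(const std::string& text)
        : rep_(std::make_shared<Rep>(Rep{text, false})) {}
    CowString(const CowString& other) : rep_(share_or_clone(other.rep_)) {}
    CowString& operator=(const CowString&) = delete;  // not needed for the sketch

    // const access: no write is possible, so sharing is left untouched.
    char operator[](std::size_t pos) const { return rep_->chars[pos]; }

    // non-const access: unshare first, then forbid future sharing.
    char& operator[](std::size_t pos) {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<Rep>(Rep{rep_->chars, false});
        rep_->leaked = true;
        return rep_->chars[pos];
    }

    bool shares_with(const CowString& other) const { return rep_ == other.rep_; }
    const std::string& str() const { return rep_->chars; }
};
```

In this model, copying a string after taking a reference with operator[] produces an independent buffer, so a later write through that reference changes only the string it came from.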
