
C++ [STL]: A Brief Look at Allocators and a Detailed Guide to Using the STL Sequence Container vector

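// 2016-05-13 Added an introduction to allocators
// 2016-05-14 Added miscellaneous notes on vector (the "Supplement" parts of section 2);
// added a caveat about the fill (2) constructor
// added a note on comparing vectors (<, <=, >, >=, ==, !=)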

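1. Introduction

The STL (Standard Template Library) is an important part of C++. It is a powerful set of C++ template classes and functions that implement many widely used algorithms and data structures, such as vectors, linked lists, queues, and stacks.

Components of the STL:

Component            Description
iterator             Iterators are used to traverse the elements of a collection of objects.
container            Containers manage a collection of objects of a given type.
generic algorithm    Algorithms operate on containers through iterators. They provide ways to perform operations on container contents, including initialization, sorting, searching, and transformation.

Categories of containers:

Category                 Examples
Sequential container     vector, list, deque
Associative container    map, multimap, set, multiset
Container adapter        stack, queue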

2. Introduction to the vector class

According to cplusplus.com,

Vectors are sequence containers representing arrays that can change in size.

Just like arrays, vectors use contiguous storage locations for their elements, which means that their elements can also be accessed using offsets on regular pointers to its elements, and just as efficiently as in arrays. But unlike arrays, their size can change dynamically, with their storage being handled automatically by the container.

Internally, vectors use a dynamically allocated array to store their elements. This array may need to be reallocated in order to grow in size when new elements are inserted, which implies allocating a new array and moving all elements to it. This is a relatively expensive task in terms of processing time, and thus, vectors do not reallocate each time an element is added to the container.

Instead, vector containers may allocate some extra storage to accommodate for possible growth, and thus the container may have an actual capacity greater than the storage strictly needed to contain its elements (i.e., its size). Libraries can implement different strategies for growth to balance between memory usage and reallocations, but in any case, reallocations should only happen at logarithmically growing intervals of size so that the insertion of individual elements at the end of the vector can be provided with amortised constant time complexity (see push_back).

Therefore, compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way.

Compared to the other dynamic sequence containers (deques, lists and forward_lists), vectors are very efficient accessing its elements (just like arrays) and relatively efficient adding or removing elements from its end. For operations that involve inserting or removing elements at positions other than the end, they perform worse than the others, and have less consistent iterators and references than lists and forward_lists.

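Analysis:

(1) vector is an upgraded array

vector is a sequence container and can be seen as an upgraded array, mainly because it manages memory and grows dynamically in an efficient way. At its core, though, vector is a class that wraps a dynamically allocated array together with the methods that operate on it.

(2) Memory management in vector

A vector can grow dynamically, but that does not mean every insertion triggers a reallocation. Allocating and freeing memory is relatively expensive, so the number of reallocations should be kept low. As a consequence, the capacity of the container and its current size are generally different: capacity is always greater than or equal to size.

Supplement: as the quoted text says, there are many possible growth strategies. For example, the first growth might add 2 units of storage, the second 4, and the n-th 2^n, with a reallocation happening only when the current storage is exhausted. Because the capacity grows geometrically, the number of reallocations drops sharply: it is roughly logarithmic in the final size.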
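To see the growth strategy of a particular standard library in action, one can count how often capacity() changes while elements are appended. The following is a minimal sketch; the function name count_reallocations is made up for this illustration, and the exact result depends on the implementation's growth factor.

```cpp
#include <cstddef>
#include <vector>

// Push n ints one at a time and count how often capacity() changes,
// which is how often the vector had to reallocate its storage.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // storage was reallocated
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000) typically returns a number in the low tens rather than anything close to 1000, which is the practical effect of geometric growth.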

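(3) Why vector is worth it

Compared with an array, a vector consumes more memory, but in return it manages and grows its storage efficiently, and that cost is usually worth paying.

(4) vector compared with other containers

Compared with the other sequence containers (such as list and deque), vector is very efficient at accessing elements and at adding or removing elements at its end, but inserting or erasing anywhere else is slower than in those containers.

Supplement:

(5) What a vector can hold

A vector can hold objects of most types as its elements. However, a reference is not an object, so there is no such thing as a vector of references. The elements of a vector can themselves be vectors, but note how such a type has to be written: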

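// pre-C++11 standards
vector<vector<int> > // a space is required between the two closing angle brackets
// C++11
vector<vector<int>>  // no space is needed between the two closing angle brackets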

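Time complexity:

container   random access   insert or erase
vector      O(1)            O(n) (amortized O(1) at the end)
list        O(n)            O(1) at a known iterator position
deque       O(1)            O(n) (O(1) at either end)

Even where two containers share the same asymptotic complexity, their memory layouts differ (contiguous storage, linked nodes, chunked arrays), so their actual performance differs as well.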

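With this basic understanding of vector in place, we can turn to the operations it supports.

3. Common vector operations

Note: this is only a list of the operations; detailed explanations follow later.

Operation Description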
(constructor) Construct vector
(destructor) Vector destructor
operator= Assign content
begin Return iterator to beginning
end Return iterator to end
rbegin Return reverse iterator to reverse beginning
rend Return reverse iterator to reverse end
cbegin (c++11) Return const_iterator to beginning
cend (c++11) Return const_iterator to end
crbegin (c++11) Return const_reverse_iterator to reverse beginning
crend (c++11) Return const_reverse_iterator to reverse end
size Return size
max_size Return maximum size
resize Change size
capacity Return size of allocated storage capacity
empty Test whether vector is empty
reserve Request a change in capacity
shrink_to_fit (c++11) Shrink to fit
operator[] Access element
at Access element
front Access first element
back Access last element
data (c++11) Access data
assign Assign vector content
push_back Add element at the end
pop_back Delete last element
insert Insert elements
erase Erase elements
swap Swap content
clear Clear content
emplace (c++11) Construct and insert element
emplace_back (c++11) Construct and insert element at the end
get_allocator Get allocator

The list above comes from cplusplus.com. I would like to add two points:
(1) List initialization of a vector (C++11)

vector<int> v1 = {1, 2, 3, 4, 5};
// or
vector<int> v2{1, 2, 3, 4, 5};
// compare
vector<int> v3(10);     // 10 value-initialized elements (all 0)
vector<int> v4(10, 5);  // 10 elements, each equal to 5

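One way to think about it: if the vector object is followed by parentheses, one of the ordinary constructors is called; if it is followed by braces, list initialization is tried first. Only when the values in the braces cannot be used to initialize the elements does the compiler fall back to the other constructors.
For example: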

vector<string> v5{10};        // 10 empty strings
vector<string> v6{10, "hi"};  // 10 strings, each equal to "hi"

Here the types do not match (10 cannot initialize a string), so the size-based constructors are called instead.
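The difference between parentheses and braces is easy to check by looking at the resulting sizes. A small sketch follows; the function names are invented for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Parentheses select the fill constructor: ten elements, each equal to 5.
std::size_t paren_size() {
    std::vector<int> v(10, 5);
    return v.size();
}

// Braces select list initialization: two elements, 10 and 5.
std::size_t brace_size() {
    std::vector<int> v{10, 5};
    return v.size();
}

// 10 cannot initialize a std::string, so the braces fall back to the
// fill constructor: ten strings, each equal to "hi".
std::vector<std::string> brace_fallback() {
    std::vector<std::string> v{10, "hi"};
    return v;
}
```

With these definitions, paren_size() yields 10, brace_size() yields 2, and brace_fallback() holds ten copies of "hi".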
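(2) Comparing vector objects
First of all, two vectors can be compared only if their element type supports the comparison.
Two vectors are equal (==) when they contain the same number of elements and the elements at corresponding positions are equal.
Inequality (!=) is simply the negation of equality (==).
Ordering: if the elements of one vector are exactly the leading elements of the other (a proper prefix), the shorter one is smaller; otherwise the result is decided by the first pair of corresponding elements that differ, and the vector with the smaller element is the smaller vector. (This is lexicographical order, the same way strings are compared.)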
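The comparison rules can be tried out directly. The sketch below uses made-up function names and plain int elements; each function returns true when the rule it is named after holds.

```cpp
#include <vector>

// A proper prefix is smaller than the longer vector it is a prefix of.
bool prefix_is_smaller() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{1, 2, 3, 4};
    return a < b;
}

// The first mismatching pair (3 vs 9) decides; a being longer is irrelevant.
bool first_mismatch_decides() {
    std::vector<int> a{1, 2, 3, 100, 200};
    std::vector<int> b{1, 2, 9};
    return a < b;
}

// Equality needs the same size and equal elements at every position.
bool same_contents_equal() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{1, 2, 3};
    return a == b && !(a != b);
}
```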

4. vector operations in detail

(1)Constructor

// default (1)  
explicit vector (const allocator_type& alloc = allocator_type());
// fill (2) 
explicit vector (size_type n, const value_type& val = value_type(),
                 const allocator_type& alloc = allocator_type());
// range (3)    
template <class InputIterator>
         vector (InputIterator first, InputIterator last,
                 const allocator_type& alloc = allocator_type());
// copy (4) 
vector (const vector& x);

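A constructor is called when the object is created; it allocates memory dynamically and initializes the container.
Analysis:
default (1): the default constructor; it creates an empty container.
fill (2): constructs a container of size n in which every element is a copy of val.
Note: when n is given without val, the elements are value-initialized, so the element type must be default-constructible. If it is not, the call fails to compile and an initial value must be supplied.
range (3): takes two iterators (raw pointers also work, since a pointer is one kind of iterator) and copies the elements between them into the vector, which is constructed with the matching size.
Note: the copied range is [first, last).
copy (4): the copy constructor; it takes a vector and copies its elements.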
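As a quick illustration of the fill, range, and copy constructors, consider the following sketch (the helper names are hypothetical and exist only for this example):

```cpp
#include <vector>

// fill (2): n copies of val.
std::vector<int> make_filled() {
    return std::vector<int>(3, 7);                    // {7, 7, 7}
}

// range (3): copies [first, last); raw pointers serve as iterators here.
std::vector<int> make_from_range() {
    int source[] = {10, 20, 30, 40, 50};
    return std::vector<int>(source + 1, source + 4);  // {20, 30, 40}
}

// copy (4): an independent copy; changing it leaves the original untouched.
bool copy_is_independent() {
    std::vector<int> original(2, 1);
    std::vector<int> copy(original);
    copy[0] = 99;
    return original[0] == 1 && copy[0] == 99;
}
```

Note that source + 4 is not copied by the range constructor, because the range is half-open.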

Supplement:

alloc
Allocator object.
The container keeps and uses an internal copy of this allocator.
Member type allocator_type is the internal allocator type used by the container, defined in vector as an alias of its second template parameter (Alloc).
If allocator_type is an instantiation of the default allocator (which has no state), this is not relevant.

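Internally, vector uses its own allocator to allocate and release dynamic memory, so the implementation generally does not call new and delete directly. This keeps allocation and deallocation in one place and makes them safer.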

// copy (1) 
 vector& operator= (const vector& x);

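copy (1): overloads the = operator to copy one vector into another. An implementation has to take care not to leak the memory held before the assignment.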

(2)Destructor

// destruct
~vector();

The destructor is called when the object is destroyed; it destroys the elements and releases the memory, preventing leaks.

(3)Iterators

// begin()
      iterator begin();
const_iterator begin() const;

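begin(): returns an iterator to the first element. It can return either iterator or const_iterator, depending on the vector object: a const vector yields a const_iterator, a non-const vector yields an iterator.
Note: do not confuse begin() with front(). begin() returns an iterator, whereas front() returns a reference to the first element.

C++11 adds cbegin(), which always returns a const_iterator.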

const_iterator cbegin() const noexcept;

A const_iterator is an iterator that points to const content. This iterator can be increased and decreased (unless it is itself also const), just like the iterator returned by vector::begin, but it cannot be used to modify the contents it points to, even if the vector object is not itself const.

If the container is empty, the returned iterator value shall not be dereferenced.

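Notes:
(1) A const_iterator can be incremented and decremented as long as the iterator object itself is not const, but it can never be used to modify the element it points to, even when the vector is not const. If the iterator object is itself declared const, it cannot be incremented or decremented either.
(2) If the container is empty, begin() and end() return the same iterator (the same holds for cbegin() and cend()); it does not point to any element and must not be dereferenced.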

// rbegin()
      reverse_iterator rbegin();
const_reverse_iterator rbegin() const;

rbegin(): returns a reverse iterator pointing to the last element.

Likewise, C++11 defines crbegin(), which always returns a const_reverse_iterator.

// crbegin()
const_reverse_iterator crbegin() const noexcept;

crbegin(): returns a const_reverse_iterator pointing to the last element.

// end()
      iterator end();
const_iterator end() const;

end(): returns an iterator to the imaginary element right after the last element (it does not actually exist); the technical term is the past-the-end element.

The past-the-end element is the theoretical element that would follow the last element in the vector. It does not point to any element, and thus shall not be dereferenced.

Because the ranges used by functions of the standard library do not include the element pointed by their closing iterator, this function is often used in combination with vector::begin to specify a range including all the elements in the container.

Likewise, C++11 defines cend(), which always returns a const_iterator.

// cend()
const_iterator cend() const noexcept;

cend(): returns a const_iterator to the past-the-end element.

// rend()
reverse_iterator rend();
const_reverse_iterator rend() const;

rend(): returns a reverse iterator to the imaginary element preceding the first element.

Returns a reverse iterator pointing to the theoretical element preceding the first element in the vector (which is considered its reverse end).

Likewise, C++11 defines crend(), which always returns a const_reverse_iterator.

// crend()
const_reverse_iterator crend() const noexcept;

crend(): returns a const_reverse_iterator to the imaginary element preceding the first element.

Returns a const_reverse_iterator pointing to the theoretical element preceding the first element in the container (which is considered its reverse end).

Summary: begin() and end() together give forward traversal; rbegin() and rend() together give reverse traversal.
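The two traversal directions look almost identical in code; only the iterator type and the begin/end pair change. A minimal sketch with invented function names:

```cpp
#include <vector>

// Forward traversal over [begin(), end()).
std::vector<int> forward_copy(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        out.push_back(*it);
    }
    return out;
}

// Reverse traversal over [rbegin(), rend()): ++ moves towards the front.
std::vector<int> reverse_copy_of(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::vector<int>::const_reverse_iterator it = v.rbegin(); it != v.rend(); ++it) {
        out.push_back(*it);
    }
    return out;
}
```

Both loops also handle an empty vector correctly, because begin() == end() and rbegin() == rend() in that case and the body never runs.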

(4)Capacity

// size()
size_type size() const;

size(): returns the number of elements in the container (as a size_type, an unsigned integer type).

// max_size()
size_type max_size() const;

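max_size(): returns the maximum number of elements the container can hold (as a size_type).

You may wonder why both max_size() and capacity() exist if they seem so similar.
The explanation from cplusplus.com: max_size is the theoretical limit on the size of the container, whereas capacity is the number of elements that fit in the currently allocated storage.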

// capacity()
size_type capacity() const;

capacity(): returns the size of the currently allocated storage, expressed as a number of elements.

Notice that this capacity does not suppose a limit on the size of the vector. When this capacity is exhausted and more is needed, it is automatically expanded by the container (reallocating it storage space). The theoretical limit on the size of a vector is given by member max_size.

// resize()
void resize (size_type n, value_type val = value_type());

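resize():
Changes the size of the container to n. There are three cases:
(1) If n < size, only the first n elements are kept; the rest are destroyed.
(2) If size < n <= capacity, copies of val are appended until the size is n; no reallocation takes place.
(3) If n > capacity, the storage is reallocated first, and the added elements are again copies of val.
Note: as long as n <= capacity, no reallocation happens and only size changes. When n < size, the surplus elements are destroyed (through the allocator's destroy function), but the storage stays allocated. In other words, in that case resize() changes size, not capacity.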

// empty()
bool empty() const;

empty(): tests whether the container is empty (whether size is 0). Returns true if it is empty, false otherwise.

// reserve()
void reserve (size_type n);

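reserve(): if the capacity is less than n, the storage is reallocated so that the capacity becomes at least n; if the capacity is already greater than or equal to n, nothing happens and the capacity is unaffected. The size never changes.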

Requests that the vector capacity be at least enough to contain n elements.

// shrink_to_fit()
void shrink_to_fit();

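shrink_to_fit(): asks the container to reduce capacity to size. The request is non-binding, so an implementation may leave the capacity larger than the size.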

Requests the container to reduce its capacity to fit its size.
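The interplay between size and capacity in this section can be observed with a few small experiments. This is only a sketch: the struct and function names are made up, and exact capacity values are implementation-defined, so the code only relies on capacity being at least the reserved amount.

```cpp
#include <cstddef>
#include <vector>

struct SizeAndCapacity {
    std::size_t size;
    std::size_t capacity;
};

// reserve() changes only capacity; the vector still holds no elements.
SizeAndCapacity after_reserve() {
    std::vector<int> v;
    v.reserve(100);
    SizeAndCapacity s = {v.size(), v.capacity()};
    return s;
}

// Growing with resize() appends copies of val; n stays within capacity here.
SizeAndCapacity after_resize_up() {
    std::vector<int> v;
    v.reserve(100);
    v.resize(5, 7);  // {7, 7, 7, 7, 7}
    SizeAndCapacity s = {v.size(), v.capacity()};
    return s;
}

// Shrinking with resize() destroys the tail but keeps the allocated storage.
bool shrinking_keeps_capacity() {
    std::vector<int> v(50, 1);
    std::size_t before = v.capacity();
    v.resize(2);
    return v.size() == 2 && v.capacity() == before;
}
```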

(5)Element access

// operator []()
      reference operator[] (size_type n);
const_reference operator[] (size_type n) const;

operator[]: overloads the [] operator and returns a reference to the element at position n.

// at()
      reference at (size_type n);
const_reference at (size_type n) const;

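at(): returns a reference to the element at position n.

You may again wonder why both [] and at() exist when they do the same thing. There is a difference: [] performs no bounds checking, so an out-of-range index is possible (and is undefined behavior), whereas at() checks the index and throws an exception when it is out of range.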

A similar member function, vector::at, has the same behavior as this operator function, except that vector::at is bound-checked and signals if the requested position is out of range by throwing an out_of_range exception.

As for []:
Portable programs should never call this function with an argument n that is out of range, since this causes undefined behavior.
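The bounds check performed by at() can be turned into a safe lookup helper. The sketch below assumes a hypothetical function name try_get; it catches the std::out_of_range exception that at() throws.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true and writes the element to out when index is valid;
// returns false when at() reports the index as out of range.
bool try_get(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);  // bounds-checked access
        return true;
    } catch (const std::out_of_range&) {
        return false;       // v[index] here would be undefined behavior
    }
}
```

In hot loops where the index is already known to be valid, operator[] avoids the cost of the check; at() is the safer choice when the index comes from outside.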

// front()
      reference front();
const_reference front() const;

front(): returns a reference to the first element.

Calling this function on an empty container causes undefined behavior.
So make sure the container is not empty before calling front(); the same applies to back().

// back()
      reference back();
const_reference back() const;

back(): returns a reference to the last element in the container.

C++11 also defines data(), which returns a pointer to the internal array:

// data()
      value_type* data() noexcept;
const value_type* data() const noexcept;

data(): returns a pointer to the array the vector uses internally.

Returns a direct pointer to the memory array used internally by the vector to store its owned elements.

(6)Modifiers

// assign()
// range (1)    
template <class InputIterator>
  void assign (InputIterator first, InputIterator last);
// fill (2) 
void assign (size_type n, const value_type& val);

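range (1): takes two iterators and assigns the elements between them to the vector, much like the range constructor vector (InputIterator first, InputIterator last, const allocator_type& alloc = allocator_type());
fill (2): changes the size to n and sets every element to val.
Note: if the number of new elements is greater than capacity, the storage is reallocated;
otherwise no reallocation takes place and only size changes.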

// push_back()
void push_back (const value_type& val);

push_back(): adds val at the end of the container, that is, right after the current last element.

Note: vector has no push_front() function.

// pop_back()
void pop_back();

pop_back(): removes the last element.
Note: vector has no pop_front() function either.

// insert()

// single element (1)   
iterator insert (iterator position, const value_type& val);
// fill (2) 
    void insert (iterator position, size_type n, const value_type& val);
// range (3)    
template <class InputIterator>
    void insert (iterator position, InputIterator first, InputIterator last);

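single element (1): inserts val before the element the iterator points to.
fill (2): inserts n copies of val before the element the iterator points to.
range (3): inserts the elements in [first, last) before the element the iterator points to.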

// erase()
iterator erase (iterator position);
iterator erase (iterator first, iterator last);

erase(): removes the element at position, or the elements in the range [first, last), and returns an iterator to the element that followed the last erased element.

An iterator pointing to the new location of the element that followed the last element erased by the function call. This is the container end if the operation erased the last element in the sequence.
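When the last element of the vector is erased, the return value equals end().

Although vector has no push_front() or pop_front(), elements can still be inserted or removed at the front: insert() and erase() do exactly that, just not very efficiently, because every following element has to be shifted.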
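The following sketch shows front insertion and removal built from insert() and erase(), plus the usual pattern for erasing while iterating, which relies on the iterator that erase() returns. The free functions push_front, pop_front and erase_evens are illustrative helpers, not members of vector.

```cpp
#include <vector>

// Emulates push_front: every existing element shifts one slot to the right.
void push_front(std::vector<int>& v, int value) {
    v.insert(v.begin(), value);
}

// Emulates pop_front; the caller must make sure v is not empty.
void pop_front(std::vector<int>& v) {
    v.erase(v.begin());
}

// Erases every even value, continuing from the iterator erase() returns,
// because the erased position and everything after it are invalidated.
void erase_evens(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ) {
        if (*it % 2 == 0) {
            it = v.erase(it);  // now refers to the element after the erased one
        } else {
            ++it;
        }
    }
}
```

Each call to push_front or pop_front costs O(n), so code that needs frequent insertion at the front is usually better served by deque or list.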

// swap()
void swap (vector& x);

swap(): exchanges the contents of two vector objects of the same type.

Exchanges the content of the container by the content of x, which is another vector object of the same type. Sizes may differ.

After the call to this member function, the elements in this container are those which were in x before the call, and the elements of x are those which were in this. All iterators, references and pointers remain valid for the swapped objects.

// clear()
void clear();

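clear(): removes all elements from the container, leaving size at 0.
Note: clear() does not necessarily release the memory the vector has allocated; it only destroys the elements.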
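Whether clear() gives memory back can be checked by comparing capacity() before and after. The sketch below also shows the classic swap-with-a-temporary idiom for actually releasing the storage. The function names are invented, and the capacity behavior shown is what mainstream implementations do rather than something this article has verified for every library.

```cpp
#include <cstddef>
#include <vector>

// clear() destroys the elements; the storage stays allocated.
bool clear_keeps_capacity() {
    std::vector<int> v(1000, 1);
    std::size_t before = v.capacity();
    v.clear();
    return v.empty() && v.capacity() == before;
}

// Swapping with an empty temporary hands the old storage to the temporary,
// which frees it when it is destroyed at the end of the statement.
std::size_t capacity_after_swap_trick() {
    std::vector<int> v(1000, 1);
    std::vector<int>().swap(v);
    return v.capacity();
}
```

Since C++11, calling clear() followed by shrink_to_fit() expresses the same intent, although that request is non-binding.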

// emplace()
template <class... Args>
iterator emplace (const_iterator position, Args&&... args);
// emplace_back()
template <class... Args>
  void emplace_back (Args&&... args);

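emplace() and emplace_back() both insert an element: the former at a given position, like insert(), and the latter at the end.

Both were introduced in C++11. They construct the new element in place from the given arguments, so they can be more efficient than insert() and push_back(), which need an already constructed object that is then copied or moved into the container.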

emplace_back avoids the extra copy or move operation required when using push_back.
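One way to make the saved copy or move visible is an element type that counts how often it is copied or moved. In the sketch below, Tracked is a hypothetical type written only for this experiment, and reserve() is called first so that reallocation does not add moves of its own.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied or moved.
struct Tracked {
    static int copies_and_moves;
    std::string name;
    int id;
    Tracked(const std::string& n, int i) : name(n), id(i) {}
    Tracked(const Tracked& o) : name(o.name), id(o.id) { ++copies_and_moves; }
    Tracked(Tracked&& o) noexcept : name(std::move(o.name)), id(o.id) { ++copies_and_moves; }
};
int Tracked::copies_and_moves = 0;

// push_back needs a finished Tracked, so the temporary is moved into the vector.
int transfers_with_push_back() {
    Tracked::copies_and_moves = 0;
    std::vector<Tracked> v;
    v.reserve(4);  // rule out moves caused by reallocation
    v.push_back(Tracked("a", 1));
    return Tracked::copies_and_moves;
}

// emplace_back forwards the arguments and constructs the element in place.
int transfers_with_emplace_back() {
    Tracked::copies_and_moves = 0;
    std::vector<Tracked> v;
    v.reserve(4);
    v.emplace_back("a", 1);
    return Tracked::copies_and_moves;
}
```

For cheap-to-move types the difference is small, and when an existing object is passed in, emplace_back has to copy or move it just as push_back does.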

(7)Allocator

In C++ computer programming, allocators are an important component of the C++ Standard Library. The standard library provides several data structures, such as list and set, commonly referred to as containers. A common trait among these containers is their ability to change size during the execution of the program. To achieve this, some form of dynamic memory allocation is usually required. Allocators handle all the requests for allocation and deallocation of memory for a given container. The C++ Standard Library provides general-purpose allocators that are used by default, however, custom allocators may also be supplied by the programmer.

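An allocator is the component that allocates and releases memory dynamically. So why does vector go through an allocator instead of using new and delete directly?
The reason is to reduce overhead. Allocating and releasing memory with new and delete is relatively expensive, and doing it many times can noticeably slow a program down. An allocator centralizes memory management instead. The scheme described below is the two-level allocator of the SGI STL; other standard library implementations may work differently. How does it reduce the overhead?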

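Part                      Role
First-level allocator     Allocates and releases memory with malloc() and free(). It emulates C++'s set_new_handler() to deal with out-of-memory situations.
Second-level allocator    Maintains 16 free lists that serve 16 sizes of small blocks. Its memory pool is obtained with malloc(); if memory runs out, it calls the first-level allocator. Requests larger than 128 bytes are passed to the first-level allocator.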

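How the allocator works:
A request for memory goes to the second-level allocator first; if it cannot serve the request, the first-level allocator is called instead. The second-level allocator manages a memory pool organized as 16 free lists. All nodes on one list have the same size, while nodes on different lists have different sizes. When a request can be served by the second-level allocator, a block is taken off the matching list (a node is removed); when memory is released, the block is put back on the list, which is how memory gets recycled.

As a result, most small allocations and deallocations made on behalf of a container need no call to the system allocator; they only adjust a few pointers in a free list. This greatly reduces the overhead and makes vector more efficient.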
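The free-list idea can be sketched in a few dozen lines. What follows is a deliberately simplified illustration, not the actual library code: FixedBlockPool, round_up, free_list_index and is_small_request are names chosen for this sketch, a single list for one block size stands in for the 16 lists, and each new block comes straight from malloc() instead of being carved out of a larger pool.

```cpp
#include <cstddef>
#include <cstdlib>

const std::size_t kAlign = 8;            // size classes are multiples of 8 bytes
const std::size_t kMaxSmallBlock = 128;  // larger requests bypass the free lists

// Round a request up to the next multiple of 8 bytes.
std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~(kAlign - 1);
}

// Index (0..15) of the free list serving a request of 1..128 bytes.
std::size_t free_list_index(std::size_t bytes) {
    return (bytes + kAlign - 1) / kAlign - 1;
}

bool is_small_request(std::size_t bytes) {
    return bytes <= kMaxSmallBlock;
}

// One free list for one block size (simplified, hypothetical class).
class FixedBlockPool {
public:
    explicit FixedBlockPool(std::size_t block_size)
        : block_size_(round_up(block_size < sizeof(Node) ? sizeof(Node) : block_size)),
          head_(nullptr) {}
    FixedBlockPool(const FixedBlockPool&) = delete;
    FixedBlockPool& operator=(const FixedBlockPool&) = delete;

    // Pop a block off the list; fall back to malloc when the list is empty.
    void* allocate() {
        if (head_ == nullptr) return std::malloc(block_size_);
        Node* block = head_;
        head_ = head_->next;
        return block;
    }

    // Push the block back onto the list instead of returning it to the system.
    void deallocate(void* p) {
        Node* block = static_cast<Node*>(p);
        block->next = head_;
        head_ = block;
    }

    // Hand every cached block back to the system.
    ~FixedBlockPool() {
        while (head_ != nullptr) {
            Node* next = head_->next;
            std::free(head_);
            head_ = next;
        }
    }

private:
    struct Node { Node* next; };  // the link lives inside the free block itself
    std::size_t block_size_;
    Node* head_;
};
```

After a deallocate(), the next allocate() returns the very same block without touching malloc(), which is the saving the free lists are designed to provide.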

[Figures Allocator1 through Allocator6: diagrams illustrating the allocator, reposted from another source rather than original to this article; the images are not available here.]

// get_allocator
allocator_type get_allocator() const;

get_allocator: returns a copy of the allocator object used by the vector.

Returns a copy of the allocator object associated with the vector.

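The example given by cplusplus.com (allocator::construct and allocator::destroy were deprecated in C++17 and removed in C++20, so it compiles as written only under earlier standards):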

// vector::get_allocator
#include <iostream>
#include <vector>

int main ()
{
  std::vector<int> myvector;
  int * p;
  unsigned int i;

  // allocate an array with space for 5 elements using vector's allocator:
  p = myvector.get_allocator().allocate(5);

  // construct values in-place on the array:
  for (i=0; i<5; i++) myvector.get_allocator().construct(&p[i],i);

  std::cout << "The allocated array contains:";
  for (i=0; i<5; i++) std::cout << ' ' << p[i];
  std::cout << '\n';

  // destroy and deallocate:
  for (i=0; i<5; i++) myvector.get_allocator().destroy(&p[i]);
  myvector.get_allocator().deallocate(p,5);

  return 0;
}

Output:
The allocated array contains: 0 1 2 3 4
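For newer language standards, the same steps can be written with std::allocator_traits (C++11), which remains available after the member functions construct and destroy were removed from std::allocator. This is a sketch of one possible adaptation, not part of the cplusplus.com example; the function name sum_of_allocated_values is made up, and it returns the sum of the stored values instead of printing them.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Same steps as the example above, routed through std::allocator_traits.
int sum_of_allocated_values() {
    std::vector<int> myvector;
    typedef std::vector<int>::allocator_type Alloc;
    typedef std::allocator_traits<Alloc> Traits;

    Alloc alloc = myvector.get_allocator();
    const std::size_t count = 5;

    // Raw, uninitialized storage for 5 ints.
    int* p = Traits::allocate(alloc, count);

    // Construct the values 0..4 in place.
    for (std::size_t i = 0; i < count; ++i) {
        Traits::construct(alloc, p + i, static_cast<int>(i));
    }

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) sum += p[i];

    // Destroy the elements, then release the storage.
    for (std::size_t i = 0; i < count; ++i) Traits::destroy(alloc, p + i);
    Traits::deallocate(alloc, p, count);
    return sum;
}
```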

These are my own views. Criticism and suggestions are welcome; let's discuss!