# 三十二 ostream
繼續以一個hello world程序為例,但是這次使用ostream
```
#include <iostream>
int main()
{
std::cout << "Hello, world!\n";
}
```
幾乎所有關于c++的書籍都會提到<<操作支持很多數據類型。這些支持是在ostream中完成的。通過反匯編之后的代碼,可以看到ostream中的<<操作被調用:
```
$SG37112 DB ’Hello, world!’, 0aH, 00H
_main PROC
push OFFSET $SG37112
push OFFSET ?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::cout
call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?
$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
add esp, 8
xor eax, eax
ret 0
_main ENDP
```
對示例程序做如下修改:
```
#include <iostream>
int main()
{
std::cout << "Hello, " << "world!\n";
}
```
同樣的,從許多C++書籍中可以知道,ostream的輸出操作的運算結果可以用作下一次輸出(即ostream的輸出操作返回ostream對象)。
```
$SG37112 DB ’world!’, 0aH, 00H
$SG37113 DB ’Hello, ’, 00H
_main PROC
push OFFSET $SG37113 ; ’Hello, ’
push OFFSET ?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::cout
call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?
$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
add esp, 8
push OFFSET $SG37112 ; ’world!’
push eax ; result of previous function
call ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?
$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
add esp, 8
xor eax, eax
ret 0
_main ENDP
```
如果用函數f()替換<<運算符,示例代碼等價于:
```
f(f(std::cout, "Hello, "), "world!")
```
通過GCC生成的代碼和MSVC的代碼基本相同。
引用: 在c++中,引用和指針一樣,但是使用的時候更安全,因為在使用引用的時候幾乎不會發生錯誤。例如,引用必須始終指向一個相應類型的對象,而不能為NULL。甚至于,引用不能被改變,不能將一個對象的引用重新賦值以指向另一個對象。 如果我們嘗試修改指針的例子(9),將指針替換為引用:
```
void f2 (int x, int y, int & sum, int & product)
{
sum=x+y;
product=x*y;
};
```
可以想到,編譯后的代碼和使用指針生成的代碼一致。
```
_x$ = 8 ; size = 4
_y$ = 12 ; size = 4
_sum$ = 16 ; size = 4
_product$ = 20 ; size = 4
?f2@@YAXHHAAH0@Z PROC ; f2
mov ecx, DWORD PTR _y$[esp-4]
mov eax, DWORD PTR _x$[esp-4]
lea edx, DWORD PTR [eax+ecx]
imul eax, ecx
mov ecx, DWORD PTR _product$[esp-4]
push esi
mov esi, DWORD PTR _sum$[esp]
mov DWORD PTR [esi], edx
mov DWORD PTR [ecx], eax
pop esi
ret 0
?f2@@YAXHHAAH0@Z ENDP ; f2
```
## STL
注意:本章所有這些例子只在32位環境下進行了驗證,沒有在x64環境下驗證
```
std::string
```
內部實現 許多string庫的實現結構包含一個指向字符串緩沖區的指針,一個包含當前字符串長度的變量以及一個表示當前字符串緩沖區大小的變量。為了能夠將緩沖區指針傳遞給使用ASCII字符串的函數,通常string緩沖區中的字符串以0結尾。 C++標準中沒有規定std::string應該如何實現,因此通常是按照上述方式實現的。 按照規定,std::string應該是一個模板而不是類,以便能夠支持不同的字符類型,如char、wchar_t等。
對于std::string,MSVC和GCC中的內部實現存在差異,下面依次進行說明
## MSVC
MSVC的實現中,字符串存儲在適當的位置,不一定位于指針指向的緩沖區(如果字符串的長度小于16個字符)。 這意味著短的字符串在32位環境下至少占據16+4+4=24字節的空間,在64位環境下至少占據16+8+8=32字節,當字符串長度大于16字符時,相應的需要增加字符串自身的長度。
```
#include <string>
#include <stdio.h>
struct std_string
{
union
{
char buf[16];
char* ptr;
} u;
size_t size; // AKA ’Mysize’ in MSVC
size_t capacity; // AKA ’Myres’ in MSVC
};
void dump_std_string(std::string s)
{
struct std_string *p=(struct std_string*)&s;
printf ("[%s] size:%d capacity:%d\n", p->size>16 ? p->u.ptr : p->u.buf, p->size, p->
capacity);
};
int main()
{
std::string s1="short string";
std::string s2="string longer that 16 bytes";
dump_std_string(s1);
dump_std_string(s2);
// that works without using c_str()
printf ("%s\n", &s1);
printf ("%s\n", s2);
};
```
通過源代碼可以清晰的看到這些。 如果字符串長度小于16個符號,存儲字符的緩沖區不需要在堆上分配。實際上非常適宜這樣做,因為大量的字符串確實都較短。顯然,微軟的開發人員認為16個字符是好的臨界點。 在main函數尾部,雖然沒有使用c_str()方法,但是如果編譯運行上面的代碼,所有字符串都將打印在控制臺上。 當字符串的長度小于16個字符時,存儲字符串的緩沖區位于std::string對象的開始位置,printf函數將指針當做指向以0結尾的字符數組,因此上述代碼可以正常運行。 第二個超過16字符的字符串的打印方式比較危險,通常程序員犯的錯誤是忘記寫c_str()。這在很長的一段時間不會引起人的注意,直到一個很長的字符串出現,然后程序崩潰。而上述代碼可以工作,因為指向字符串緩沖區的指針位于結構體的開始。
## GCC
GCC的實現中,增加了一個引用計數, 一個有趣的事實是一個指向std::string類實例的指針并不是指向結構體的起始位置,而是指向緩沖區的指針,在libstdc++-v3\include\bits\basic_string.h,中我們可以看到這主要是為了方便調試。
> The reason you want _M_data pointing to the character %array and not the _Rep is so that the debugger can see the string contents. (Probably we should add a non-inline member to get the _Rep for the debugger to use, so users can check the actual string length.)
在我的例子中將考慮這一點:
```
#include <string>
#include <stdio.h>
struct std_string
{
size_t length;
size_t capacity;
size_t refcount;
};
void dump_std_string(std::string s)
{
char *p1=*(char**)&s; // GCC type checking workaround
struct std_string *p2=(struct std_string*)(p1-sizeof(struct std_string));
printf ("[%s] size:%d capacity:%d\n", p1, p2->length, p2->capacity);
};
int main()
{
std::string s1="short string";
std::string s2="string longer that 16 bytes";
dump_std_string(s1);
dump_std_string(s2);
// GCC type checking workaround:
printf ("%s\n", *(char**)&s1);
printf ("%s\n", *(char**)&s2);
};
```
由于GCC有較強的類型檢查,因此需要技巧來隱藏類似之前的錯誤,即使不使用c_str(),printf也能夠正常工作。
更復雜的例子
```
#include <string>
#include <stdio.h>
int main()
{
std::string s1="Hello, ";
std::string s2="world!\n";
std::string s3=s1+s2;
printf ("%s\n", s3.c_str());
}
$SG39512 DB ’Hello, ’, 00H
$SG39514 DB ’world!’, 0aH, 00H
$SG39581 DB ’%s’, 0aH, 00H
_s2$ = -72 ; size = 24
_s3$ = -48 ; size = 24
_s1$ = -24 ; size = 24
_main PROC
sub esp, 72 ; 00000048H
push 7
push OFFSET $SG39512
lea ecx, DWORD PTR _s1$[esp+80]
mov DWORD PTR _s1$[esp+100], 15 ; 0000000fH
mov DWORD PTR _s1$[esp+96], 0
mov BYTE PTR _s1$[esp+80], 0
call ?assign@?$basic_string@DU?$char_traits@D@std@@V?
$allocator@D@2@@std@@QAEAAV12@PBDI@Z ; std::basic_string<char,std::char_traits<char>,std::
allocator<char> >::assign
push 7
push OFFSET $SG39514
lea ecx, DWORD PTR _s2$[esp+80]
mov DWORD PTR _s2$[esp+100], 15 ; 0000000fH
mov DWORD PTR _s2$[esp+96], 0
mov BYTE PTR _s2$[esp+80], 0
call ?assign@?$basic_string@DU?$char_traits@D@std@@V?
$allocator@D@2@@std@@QAEAAV12@PBDI@Z ; std::basic_string<char,std::char_traits<char>,std::
allocator<char> >::assign
lea eax, DWORD PTR _s2$[esp+72]
push eax
lea eax, DWORD PTR _s1$[esp+76]
push eax
lea eax, DWORD PTR _s3$[esp+80]
push eax
call ??$?HDU?$char_traits@D@std@@V?$allocator@D@1@@std@@YA?AV?$basic_string@DU?
$char_traits@D@std@@V?$allocator@D@2@@0@ABV10@0@Z ; std::operator+<char,std::char_traits<char
>,std::allocator<char> >
; inlined c_str() method:
cmp DWORD PTR _s3$[esp+104], 16 ; 00000010H
lea eax, DWORD PTR _s3$[esp+84]
cmovae eax, DWORD PTR _s3$[esp+84]
push eax
push OFFSET $SG39581
call _printf
add esp, 20 ; 00000014H
cmp DWORD PTR _s3$[esp+92], 16 ; 00000010H
jb SHORT $LN119@main
push DWORD PTR _s3$[esp+72]
call ??3@YAXPAX@Z ; operator delete
add esp, 4
$LN119@main:
cmp DWORD PTR _s2$[esp+92], 16 ; 00000010H
mov DWORD PTR _s3$[esp+92], 15 ; 0000000fH
mov DWORD PTR _s3$[esp+88], 0
mov BYTE PTR _s3$[esp+72], 0
jb SHORT $LN151@main
push DWORD PTR _s2$[esp+72]
call ??3@YAXPAX@Z ; operator delete
add esp, 4
$LN151@main:
cmp DWORD PTR _s1$[esp+92], 16 ; 00000010H
mov DWORD PTR _s2$[esp+92], 15 ; 0000000fH
mov DWORD PTR _s2$[esp+88], 0
mov BYTE PTR _s2$[esp+72], 0
jb SHORT $LN195@main
push DWORD PTR _s1$[esp+72]
call ??3@YAXPAX@Z ; operator delete
add esp, 4
$LN195@main:
xor eax, eax
add esp, 72 ; 00000048H
ret 0
_main ENDP
```
編譯器并不是靜態構造string對象,存儲數據的緩沖區是否一定要在堆中呢?通常以0結尾的ASCII字符串存儲在數據節中,然后運行時,通過賦值方法完成s1和s2兩個string對象的構造。通過+操作符,s3完成string對象的構造。 可以注意到上述代碼中并沒有c_str()方法的調用,這是因為由于函數太小,編譯器將其內聯了,如果一個字符串小于16個字符,eax寄存器中存放指向緩沖區的指針,否則,存放指向堆中字符串緩沖區的指針。 然后,我們看到了三個析構函數的調用,當字符串長度超過16字符時,析構函數將被調用,在堆中的緩沖區會被釋放。此外,由于三個std::string對象都存儲在棧中,當函數結束時,他們將被自動釋放。 可以得到一個結論,短的字符串對象處理起來更快,因為堆訪問操作較少。 GCC生成的代碼甚至更簡單(正如我之前提到的,GCC并不將短的字符串存儲在結構體中)
```
.LC0:
.string "Hello, "
.LC1:
.string "world!\n"
main:
push ebp
mov ebp, esp
push edi
push esi
push ebx
and esp, -16
sub esp, 32
lea ebx, [esp+28]
lea edi, [esp+20]
mov DWORD PTR [esp+8], ebx
lea esi, [esp+24]
mov DWORD PTR [esp+4], OFFSET FLAT:.LC0
mov DWORD PTR [esp], edi
call _ZNSsC1EPKcRKSaIcE
mov DWORD PTR [esp+8], ebx
mov DWORD PTR [esp+4], OFFSET FLAT:.LC1
mov DWORD PTR [esp], esi
call _ZNSsC1EPKcRKSaIcE
mov DWORD PTR [esp+4], edi
mov DWORD PTR [esp], ebx
call _ZNSsC1ERKSs
mov DWORD PTR [esp+4], esi
mov DWORD PTR [esp], ebx
call _ZNSs6appendERKSs
; inlined c_str():
mov eax, DWORD PTR [esp+28]
mov DWORD PTR [esp], eax
call puts
mov eax, DWORD PTR [esp+28]
lea ebx, [esp+19]
mov DWORD PTR [esp+4], ebx
sub eax, 12
mov DWORD PTR [esp], eax
call _ZNSs4_Rep10_M_disposeERKSaIcE
mov eax, DWORD PTR [esp+24]
mov DWORD PTR [esp+4], ebx
sub eax, 12
mov DWORD PTR [esp], eax
call _ZNSs4_Rep10_M_disposeERKSaIcE
mov eax, DWORD PTR [esp+20]
mov DWORD PTR [esp+4], ebx
sub eax, 12
mov DWORD PTR [esp], eax
call _ZNSs4_Rep10_M_disposeERKSaIcE
lea esp, [ebp-12]
xor eax, eax
pop ebx
pop esi
pop edi
pop ebp
ret
```
可以看到傳遞給析構函數的并不是一個對象的指針,而是在對象所在位置的前12個字節的位置,也就是結構體的真正起始位置。
## std::string 作為全局變量使用
有經驗的c++程序員會說,可以定義一個STL類型的全局變量。 是的,確實如此
```
#include <stdio.h>
#include <string>
std::string s="a string";
int main()
{
printf ("%s\n", s.c_str());
};
MSVC:
$SG39512 DB ’a string’, 00H
$SG39519 DB ’%s’, 0aH, 00H
_main PROC
cmp DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
+20, 16 ; 00000010H
mov eax, OFFSET ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
; s
cmovae eax, DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?
$allocator@D@2@@std@@A
push eax
push OFFSET $SG39519
call _printf
add esp, 8
xor eax, eax
ret 0
_main ENDP
??__Es@@YAXXZ PROC ; ‘dynamic initializer for ’s’’, COMDAT
push 8
push OFFSET $SG39512
mov ecx, OFFSET ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
; s
call ?assign@?$basic_string@DU?$char_traits@D@std@@V?
$allocator@D@2@@std@@QAEAAV12@PBDI@Z ; std::basic_string<char,std::char_traits<char>,std::
allocator<char> >::assign
push OFFSET ??__Fs@@YAXXZ ; ‘dynamic atexit destructor for ’s’’
call _atexit
pop ecx
ret 0
??__Es@@YAXXZ ENDP ; ‘dynamic initializer for ’s’’
??__Fs@@YAXXZ PROC ; ‘dynamic atexit destructor for ’s’’,
COMDAT
push ecx
cmp DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
+20, 16 ; 00000010H
jb SHORT $LN23@dynamic
push esi
mov esi, DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?
$allocator@D@2@@std@@A
lea ecx, DWORD PTR $T2[esp+8]
call ??0?$_Wrap_alloc@V?$allocator@D@std@@@std@@QAE@XZ ; std::_Wrap_alloc<std::
allocator<char> >::_Wrap_alloc<std::allocator<char> >
push OFFSET ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A ; s
lea ecx, DWORD PTR $T2[esp+12]
call ??$destroy@PAD@?$_Wrap_alloc@V?$allocator@D@std@@@std@@QAEXPAPAD@Z ; std::
_Wrap_alloc<std::allocator<char> >::destroy<char *>
lea ecx, DWORD PTR $T1[esp+8]
call ??0?$_Wrap_alloc@V?$allocator@D@std@@@std@@QAE@XZ ; std::_Wrap_alloc<std::
allocator<char> >::_Wrap_alloc<std::allocator<char> >
push esi
call ??3@YAXPAX@Z ; operator delete
add esp, 4
pop esi
$LN23@dynamic:
mov DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
+20, 15 ; 0000000fH
mov DWORD PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A
+16, 0
mov BYTE PTR ?s@@3V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@A, 0
pop ecx
ret 0
??__Fs@@YAXXZ ENDP ; ‘dynamic atexit destructor for ’s’’
```
實際上,main函數之前,CRT中將調用一個特殊的函數調用所有全局變量的構造函數。除此之外,CRT中還將通過atexit()注冊另一個函數,該函數中將調用全局變量的析構函數。 GCC生成的代碼如下:
```
main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 16
mov eax, DWORD PTR s
mov DWORD PTR [esp], eax
call puts
xor eax, eax
leave
ret
.LC0:
.string "a string"
_GLOBAL__sub_I_s:
sub esp, 44
lea eax, [esp+31]
mov DWORD PTR [esp+8], eax
mov DWORD PTR [esp+4], OFFSET FLAT:.LC0
mov DWORD PTR [esp], OFFSET FLAT:s
call _ZNSsC1EPKcRKSaIcE
mov DWORD PTR [esp+8], OFFSET FLAT:__dso_handle
mov DWORD PTR [esp+4], OFFSET FLAT:s
mov DWORD PTR [esp], OFFSET FLAT:_ZNSsD1Ev
call __cxa_atexit
add esp, 44
ret
.LFE645:
.size _GLOBAL__sub_I_s, .-_GLOBAL__sub_I_s
.section .init_array,"aw"
.align 4
.long _GLOBAL__sub_I_s
.globl s
.bss
.align 4
.type s, @object
.size s, 4
s:
.zero 4
.hidden __dso_handle
```
GCC并沒有創建單獨的函數來玩全局變量的析構,而是依次將全局變量的析構函數傳遞給atexit()函數
**std::list**
std::list是標準的雙向鏈表,每一個元素包含兩個指針,分別前一個和后一個元素。 這意味著每個元素的的內存空間大小要相應的擴大2個字。(在32位環境下為8字節,64位環境下為16字節) std::list也是一個循環鏈表,也就是說最后一個元素包含一個指向第一個元素的指針,反之亦然。 C++ STL只是為你希望做為鏈表使用的結構體增加前向和后向指針。 下面我們構造一個包含兩個簡單變量的結構體,將其存儲在鏈表中。 雖然C++標準中并沒有規定如何實現list,但是MSVC和GCC的實現比較相似,因此下面僅通過一段源代碼來展示。
```
#include <stdio.h>
#include <list>
#include <iostream>
struct a
{
int x;
int y;
};
struct List_node
{
struct List_node* _Next;
struct List_node* _Prev;
int x;
int y;
};
void dump_List_node (struct List_node *n)
{
printf ("ptr=0x%p _Next=0x%p _Prev=0x%p x=%d y=%d\n",
n, n->_Next, n->_Prev, n->x, n->y);
};
void dump_List_vals (struct List_node* n)
{
struct List_node* current=n;
for (;;)
{
dump_List_node (current);
current=current->_Next;
if (current==n) // end
break;
};
};
void dump_List_val (unsigned int *a)
{
#ifdef _MSC_VER
// GCC implementation doesn’t have "size" field
printf ("_Myhead=0x%p, _Mysize=%d\n", a[0], a[1]);
#endif
dump_List_vals ((struct List_node*)a[0]);
};
int main()
{
std::list<struct a> l;
printf ("* empty list:\n");
dump_List_val((unsigned int*)(void*)&l);
struct a t1;
t1.x=1;
t1.y=2;
l.push_front (t1);
t1.x=3;
t1.y=4;
l.push_front (t1);
t1.x=5;
t1.y=6;
l.push_back (t1);
printf ("* 3-elements list:\n");
dump_List_val((unsigned int*)(void*)&l);
std::list<struct a>::iterator tmp;
printf ("node at .begin:\n");
tmp=l.begin();
dump_List_node ((struct List_node *)*(void**)&tmp);
printf ("node at .end:\n");
tmp=l.end();
dump_List_node ((struct List_node *)*(void**)&tmp);
printf ("* let’s count from the begin:\n");
std::list<struct a>::iterator it=l.begin();
printf ("1st element: %d %d\n", (*it).x, (*it).y);
it++;
printf ("2nd element: %d %d\n", (*it).x, (*it).y);
it++;
printf ("3rd element: %d %d\n", (*it).x, (*it).y);
it++;
printf ("element at .end(): %d %d\n", (*it).x, (*it).y);
printf ("* let’s count from the end:\n");
std::list<struct a>::iterator it2=l.end();
printf ("element at .end(): %d %d\n", (*it2).x, (*it2).y);
it2--;
printf ("3rd element: %d %d\n", (*it2).x, (*it2).y);
it2--;
printf ("2nd element: %d %d\n", (*it2).x, (*it2).y);
it2--;
printf ("1st element: %d %d\n", (*it2).x, (*it2).y);
printf ("removing last element...\n");
l.pop_back();
dump_List_val((unsigned int*)(void*)&l);
};
```
### 34.2.1 GCC
我們以GCC為例開始,當我們運行這個例子,可以看到屏幕上打印出了大量輸出,我們依次對其進行分析。
```
* empty list:
ptr=0x0028fe90 _Next=0x0028fe90 _Prev=0x0028fe90 x=3 y=0
```
我們可以看到一個空的鏈表,它由一個元素組成,其中x和y中包含有垃圾數據。前向和后向指針均指向自身。
這一時刻的.begin和.end迭代器兩者相等。 當我們壓入3個元素之后,鏈表的內部狀態將變為:
```
* 3-elements list:
ptr=0x000349a0 _Next=0x00034988 _Prev=0x0028fe90 x=3 y=4
ptr=0x00034988 _Next=0x00034b40 _Prev=0x000349a0 x=1 y=2
ptr=0x00034b40 _Next=0x0028fe90 _Prev=0x00034988 x=5 y=6
ptr=0x0028fe90 _Next=0x000349a0 _Prev=0x00034b40 x=5 y=6
```
最后一個元素依然位于0x28fe90,只有當鏈表被銷毀的時候它才會移動。x和y中依然包含有垃圾數據。雖然和鏈表中的最后一個元素中的x和y擁有相同的值,但這并不能說明這些值有意義。 下面這幅圖說明了鏈表中的三個元素如何在內存中存儲
變量l始終指向第一個鏈表節點 .begin()和.end()迭代器不指向任何東西,也并不出現在內存中,但是當適當的方法被調用的時候將返回指向這些節點的指針。(這句話是說.begin()和.end()并不是變量,而是函數,該函數返回迭代器。) 鏈表中包含一個垃圾元素在鏈表的實現中是非常流行的方式,如果不包含它,很多操作都將變得復雜和低效。 迭代器實質上是一個指向鏈表節點的指針。list.begin()和list.end()僅僅返回迭代器。
```
node at .begin:
ptr=0x000349a0 _Next=0x00034988 _Prev=0x0028fe90 x=3 y=4
node at .end:
ptr=0x0028fe90 _Next=0x000349a0 _Prev=0x00034b40 x=5 y=6
```
實際上鏈表是循環的非常有用:包含一個指向第一個鏈表元素的指針,類似l變量,這將有利于快速獲取指向最后一個元素的指針,而不必遍歷整個鏈表。這樣也有利于快速的在鏈表尾部插入元素。 --和++操作符只是將當前的迭代器的值設置為current_node->prev、current_node->next。反向迭代器只是以相反的方向做相似的工作。
迭代器的*操作符返回指向節點結構指針,也就是用戶結構的起始位置,如,指向第一個結構體元素x的指針。 List的插入和刪除比較簡單,只需要分配新節點同時給指針設置有效值。 這就是元素被刪除后,迭代器會失效的原因,它依然指向已經被釋放的節點。當然,迭代器指向的被釋放的節點的數據是不能再使用的。
GCC的實現中沒有存儲鏈表的大小,由于沒有其他方式獲取這些信息,.size()方法需要遍歷整個鏈表統計元素個數,因此執行速度較慢。這一操作的時間復雜度為O(n),即消耗時間多少和鏈表中元素個數成正比。
Listing34.7: GCC 4.8.1 -O3 -fno-inline-small-functions
```
main proc near
push ebp
mov ebp, esp
push esi
push ebx
and esp, 0FFFFFFF0h
sub esp, 20h
lea ebx, [esp+10h]
mov dword ptr [esp], offset s ; "* empty list:"
mov [esp+10h], ebx
mov [esp+14h], ebx
call puts
mov [esp], ebx
call _Z13dump_List_valPj ; dump_List_val(uint *)
lea esi, [esp+18h]
mov [esp+4], esi
mov [esp], ebx
mov dword ptr [esp+18h], 1 ; X for new element
mov dword ptr [esp+1Ch], 2 ; Y for new element
call _ZNSt4listI1aSaIS0_EE10push_frontERKS0_ ; std::list<a,std::allocator<a
>>::push_front(a const&)
mov [esp+4], esi
mov [esp], ebx
mov dword ptr [esp+18h], 3 ; X for new element
mov dword ptr [esp+1Ch], 4 ; Y for new element
call _ZNSt4listI1aSaIS0_EE10push_frontERKS0_ ; std::list<a,std::allocator<a
>>::push_front(a const&)
mov dword ptr [esp], 10h
mov dword ptr [esp+18h], 5 ; X for new element
mov dword ptr [esp+1Ch], 6 ; Y for new element
call _Znwj ; operator new(uint)
cmp eax, 0FFFFFFF8h
jz short loc_80002A6
mov ecx, [esp+1Ch]
mov edx, [esp+18h]
mov [eax+0Ch], ecx
mov [eax+8], edx
loc_80002A6: ; CODE XREF: main+86
mov [esp+4], ebx
mov [esp], eax
call _ZNSt8__detail15_List_node_base7_M_hookEPS0_ ; std::__detail::
_List_node_base::_M_hook(std::__detail::_List_node_base*)
mov dword ptr [esp], offset a3ElementsList ; "* 3-elements list:"
call puts
mov [esp], ebx
call _Z13dump_List_valPj ; dump_List_val(uint *)
mov dword ptr [esp], offset aNodeAt_begin ; "node at .begin:"
call puts
mov eax, [esp+10h]
mov [esp], eax
call _Z14dump_List_nodeP9List_node ; dump_List_node(List_node *)
mov dword ptr [esp], offset aNodeAt_end ; "node at .end:"
call puts
mov [esp], ebx
call _Z14dump_List_nodeP9List_node ; dump_List_node(List_node *)
mov dword ptr [esp], offset aLetSCountFromT ; "* let’s count from the begin:"
call puts
mov esi, [esp+10h]
mov eax, [esi+0Ch]
mov [esp+0Ch], eax
mov eax, [esi+8]
mov dword ptr [esp+4], offset a1stElementDD ; "1st element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov esi, [esi] ; operator++: get ->next pointer
mov eax, [esi+0Ch]
mov [esp+0Ch], eax
mov eax, [esi+8]
mov dword ptr [esp+4], offset a2ndElementDD ; "2nd element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov esi, [esi] ; operator++: get ->next pointer
mov eax, [esi+0Ch]
mov [esp+0Ch], eax
mov eax, [esi+8]
mov dword ptr [esp+4], offset a3rdElementDD ; "3rd element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov eax, [esi] ; operator++: get ->next pointer
mov edx, [eax+0Ch]
mov [esp+0Ch], edx
mov eax, [eax+8]
mov dword ptr [esp+4], offset aElementAt_endD ; "element at .end(): %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov dword ptr [esp], offset aLetSCountFro_0 ; "* let’s count from the end:"
call puts
mov eax, [esp+1Ch]
mov dword ptr [esp+4], offset aElementAt_endD ; "element at .end(): %d %d\n"
mov dword ptr [esp], 1
mov [esp+0Ch], eax
mov eax, [esp+18h]
mov [esp+8], eax
call __printf_chk
mov esi, [esp+14h]
mov eax, [esi+0Ch]
mov [esp+0Ch], eax
mov eax, [esi+8]
mov dword ptr [esp+4], offset a3rdElementDD ; "3rd element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov esi, [esi+4] ; operator--: get ->prev pointer
mov eax, [esi+0Ch]
mov [esp+0Ch], eax
mov eax, [esi+8]
mov dword ptr [esp+4], offset a2ndElementDD ; "2nd element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov eax, [esi+4] ; operator--: get ->prev pointer
mov edx, [eax+0Ch]
mov [esp+0Ch], edx
mov eax, [eax+8]
mov dword ptr [esp+4], offset a1stElementDD ; "1st element: %d %d\n"
mov dword ptr [esp], 1
mov [esp+8], eax
call __printf_chk
mov dword ptr [esp], offset aRemovingLastEl ; "removing last element..."
call puts
mov esi, [esp+14h]
mov [esp], esi
call _ZNSt8__detail15_List_node_base9_M_unhookEv ; std::__detail::
_List_node_base::_M_unhook(void)
mov [esp], esi ; void *
call _ZdlPv ; operator delete(void *)
mov [esp], ebx
call _Z13dump_List_valPj ; dump_List_val(uint *)
mov [esp], ebx
call _ZNSt10_List_baseI1aSaIS0_EE8_M_clearEv ; std::_List_base<a,std::
allocator<a>>::_M_clear(void)
lea esp, [ebp-8]
xor eax, eax
pop ebx
pop esi
pop ebp
retn
main endp
```
Listing 34.8: 所有輸出
```
* empty list:
ptr=0x0028fe90 _Next=0x0028fe90 _Prev=0x0028fe90 x=3 y=0
* 3-elements list:
ptr=0x000349a0 _Next=0x00034988 _Prev=0x0028fe90 x=3 y=4
ptr=0x00034988 _Next=0x00034b40 _Prev=0x000349a0 x=1 y=2
ptr=0x00034b40 _Next=0x0028fe90 _Prev=0x00034988 x=5 y=6
ptr=0x0028fe90 _Next=0x000349a0 _Prev=0x00034b40 x=5 y=6
node at .begin:
ptr=0x000349a0 _Next=0x00034988 _Prev=0x0028fe90 x=3 y=4
node at .end:
ptr=0x0028fe90 _Next=0x000349a0 _Prev=0x00034b40 x=5 y=6
* let’s count from the begin:
1st element: 3 4
2nd element: 1 2
3rd element: 5 6
element at .end(): 5 6
* let’s count from the end:
element at .end(): 5 6
3rd element: 5 6
2nd element: 1 2
1st element: 3 4
removing last element...
ptr=0x000349a0 _Next=0x00034988 _Prev=0x0028fe90 x=3 y=4
ptr=0x00034988 _Next=0x0028fe90 _Prev=0x000349a0 x=1 y=2
ptr=0x0028fe90 _Next=0x000349a0 _Prev=0x00034988 x=5 y=6
```
## 34.2.2 MSVC
MSVC實現形式基本相同,但是它存儲了當前list的大小。這意味著.size()方法非常迅速,只需要從內存中讀取一個值即可。從另一個方面來說,必須在插入和刪除操作后更新size變量的值。 MSVC組織節點的方式也稍有不同。
GCC將垃圾元素放在鏈表的尾部,而MSVC放在鏈表起始位置。
Listing 34.9: MSVC 2012 /Fa2.asm /Ox /GS- /Ob1
```
_l$ = -16 ; size = 8
_t1$ = -8 ; size = 8
_main PROC
sub esp, 16 ; 00000010H
push ebx
push esi
push edi
push 0
push 0
lea ecx, DWORD PTR _l$[esp+36]
mov DWORD PTR _l$[esp+40], 0
; allocate first "garbage" element
call ?_Buynode0@?$_List_alloc@$0A@U?$_List_base_types@Ua@@V?
$allocator@Ua@@@std@@@std@@@std@@QAEPAU?$_List_node@Ua@@PAX@2@PAU32@0@Z ; std::_List_alloc<0,
std::_List_base_types<a,std::allocator<a> > >::_Buynode0
mov edi, DWORD PTR __imp__printf
mov ebx, eax
push OFFSET $SG40685 ; ’* empty list:’
mov DWORD PTR _l$[esp+32], ebx
call edi ; printf
lea eax, DWORD PTR _l$[esp+32]
push eax
call ?dump_List_val@@YAXPAI@Z ; dump_List_val
mov esi, DWORD PTR [ebx]
add esp, 8
lea eax, DWORD PTR _t1$[esp+28]
push eax
push DWORD PTR [esi+4]
lea ecx, DWORD PTR _l$[esp+36]
push esi
mov DWORD PTR _t1$[esp+40], 1 ; data for a new node
mov DWORD PTR _t1$[esp+44], 2 ; data for a new node
; allocate new node
call ??$_Buynode@ABUa@@@?$_List_buy@Ua@@V?$allocator@Ua@@@std@@@std@@QAEPAU?
$_List_node@Ua@@PAX@1@PAU21@0ABUa@@@Z ; std::_List_buy<a,std::allocator<a> >::_Buynode<a
const &>
mov DWORD PTR [esi+4], eax
mov ecx, DWORD PTR [eax+4]
mov DWORD PTR _t1$[esp+28], 3 ; data for a new node
mov DWORD PTR [ecx], eax
mov esi, DWORD PTR [ebx]
lea eax, DWORD PTR _t1$[esp+28]
push eax
push DWORD PTR [esi+4]
lea ecx, DWORD PTR _l$[esp+36]
push esi
mov DWORD PTR _t1$[esp+44], 4 ; data for a new node
; allocate new node
call ??$_Buynode@ABUa@@@?$_List_buy@Ua@@V?$allocator@Ua@@@std@@@std@@QAEPAU?
$_List_node@Ua@@PAX@1@PAU21@0ABUa@@@Z ; std::_List_buy<a,std::allocator<a> >::_Buynode<a
const &>
mov DWORD PTR [esi+4], eax
mov ecx, DWORD PTR [eax+4]
mov DWORD PTR _t1$[esp+28], 5 ; data for a new node
mov DWORD PTR [ecx], eax
lea eax, DWORD PTR _t1$[esp+28]
push eax
push DWORD PTR [ebx+4]
lea ecx, DWORD PTR _l$[esp+36]
push ebx
mov DWORD PTR _t1$[esp+44], 6 ; data for a new node
; allocate new node
call ??$_Buynode@ABUa@@@?$_List_buy@Ua@@V?$allocator@Ua@@@std@@@std@@QAEPAU?
$_List_node@Ua@@PAX@1@PAU21@0ABUa@@@Z ; std::_List_buy<a,std::allocator<a> >::_Buynode<a
const &>
mov DWORD PTR [ebx+4], eax
mov ecx, DWORD PTR [eax+4]
push OFFSET $SG40689 ; ’* 3-elements list:’
mov DWORD PTR _l$[esp+36], 3
mov DWORD PTR [ecx], eax
call edi ; printf
lea eax, DWORD PTR _l$[esp+32]
push eax
call ?dump_List_val@@YAXPAI@Z ; dump_List_val
push OFFSET $SG40831 ; ’node at .begin:’
call edi ; printf
push DWORD PTR [ebx] ; get next field of node $l$ variable points to
call ?dump_List_node@@YAXPAUList_node@@@Z ; dump_List_node
push OFFSET $SG40835 ; ’node at .end:’
call edi ; printf
push ebx ; pointer to the node $l$ variable points to!
call ?dump_List_node@@YAXPAUList_node@@@Z ; dump_List_node
push OFFSET $SG40839 ; ’* let’’s count from the begin:’
call edi ; printf
mov esi, DWORD PTR [ebx] ; operator++: get ->next pointer
push DWORD PTR [esi+12]
push DWORD PTR [esi+8]
push OFFSET $SG40846 ; ’1st element: %d %d’
call edi ; printf
mov esi, DWORD PTR [esi] ; operator++: get ->next pointer
push DWORD PTR [esi+12]
push DWORD PTR [esi+8]
push OFFSET $SG40848 ; ’2nd element: %d %d’
call edi ; printf
mov esi, DWORD PTR [esi] ; operator++: get ->next pointer
push DWORD PTR [esi+12]
push DWORD PTR [esi+8]
push OFFSET $SG40850 ; ’3rd element: %d %d’
call edi ; printf
mov eax, DWORD PTR [esi] ; operator++: get ->next pointer
add esp, 64 ; 00000040H
push DWORD PTR [eax+12]
push DWORD PTR [eax+8]
push OFFSET $SG40852 ; ’element at .end(): %d %d’
call edi ; printf
push OFFSET $SG40853 ; ’* let’’s count from the end:’
call edi ; printf
push DWORD PTR [ebx+12] ; use x and y fields from the node $l$ variable points to
push DWORD PTR [ebx+8]
push OFFSET $SG40860 ; ’element at .end(): %d %d’
call edi ; printf
mov esi, DWORD PTR [ebx+4] ; operator--: get ->prev pointer
push DWORD PTR [esi+12]
push DWORD PTR [esi+8]
push OFFSET $SG40862 ; ’3rd element: %d %d’
call edi ; printf
mov esi, DWORD PTR [esi+4] ; operator--: get ->prev pointer
push DWORD PTR [esi+12]
push DWORD PTR [esi+8]
push OFFSET $SG40864 ; ’2nd element: %d %d’
call edi ; printf
mov eax, DWORD PTR [esi+4] ; operator--: get ->prev pointer
push DWORD PTR [eax+12]
push DWORD PTR [eax+8]
push OFFSET $SG40866 ; ’1st element: %d %d’
call edi ; printf
add esp, 64 ; 00000040H
push OFFSET $SG40867 ; ’removing last element...’
call edi ; printf
mov edx, DWORD PTR [ebx+4]
add esp, 4
; prev=next?
; it is the only element, "garbage one"?
; if yes, do not delete it!
cmp edx, ebx
je SHORT $LN349@main
mov ecx, DWORD PTR [edx+4]
mov eax, DWORD PTR [edx]
mov DWORD PTR [ecx], eax
mov ecx, DWORD PTR [edx]
mov eax, DWORD PTR [edx+4]
push edx
mov DWORD PTR [ecx+4], eax
call ??3@YAXPAX@Z ; operator delete
add esp, 4
mov DWORD PTR _l$[esp+32], 2
$LN349@main:
lea eax, DWORD PTR _l$[esp+28]
push eax
call ?dump_List_val@@YAXPAI@Z ; dump_List_val
mov eax, DWORD PTR [ebx]
add esp, 4
mov DWORD PTR [ebx], ebx
mov DWORD PTR [ebx+4], ebx
cmp eax, ebx
je SHORT $LN412@main
$LL414@main:
mov esi, DWORD PTR [eax]
push eax
call ??3@YAXPAX@Z ; operator delete
add esp, 4
mov eax, esi
cmp esi, ebx
jne SHORT $LL414@main
$LN412@main:
push ebx
call ??3@YAXPAX@Z ; operator delete
add esp, 4
xor eax, eax
pop edi
pop esi
pop ebx
add esp, 16 ; 00000010H
ret 0
_main ENDP
```
與GCC不同,MSVC代碼在函數開始,通過Buynode函數分配垃圾元素,這個方法也用于后續節點的申請(GCC將最早的節點分配在棧上)。
Listing 34.10 完整輸出
```
* empty list:
_Myhead=0x003CC258, _Mysize=0
ptr=0x003CC258 _Next=0x003CC258 _Prev=0x003CC258 x=6226002 y=4522072
* 3-elements list:
_Myhead=0x003CC258, _Mysize=3
ptr=0x003CC258 _Next=0x003CC288 _Prev=0x003CC2A0 x=6226002 y=4522072
ptr=0x003CC288 _Next=0x003CC270 _Prev=0x003CC258 x=3 y=4
ptr=0x003CC270 _Next=0x003CC2A0 _Prev=0x003CC288 x=1 y=2
ptr=0x003CC2A0 _Next=0x003CC258 _Prev=0x003CC270 x=5 y=6
node at .begin:
ptr=0x003CC288 _Next=0x003CC270 _Prev=0x003CC258 x=3 y=4
node at .end:
ptr=0x003CC258 _Next=0x003CC288 _Prev=0x003CC2A0 x=6226002 y=4522072
* let’s count from the begin:
1st element: 3 4
2nd element: 1 2
3rd element: 5 6
element at .end(): 6226002 4522072
* let’s count from the end:
element at .end(): 6226002 4522072
3rd element: 5 6
2nd element: 1 2
1st element: 3 4
removing last element...
_Myhead=0x003CC258, _Mysize=2
ptr=0x003CC258 _Next=0x003CC288 _Prev=0x003CC270 x=6226002 y=4522072
ptr=0x003CC288 _Next=0x003CC270 _Prev=0x003CC258 x=3 y=4
ptr=0x003CC270 _Next=0x003CC258 _Prev=0x003CC288 x=1 y=2
```
## 34.2.3 C++ 11 std::forward_list
和std::list相似,但是包含一個指向下一個節點的指針。它所占用的內存空間更少,但是沒有提供雙向遍歷鏈表的能力。
## 34.3 std::vector
我將std::vector稱作c數組的安全封裝。在內部實現上,它和std::string相似,包含指向緩沖區的指針,指向數組尾部的指針,以及指向緩沖區尾部的指針。 std::vector中的元素在內存中連續存放,和通常中的數組一樣。在C++11包含一個新方法.data(),它返回指向緩沖區的指針,這和std::string中的.c_str()相同。 在堆中分配的緩沖區大小將超過數組自身大小。 MSVC和GCC的實現相似,只是結構體中元素名稍有不同,因此下面的代碼在兩種編譯器上都能工作。下面是用于轉儲std::vector結構的類C代碼。
```
#include <stdio.h>
#include <vector>
#include <algorithm>
#include <functional>
struct vector_of_ints
{
// MSVC names:
int *Myfirst;
int *Mylast;
int *Myend;
// GCC structure is the same, names are: _M_start, _M_finish, _M_end_of_storage
};
void dump(struct vector_of_ints *in)
{
printf ("_Myfirst=%p, _Mylast=%p, _Myend=%p\n", in->Myfirst, in->Mylast, in->Myend);
size_t size=(in->Mylast-in->Myfirst);
size_t capacity=(in->Myend-in->Myfirst);
printf ("size=%d, capacity=%d\n", size, capacity);
for (size_t i=0; i<size; i++)
printf ("element %d: %d\n", i, in->Myfirst[i]);
};
int main()
{
std::vector<int> c;
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(1);
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(2);
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(3);
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(4);
dump ((struct vector_of_ints*)(void*)&c);
c.reserve (6);
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(5);
dump ((struct vector_of_ints*)(void*)&c);
c.push_back(6);
dump ((struct vector_of_ints*)(void*)&c);
printf ("%d\n", c.at(5)); // bounds checking
printf ("%d\n", c[8]); // operator[], no bounds checking
};
如果編譯器是MSVC,下面是輸出樣例。
_Myfirst=00000000, _Mylast=00000000, _Myend=00000000
size=0, capacity=0
_Myfirst=0051CF48, _Mylast=0051CF4C, _Myend=0051CF4C
size=1, capacity=1
element 0: 1
_Myfirst=0051CF58, _Mylast=0051CF60, _Myend=0051CF60
size=2, capacity=2
element 0: 1
element 1: 2
_Myfirst=0051C278, _Mylast=0051C284, _Myend=0051C284
size=3, capacity=3
element 0: 1
element 1: 2
element 2: 3
_Myfirst=0051C290, _Mylast=0051C2A0, _Myend=0051C2A0
size=4, capacity=4
element 0: 1
element 1: 2
element 2: 3
element 3: 4
_Myfirst=0051B180, _Mylast=0051B190, _Myend=0051B198
size=4, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
_Myfirst=0051B180, _Mylast=0051B194, _Myend=0051B198
size=5, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
element 4: 5
_Myfirst=0051B180, _Mylast=0051B198, _Myend=0051B198
size=6, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
element 4: 5
element 5: 6
6
6619158
```
可以看到,在main函數的頭部也沒有分配空間。當第一次push_back()調用結束后,緩沖區被分配。同時在每一次調用push_back()后,數組的大小和緩沖區容量都增大了。緩沖區地址也變化了,因為push_back()函數每次都會在堆中重新分配緩沖區。這是個耗時的操作,這就是為什么提前預測數組大小,同時調用.reserve()方法預留空間比較重要了。最后數據是垃圾數據,沒有數組元素位于這個位置,因此打印出了隨機的數據。這也說明了std::vector的[]操作并不校驗數組下標是否越界。然而.at()方法會做相應檢查,當出現越界時會拋出std::out_of_range。 讓我們來看代碼:
Listing 34.11: MSVC 2012 /GS- /Ob1
```
$SG52650 DB ’%d’, 0aH, 00H
$SG52651 DB ’%d’, 0aH, 00H
_this$ = -4 ; size = 4
__Pos$ = 8 ; size = 4
?at@?$vector@HV?$allocator@H@std@@@std@@QAEAAHI@Z PROC ; std::vector<int,std::allocator<int> >::
at, COMDAT
; _this$ = ecx
push ebp
mov ebp, esp
push ecx
mov DWORD PTR _this$[ebp], ecx
mov eax, DWORD PTR _this$[ebp]
mov ecx, DWORD PTR _this$[ebp]
mov edx, DWORD PTR [eax+4]
sub edx, DWORD PTR [ecx]
sar edx, 2
cmp edx, DWORD PTR __Pos$[ebp]
ja SHORT $LN1@at
push OFFSET ??_C@_0BM@NMJKDPPO@invalid?5vector?$DMT?$DO?5subscript?$AA@
call DWORD PTR __imp_?_Xout_of_range@std@@YAXPBD@Z
$LN1@at:
mov eax, DWORD PTR _this$[ebp]
mov ecx, DWORD PTR [eax]
mov edx, DWORD PTR __Pos$[ebp]
lea eax, DWORD PTR [ecx+edx*4]
$LN3@at:
mov esp, ebp
pop ebp
ret 4
?at@?$vector@HV?$allocator@H@std@@@std@@QAEAAHI@Z ENDP ; std::vector<int,std::allocator<int> >::
at
_c$ = -36 ; size = 12
$T1 = -24 ; size = 4
$T2 = -20 ; size = 4
$T3 = -16 ; size = 4
$T4 = -12 ; size = 4
$T5 = -8 ; size = 4
$T6 = -4 ; size = 4
_main PROC
push ebp
mov ebp, esp
sub esp, 36 ; 00000024H
mov DWORD PTR _c$[ebp], 0 ; Myfirst
mov DWORD PTR _c$[ebp+4], 0 ; Mylast
mov DWORD PTR _c$[ebp+8], 0 ; Myend
lea eax, DWORD PTR _c$[ebp]
push eax
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T6[ebp], 1
lea ecx, DWORD PTR $T6[ebp]
push ecx
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea edx, DWORD PTR _c$[ebp]
push edx
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T5[ebp], 2
lea eax, DWORD PTR $T5[ebp]
push eax
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea ecx, DWORD PTR _c$[ebp]
push ecx
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T4[ebp], 3
lea edx, DWORD PTR $T4[ebp]
push edx
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea eax, DWORD PTR _c$[ebp]
push eax
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T3[ebp], 4
lea ecx, DWORD PTR $T3[ebp]
push ecx
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea edx, DWORD PTR _c$[ebp]
push edx
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
push 6
lea ecx, DWORD PTR _c$[ebp]
call ?reserve@?$vector@HV?$allocator@H@std@@@std@@QAEXI@Z ; std::vector<int,std::
allocator<int> >::reserve
lea eax, DWORD PTR _c$[ebp]
push eax
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T2[ebp], 5
lea ecx, DWORD PTR $T2[ebp]
push ecx
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea edx, DWORD PTR _c$[ebp]
push edx
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
mov DWORD PTR $T1[ebp], 6
lea eax, DWORD PTR $T1[ebp]
push eax
lea ecx, DWORD PTR _c$[ebp]
call ?push_back@?$vector@HV?$allocator@H@std@@@std@@QAEX$$QAH@Z ; std::vector<int,std
::allocator<int> >::push_back
lea ecx, DWORD PTR _c$[ebp]
push ecx
call ?dump@@YAXPAUvector_of_ints@@@Z ; dump
add esp, 4
push 5
lea ecx, DWORD PTR _c$[ebp]
call ?at@?$vector@HV?$allocator@H@std@@@std@@QAEAAHI@Z ; std::vector<int,std::
allocator<int> >::at
mov edx, DWORD PTR [eax]
push edx
push OFFSET $SG52650 ; ’%d’
call DWORD PTR __imp__printf
add esp, 8
mov eax, 8
shl eax, 2
mov ecx, DWORD PTR _c$[ebp]
mov edx, DWORD PTR [ecx+eax]
push edx
push OFFSET $SG52651 ; ’%d’
call DWORD PTR __imp__printf
add esp, 8
lea ecx, DWORD PTR _c$[ebp]
call ?_Tidy@?$vector@HV?$allocator@H@std@@@std@@IAEXXZ ; std::vector<int,std::
allocator<int> >::_Tidy
xor eax, eax
mov esp, ebp
pop ebp
ret 0
_main ENDP
```
我們看到.at()方法如何進行辯解檢查,當發生錯誤時將拋出異常。最后一個printf()調用只是從內存中讀取數據,并沒有做任何檢查。 有人可能會問,為什么沒有用類似std::string中size和capacity的變量,我猜想可能是為了讓邊界檢查更快,但是我不確定。 GCC生成的代碼基本上相同,但是.at()方法被內聯了。
Listing34.12: GCC 4.8.1 -fno-inline-small-functions –O1
```
main proc near
push ebp
mov ebp, esp
push edi
push esi
push ebx
and esp, 0FFFFFFF0h
sub esp, 20h
mov dword ptr [esp+14h], 0
mov dword ptr [esp+18h], 0
mov dword ptr [esp+1Ch], 0
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 1
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 2
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 3
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 4
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov ebx, [esp+14h]
mov eax, [esp+1Ch]
sub eax, ebx
cmp eax, 17h
ja short loc_80001CF
mov edi, [esp+18h]
sub edi, ebx
sar edi, 2
mov dword ptr [esp], 18h
call _Znwj ; operator new(uint)
mov esi, eax
test edi, edi
jz short loc_80001AD
lea eax, ds:0[edi*4]
mov [esp+8], eax ; n
mov [esp+4], ebx ; src
mov [esp], esi ; dest
call memmove
loc_80001AD: ; CODE XREF: main+F8
mov eax, [esp+14h]
test eax, eax
jz short loc_80001BD
mov [esp], eax ; void *
call _ZdlPv ; operator delete(void *)
loc_80001BD: ; CODE XREF: main+117
mov [esp+14h], esi
lea eax, [esi+edi*4]
mov [esp+18h], eax
add esi, 18h
mov [esp+1Ch], esi
loc_80001CF: ; CODE XREF: main+DD
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 5
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov dword ptr [esp+10h], 6
lea eax, [esp+10h]
mov [esp+4], eax
lea eax, [esp+14h]
mov [esp], eax
call _ZNSt6vectorIiSaIiEE9push_backERKi ; std::vector<int,std::allocator<int
>>::push_back(int const&)
lea eax, [esp+14h]
mov [esp], eax
call _Z4dumpP14vector_of_ints ; dump(vector_of_ints *)
mov eax, [esp+14h]
mov edx, [esp+18h]
sub edx, eax
cmp edx, 17h
ja short loc_8000246
mov dword ptr [esp], offset aVector_m_range ; "vector::_M_range_check"
call _ZSt20__throw_out_of_rangePKc ; std::__throw_out_of_range(char const*)
loc_8000246: ; CODE XREF: main+19C
mov eax, [eax+14h]
mov [esp+8], eax
mov dword ptr [esp+4], offset aD ; "%d\n"
mov dword ptr [esp], 1
call __printf_chk
mov eax, [esp+14h]
mov eax, [eax+20h]
mov [esp+8], eax
mov dword ptr [esp+4], offset aD ; "%d\n"
mov dword ptr [esp], 1
call __printf_chk
mov eax, [esp+14h]
test eax, eax
jz short loc_80002AC
mov [esp], eax ; void *
call _ZdlPv ; operator delete(void *)
jmp short loc_80002AC
; ---------------------------------------------------------------------------
mov ebx, eax
mov edx, [esp+14h]
test edx, edx
jz short loc_80002A4
mov [esp], edx ; void *
call _ZdlPv ; operator delete(void *)
loc_80002A4: ; CODE XREF: main+1FE
mov [esp], ebx
call _Unwind_Resume
; ---------------------------------------------------------------------------
loc_80002AC: ; CODE XREF: main+1EA
; main+1F4
mov eax, 0
lea esp, [ebp-0Ch]
pop ebx
pop esi
pop edi
pop ebp
locret_80002B8: ; DATA XREF: .eh_frame:08000510
; .eh_frame:080005BC
retn
main endp
```
.reserve()方法也被內聯了。如果緩沖區大小小于新的size,則調用new()申請新緩沖區,調用memmove()拷貝緩沖區內容,然后調用delete()釋放舊的緩沖區。 讓你給我們看看通過GCC編譯后程序的輸出
```
_Myfirst=0x(nil), _Mylast=0x(nil), _Myend=0x(nil)
size=0, capacity=0
_Myfirst=0x8257008, _Mylast=0x825700c, _Myend=0x825700c
size=1, capacity=1
element 0: 1
_Myfirst=0x8257018, _Mylast=0x8257020, _Myend=0x8257020
size=2, capacity=2
element 0: 1
element 1: 2
_Myfirst=0x8257028, _Mylast=0x8257034, _Myend=0x8257038
size=3, capacity=4
element 0: 1
element 1: 2
element 2: 3
_Myfirst=0x8257028, _Mylast=0x8257038, _Myend=0x8257038
size=4, capacity=4
element 0: 1
element 1: 2
element 2: 3
element 3: 4
_Myfirst=0x8257040, _Mylast=0x8257050, _Myend=0x8257058
size=4, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
_Myfirst=0x8257040, _Mylast=0x8257054, _Myend=0x8257058
size=5, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
element 4: 5
_Myfirst=0x8257040, _Mylast=0x8257058, _Myend=0x8257058
size=6, capacity=6
element 0: 1
element 1: 2
element 2: 3
element 3: 4
element 4: 5
element 5: 6
6
0
```
我們可以看到緩沖區大小的增長不同于MSVC中。 簡單的實驗說明MSVC中緩沖區每次擴大當前大小的50%,而GCC中則每次擴大100%。
## 34.4 std::map and std::set
二叉樹是另一個基本的數據結構。它是一個樹,但是每個節點最多包含兩個指向其他節點的指針。每個節點包含鍵和/或值。 二叉樹在鍵值字典的實現中是常用數據結構。 二叉樹至少包含三個重要的屬性: 所有的鍵的存儲是排序的。 任何類型的鍵都容易存儲。二叉樹并不知道鍵的類型,因此需要鍵比較算法。 查找鍵的速度相比于鏈表和數組要快。 下面是一個非常簡單的例子,我們在二叉樹中存儲下面這些數據0,1,2,3,5,6,9,10,11,12,20,99,100,101,107,1001,1010.
所有小于根節點鍵的鍵存儲在樹的左側,所有大于根節點鍵的鍵存儲在樹的右側。 因此,查找算法就很直接,當要查找的值小于當前節點的鍵的值,則查找左側;如果大于當前節點的鍵的值,則查找右側;當值相等時,則停止查找。這就是為什么通過鍵比較函數,算法可以查找數值、文本串等。 所有鍵的值都是唯一的。 如果這樣,為了在n個鍵的平衡二叉樹中查找一個鍵將需要log2 n步。1000個鍵需要10步,10000個鍵需要13步。但是這要求樹總是平衡的,即鍵應該均勻的分布在樹的不同層里。插入和刪除操作需要保持樹的平衡狀態。 有一些流行的平衡算法可用,包括AVL樹和紅黑樹。紅黑樹通過給節點增加一個color值來簡化平衡過程,因此每個節點可能是紅或者黑。 GCC和MSVC中的std::map和std::set模板的實現均采用了紅黑樹。 std::set只包含鍵。std::map是set的擴展,它在每個節點中包含一個值。
## 34.4.1 MSVC
```
#include <map>
#include <set>
#include <string>
#include <iostream>
// struct is not packed!
struct tree_node
{
struct tree_node *Left;
struct tree_node *Parent;
struct tree_node *Right;
char Color; // 0 - Red, 1 - Black
char Isnil;
//std::pair Myval;
unsigned int first; // called Myval in std::set
const char *second; // not present in std::set
};
struct tree_struct
{
struct tree_node *Myhead;
size_t Mysize;
};
void dump_tree_node (struct tree_node *n, bool is_set, bool traverse)
{
printf ("ptr=0x%p Left=0x%p Parent=0x%p Right=0x%p Color=%d Isnil=%d\n",
n, n->Left, n->Parent, n->Right, n->Color, n->Isnil);
if (n->Isnil==0)
{
if (is_set)
printf ("first=%d\n", n->first);
else
printf ("first=%d second=[%s]\n", n->first, n->second);
}
if (traverse)
{
if (n->Isnil==1)
dump_tree_node (n->Parent, is_set, true);
else
{
if (n->Left->Isnil==0)
dump_tree_node (n->Left, is_set, true);
if (n->Right->Isnil==0)
dump_tree_node (n->Right, is_set, true);
};
};
};
const char* ALOT_OF_TABS="\t\t\t\t\t\t\t\t\t\t\t";
void dump_as_tree (int tabs, struct tree_node *n, bool is_set)
{
if (is_set)
printf ("%d\n", n->first);
else
printf ("%d [%s]\n", n->first, n->second);
if (n->Left->Isnil==0)
{
printf ("%.*sL-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->Left, is_set);
};
if (n->Right->Isnil==0)
{
printf ("%.*sR-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->Right, is_set);
};
};
void dump_map_and_set(struct tree_struct *m, bool is_set)
{
printf ("ptr=0x%p, Myhead=0x%p, Mysize=%d\n", m, m->Myhead, m->Mysize);
dump_tree_node (m->Myhead, is_set, true);
printf ("As a tree:\n");
printf ("root----");
dump_as_tree (1, m->Myhead->Parent, is_set);
};
int main()
{
// map
std::map<int, const char*> m;
m[10]="ten";
m[20]="twenty";
m[3]="three";
m[101]="one hundred one";
m[100]="one hundred";
m[12]="twelve";
m[107]="one hundred seven";
m[0]="zero";
m[1]="one";
m[6]="six";
m[99]="ninety-nine";
m[5]="five";
m[11]="eleven";
m[1001]="one thousand one";
m[1010]="one thousand ten";
m[2]="two";
m[9]="nine";
printf ("dumping m as map:\n");
dump_map_and_set ((struct tree_struct *)(void*)&m, false);
std::map<int, const char*>::iterator it1=m.begin();
printf ("m.begin():\n");
dump_tree_node ((struct tree_node *)*(void**)&it1, false, false);
it1=m.end();
printf ("m.end():\n");
dump_tree_node ((struct tree_node *)*(void**)&it1, false, false);
// set
std::set<int> s;
s.insert(123);
s.insert(456);
s.insert(11);
s.insert(12);
s.insert(100);
s.insert(1001);
printf ("dumping s as set:\n");
dump_map_and_set ((struct tree_struct *)(void*)&s, true);
std::set<int>::iterator it2=s.begin();
printf ("s.begin():\n");
dump_tree_node ((struct tree_node *)*(void**)&it2, true, false);
it2=s.end();
printf ("s.end():\n");
dump_tree_node ((struct tree_node *)*(void**)&it2, true, false);
};
Listing 34.13: MSVC 2012
dumping m as map:
ptr=0x0020FE04, Myhead=0x005BB3A0, Mysize=17
ptr=0x005BB3A0 Left=0x005BB4A0 Parent=0x005BB3C0 Right=0x005BB580 Color=1 Isnil=1
ptr=0x005BB3C0 Left=0x005BB4C0 Parent=0x005BB3A0 Right=0x005BB440 Color=1 Isnil=0
first=10 second=[ten]
ptr=0x005BB4C0 Left=0x005BB4A0 Parent=0x005BB3C0 Right=0x005BB520 Color=1 Isnil=0
first=1 second=[one]
ptr=0x005BB4A0 Left=0x005BB3A0 Parent=0x005BB4C0 Right=0x005BB3A0 Color=1 Isnil=0
first=0 second=[zero]
ptr=0x005BB520 Left=0x005BB400 Parent=0x005BB4C0 Right=0x005BB4E0 Color=0 Isnil=0
first=5 second=[five]
ptr=0x005BB400 Left=0x005BB5A0 Parent=0x005BB520 Right=0x005BB3A0 Color=1 Isnil=0
first=3 second=[three]
ptr=0x005BB5A0 Left=0x005BB3A0 Parent=0x005BB400 Right=0x005BB3A0 Color=0 Isnil=0
first=2 second=[two]
ptr=0x005BB4E0 Left=0x005BB3A0 Parent=0x005BB520 Right=0x005BB5C0 Color=1 Isnil=0
first=6 second=[six]
ptr=0x005BB5C0 Left=0x005BB3A0 Parent=0x005BB4E0 Right=0x005BB3A0 Color=0 Isnil=0
first=9 second=[nine]
ptr=0x005BB440 Left=0x005BB3E0 Parent=0x005BB3C0 Right=0x005BB480 Color=1 Isnil=0
first=100 second=[one hundred]
ptr=0x005BB3E0 Left=0x005BB460 Parent=0x005BB440 Right=0x005BB500 Color=0 Isnil=0
first=20 second=[twenty]
ptr=0x005BB460 Left=0x005BB540 Parent=0x005BB3E0 Right=0x005BB3A0 Color=1 Isnil=0
first=12 second=[twelve]
ptr=0x005BB540 Left=0x005BB3A0 Parent=0x005BB460 Right=0x005BB3A0 Color=0 Isnil=0
first=11 second=[eleven]
ptr=0x005BB500 Left=0x005BB3A0 Parent=0x005BB3E0 Right=0x005BB3A0 Color=1 Isnil=0
first=99 second=[ninety-nine]
ptr=0x005BB480 Left=0x005BB420 Parent=0x005BB440 Right=0x005BB560 Color=0 Isnil=0
first=107 second=[one hundred seven]
ptr=0x005BB420 Left=0x005BB3A0 Parent=0x005BB480 Right=0x005BB3A0 Color=1 Isnil=0
first=101 second=[one hundred one]
ptr=0x005BB560 Left=0x005BB3A0 Parent=0x005BB480 Right=0x005BB580 Color=1 Isnil=0
first=1001 second=[one thousand one]
ptr=0x005BB580 Left=0x005BB3A0 Parent=0x005BB560 Right=0x005BB3A0 Color=0 Isnil=0
first=1010 second=[one thousand ten]
As a tree:
root----10 [ten]
L-------1 [one]
L-------0 [zero]
R-------5 [five]
L-------3 [three]
L-------2 [two]
R-------6 [six]
R-------9 [nine]
R-------100 [one hundred]
L-------20 [twenty]
L-------12 [twelve]
L-------11 [eleven]
R-------99 [ninety-nine]
R-------107 [one hundred seven]
L-------101 [one hundred one]
R-------1001 [one thousand one]
R-------1010 [one thousand ten]
m.begin():
ptr=0x005BB4A0 Left=0x005BB3A0 Parent=0x005BB4C0 Right=0x005BB3A0 Color=1 Isnil=0
first=0 second=[zero]
m.end():
ptr=0x005BB3A0 Left=0x005BB4A0 Parent=0x005BB3C0 Right=0x005BB580 Color=1 Isnil=1
dumping s as set:
ptr=0x0020FDFC, Myhead=0x005BB5E0, Mysize=6
ptr=0x005BB5E0 Left=0x005BB640 Parent=0x005BB600 Right=0x005BB6A0 Color=1 Isnil=1
ptr=0x005BB600 Left=0x005BB660 Parent=0x005BB5E0 Right=0x005BB620 Color=1 Isnil=0
first=123
ptr=0x005BB660 Left=0x005BB640 Parent=0x005BB600 Right=0x005BB680 Color=1 Isnil=0
first=12
ptr=0x005BB640 Left=0x005BB5E0 Parent=0x005BB660 Right=0x005BB5E0 Color=0 Isnil=0
first=11
ptr=0x005BB680 Left=0x005BB5E0 Parent=0x005BB660 Right=0x005BB5E0 Color=0 Isnil=0
first=100
ptr=0x005BB620 Left=0x005BB5E0 Parent=0x005BB600 Right=0x005BB6A0 Color=1 Isnil=0
first=456
ptr=0x005BB6A0 Left=0x005BB5E0 Parent=0x005BB620 Right=0x005BB5E0 Color=0 Isnil=0
first=1001
As a tree:
root----123
L-------12
L-------11
R-------100
R-------456
R-------1001
s.begin():
ptr=0x005BB640 Left=0x005BB5E0 Parent=0x005BB660 Right=0x005BB5E0 Color=0 Isnil=0
first=11
s.end():
ptr=0x005BB5E0 Left=0x005BB640 Parent=0x005BB600 Right=0x005BB6A0 Color=1 Isnil=1
```
結構體沒有打包,因此所有的char類型都占用4個字節。 對于std::map,first和second變量可以看做是一個std::pair型變量。而在std::set中,std::pair只有一個變量。 當前樹的節點個數總被保存著,這與MSVC中的std::list實現方式一樣。 和std::list一樣,迭代器只是指向節點的指針。.begin()迭代器指向最小的鍵。.begin()指針沒有存儲在某個位置(和list一樣),樹中最小的key總是可以找到。當節點有前一個和后一個節點時,- -和++操作會將迭代器移動到前一個或者后一個節點。關于這些操作的算法在文獻七中進行了描述。 .end()迭代器指向根節點,它的Isnil的值為1,即這個節點沒有鍵和值。
## 34.4.2 GCC
```
#include <stdio.h>
#include <map>
#include <set>
#include <string>
#include <iostream>
struct map_pair
{
int key;
const char *value;
};
struct tree_node
{
int M_color; // 0 - Red, 1 - Black
struct tree_node *M_parent;
struct tree_node *M_left;
struct tree_node *M_right;
};
struct tree_struct
{
int M_key_compare;
struct tree_node M_header;
size_t M_node_count;
};
void dump_tree_node (struct tree_node *n, bool is_set, bool traverse, bool dump_keys_and_values)
{
printf ("ptr=0x%p M_left=0x%p M_parent=0x%p M_right=0x%p M_color=%d\n",
n, n->M_left, n->M_parent, n->M_right, n->M_color);
void *point_after_struct=((char*)n)+sizeof(struct tree_node);
if (dump_keys_and_values)
{
if (is_set)
printf ("key=%d\n", *(int*)point_after_struct);
else
{
struct map_pair *p=(struct map_pair *)point_after_struct;
printf ("key=%d value=[%s]\n", p->key, p->value);
};
};
if (traverse==false)
return;
if (n->M_left)
dump_tree_node (n->M_left, is_set, traverse, dump_keys_and_values);
if (n->M_right)
dump_tree_node (n->M_right, is_set, traverse, dump_keys_and_values);
};
const char* ALOT_OF_TABS="\t\t\t\t\t\t\t\t\t\t\t";
void dump_as_tree (int tabs, struct tree_node *n, bool is_set)
{
void *point_after_struct=((char*)n)+sizeof(struct tree_node);
if (is_set)
printf ("%d\n", *(int*)point_after_struct);
else
{
struct map_pair *p=(struct map_pair *)point_after_struct;
printf ("%d [%s]\n", p->key, p->value);
}
if (n->M_left)
{
printf ("%.*sL-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->M_left, is_set);
};
if (n->M_right)
{
printf ("%.*sR-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->M_right, is_set);
};
};
void dump_map_and_set(struct tree_struct *m, bool is_set)
{
printf ("ptr=0x%p, M_key_compare=0x%x, M_header=0x%p, M_node_count=%d\n",
m, m->M_key_compare, &m->M_header, m->M_node_count);
dump_tree_node (m->M_header.M_parent, is_set, true, true);
printf ("As a tree:\n");
printf ("root----");
dump_as_tree (1, m->M_header.M_parent, is_set);
};
int main()
{
// map
std::map<int, const char*> m;
m[10]="ten";
m[20]="twenty";
m[3]="three";
m[101]="one hundred one";
m[100]="one hundred";
m[12]="twelve";
m[107]="one hundred seven";
m[0]="zero";
m[1]="one";
m[6]="six";
m[99]="ninety-nine";
m[5]="five";
m[11]="eleven";
m[1001]="one thousand one";
m[1010]="one thousand ten";
m[2]="two";
m[9]="nine";
printf ("dumping m as map:\n");
dump_map_and_set ((struct tree_struct *)(void*)&m, false);
std::map<int, const char*>::iterator it1=m.begin();
printf ("m.begin():\n");
dump_tree_node ((struct tree_node *)*(void**)&it1, false, false, true);
it1=m.end();
printf ("m.end():\n");
dump_tree_node ((struct tree_node *)*(void**)&it1, false, false, false);
// set
std::set<int> s;
s.insert(123);
s.insert(456);
s.insert(11);
s.insert(12);
s.insert(100);
s.insert(1001);
printf ("dumping s as set:\n");
dump_map_and_set ((struct tree_struct *)(void*)&s, true);
std::set<int>::iterator it2=s.begin();
printf ("s.begin():\n");
dump_tree_node ((struct tree_node *)*(void**)&it2, true, false, true);
it2=s.end();
printf ("s.end():\n");
dump_tree_node ((struct tree_node *)*(void**)&it2, true, false, false);
};
dumping m as map:
ptr=0x0028FE3C, M_key_compare=0x402b70, M_header=0x0028FE40, M_node_count=17
ptr=0x007A4988 M_left=0x007A4C00 M_parent=0x0028FE40 M_right=0x007A4B80 M_color=1
key=10 value=[ten]
ptr=0x007A4C00 M_left=0x007A4BE0 M_parent=0x007A4988 M_right=0x007A4C60 M_color=1
key=1 value=[one]
ptr=0x007A4BE0 M_left=0x00000000 M_parent=0x007A4C00 M_right=0x00000000 M_color=1
key=0 value=[zero]
ptr=0x007A4C60 M_left=0x007A4B40 M_parent=0x007A4C00 M_right=0x007A4C20 M_color=0
key=5 value=[five]
ptr=0x007A4B40 M_left=0x007A4CE0 M_parent=0x007A4C60 M_right=0x00000000 M_color=1
key=3 value=[three]
ptr=0x007A4CE0 M_left=0x00000000 M_parent=0x007A4B40 M_right=0x00000000 M_color=0
key=2 value=[two]
ptr=0x007A4C20 M_left=0x00000000 M_parent=0x007A4C60 M_right=0x007A4D00 M_color=1
key=6 value=[six]
ptr=0x007A4D00 M_left=0x00000000 M_parent=0x007A4C20 M_right=0x00000000 M_color=0
key=9 value=[nine]
ptr=0x007A4B80 M_left=0x007A49A8 M_parent=0x007A4988 M_right=0x007A4BC0 M_color=1
key=100 value=[one hundred]
ptr=0x007A49A8 M_left=0x007A4BA0 M_parent=0x007A4B80 M_right=0x007A4C40 M_color=0
key=20 value=[twenty]
ptr=0x007A4BA0 M_left=0x007A4C80 M_parent=0x007A49A8 M_right=0x00000000 M_color=1
key=12 value=[twelve]
ptr=0x007A4C80 M_left=0x00000000 M_parent=0x007A4BA0 M_right=0x00000000 M_color=0
key=11 value=[eleven]
ptr=0x007A4C40 M_left=0x00000000 M_parent=0x007A49A8 M_right=0x00000000 M_color=1
key=99 value=[ninety-nine]
ptr=0x007A4BC0 M_left=0x007A4B60 M_parent=0x007A4B80 M_right=0x007A4CA0 M_color=0
key=107 value=[one hundred seven]
ptr=0x007A4B60 M_left=0x00000000 M_parent=0x007A4BC0 M_right=0x00000000 M_color=1
key=101 value=[one hundred one]
ptr=0x007A4CA0 M_left=0x00000000 M_parent=0x007A4BC0 M_right=0x007A4CC0 M_color=1
key=1001 value=[one thousand one]
ptr=0x007A4CC0 M_left=0x00000000 M_parent=0x007A4CA0 M_right=0x00000000 M_color=0
key=1010 value=[one thousand ten]
As a tree:
root----10 [ten]
L-------1 [one]
L-------0 [zero]
R-------5 [five]
L-------3 [three]
L-------2 [two]
R-------6 [six]
R-------9 [nine]
R-------100 [one hundred]
L-------20 [twenty]
L-------12 [twelve]
L-------11 [eleven]
R-------99 [ninety-nine]
R-------107 [one hundred seven]
L-------101 [one hundred one]
R-------1001 [one thousand one]
R-------1010 [one thousand ten]
m.begin():
ptr=0x007A4BE0 M_left=0x00000000 M_parent=0x007A4C00 M_right=0x00000000 M_color=1
key=0 value=[zero]
m.end():
ptr=0x0028FE40 M_left=0x007A4BE0 M_parent=0x007A4988 M_right=0x007A4CC0 M_color=0
dumping s as set:
ptr=0x0028FE20, M_key_compare=0x8, M_header=0x0028FE24, M_node_count=6
ptr=0x007A1E80 M_left=0x01D5D890 M_parent=0x0028FE24 M_right=0x01D5D850 M_color=1
key=123
ptr=0x01D5D890 M_left=0x01D5D870 M_parent=0x007A1E80 M_right=0x01D5D8B0 M_color=1
key=12
ptr=0x01D5D870 M_left=0x00000000 M_parent=0x01D5D890 M_right=0x00000000 M_color=0
key=11
ptr=0x01D5D8B0 M_left=0x00000000 M_parent=0x01D5D890 M_right=0x00000000 M_color=0
key=100
ptr=0x01D5D850 M_left=0x00000000 M_parent=0x007A1E80 M_right=0x01D5D8D0 M_color=1
key=456
ptr=0x01D5D8D0 M_left=0x00000000 M_parent=0x01D5D850 M_right=0x00000000 M_color=0
key=1001
As a tree:
root----123
L-------12
L-------11
R-------100
R-------456
R-------1001
s.begin():
ptr=0x01D5D870 M_left=0x00000000 M_parent=0x01D5D890 M_right=0x00000000 M_color=0
key=11
s.end():
ptr=0x0028FE24 M_left=0x01D5D870 M_parent=0x007A1E80 M_right=0x01D5D8D0 M_color=0
```
GCC的實現也很類似。唯一一的不同點就是沒有Isnil元素,因此占用的空間要小于MSVC的實現。跟節點也是作為.end()迭代器指向的位置,同時也不包含鍵和值。
## 34.4.3 重新平衡的演示
下面的例子將向我們展示插入節點后,樹如何重新平衡。
```
#include <stdio.h>
#include <map>
#include <set>
#include <string>
#include <iostream>
struct map_pair
{
int key;
const char *value;
};
struct tree_node
{
int M_color; // 0 - Red, 1 - Black
struct tree_node *M_parent;
struct tree_node *M_left;
struct tree_node *M_right;
};
struct tree_struct
{
int M_key_compare;
struct tree_node M_header;
size_t M_node_count;
};
const char* ALOT_OF_TABS="\t\t\t\t\t\t\t\t\t\t\t";
void dump_as_tree (int tabs, struct tree_node *n)
{
void *point_after_struct=((char*)n)+sizeof(struct tree_node);
printf ("%d\n", *(int*)point_after_struct);
if (n->M_left)
{
printf ("%.*sL-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->M_left);
};
if (n->M_right)
{
printf ("%.*sR-------", tabs, ALOT_OF_TABS);
dump_as_tree (tabs+1, n->M_right);
};
};
void dump_map_and_set(struct tree_struct *m)
{
printf ("root----");
dump_as_tree (1, m->M_header.M_parent);
};
int main()
{
std::set<int> s;
s.insert(123);
s.insert(456);
printf ("123, 456 are inserted\n");
dump_map_and_set ((struct tree_struct *)(void*)&s);
s.insert(11);
s.insert(12);
printf ("\n");
printf ("11, 12 are inserted\n");
dump_map_and_set ((struct tree_struct *)(void*)&s);
s.insert(100);
s.insert(1001);
printf ("\n");
printf ("100, 1001 are inserted\n");
dump_map_and_set ((struct tree_struct *)(void*)&s);
s.insert(667);
s.insert(1);
s.insert(4);
s.insert(7);
printf ("\n");
printf ("667, 1, 4, 7 are inserted\n");
dump_map_and_set ((struct tree_struct *)(void*)&s);
printf ("\n");
};
123, 456 are inserted
root----123
R-------456
11, 12 are inserted
root----123
L-------11
R-------12
R-------456
100, 1001 are inserted
root----123
L-------12
L-------11
R-------100
R-------456
R-------1001
667, 1, 4, 7 are inserted
root----12
L-------4
L-------1
R-------11
L-------7
R-------123
L-------100
R-------667
L-------456
R-------1001
```
- 第一章 CPU簡介
- 第二章 Hello,world!
- 第三章? 函數開始和結束
- 第四章 棧
- Chapter 5 printf() 與參數處理
- Chapter 6 scanf()
- CHAPER7 訪問傳遞參數
- Chapter 8 一個或者多個字的返回值
- Chapter 9 指針
- Chapter 10 條件跳轉
- 第11章 選擇結構switch()/case/default
- 第12章 循環結構
- 第13章 strlen()
- Chapter 14 Division by 9
- chapter 15 用FPU工作
- Chapter 16 數組
- Chapter 17 位域
- 第18章 結構體
- 19章 聯合體
- 第二十章 函數指針
- 第21章 在32位環境中的64位值
- 第二十二章 SIMD
- 23章 64位化
- 24章 使用x64下的SIMD來處理浮點數
- 25章 溫度轉換
- 26章 C99的限制
- 27章 內聯函數
- 第28章 得到不正確反匯編結果
- 第29章 花指令
- 第30章 16位Windows
- 第31章 類
- 三十二 ostream