## MySQL Topic 4: The InnoDB Storage Engine
[TOC]
> InnoDB is MySQL's default storage engine. A table can also request it explicitly:
> `CREATE TABLE t (i INT) ENGINE = InnoDB;`
### 4.1. On-Disk Physical Storage Structure
#### 4.1.1. [Physical Record Structure](https://dev.mysql.com/doc/internals/en/innodb-record-structure.html)
| Name | Size |
| --- | --- |
| Field Start Offsets | (F\*1) or (F\*2) bytes |
| Extra Bytes | 6 bytes |
| Field Contents | depends on content |
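1. Field Start Offsets
For each field, this list stores the offset, relative to the origin (the position where the first Field Contents begins), at which the next field starts, that is, where the field ends. The entries are stored in reverse field order, and each offset is 1 or 2 bytes long.
2. Extra Bytes
The most important item is **1byte_offs_flag**, which indicates whether each entry in Field Start Offsets is 1 or 2 bytes long: 1 means **1-byte offsets**, 0 means **2-byte offsets**.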
| Name | Size | Description |
| --- | --- | --- |
| info_bits: | - |- |
| () | 1 bit |unused or unknown |
| () | 1 bit |unused or unknown |
| deleted_flag | 1 bit |1 if record is deleted |
| min_rec_flag | 1 bit |1 if record is predefined minimum record |
| n_owned | 4 bits |number of records owned by this record |
| heap_no | 13 bits |record's order number in heap of index page |
| n_fields | 10 bits |number of fields in this record, 1 to 1023 |
| **1byte_offs_flag** | 1 bit |1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag) |
| next 16 bits | 16 bits |pointer to next record in page |
| **TOTAL** | 48 bits |- |
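3. Field Contents
Besides the user-defined columns, InnoDB adds up to three system columns to each row (the row ID is added only when the table has no primary key or suitable unique index, as in the example below):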
- row ID
- transaction ID
- rollback pointer
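For example, create table T and insert three rows; the dump that follows shows how each row is stored:
```
CREATE TABLE T (FIELD1 VARCHAR(3),
                FIELD2 VARCHAR(3),
                FIELD3 VARCHAR(3));
INSERT INTO T VALUES ('PP', 'PP', 'PP');
INSERT INTO T VALUES ('Q', 'Q', 'Q');
INSERT INTO T VALUES ('R', NULL, NULL);
```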
```
19 17 15 13 0C 06 Field Start Offsets /* First Row */
00 00 78 0D 02 BF Extra Bytes
00 00 00 00 04 21 System Column #1
00 00 00 00 09 2A System Column #2
80 00 00 00 2D 00 84 System Column #3
50 50 Field1 'PP'
50 50 Field2 'PP'
50 50 Field3 'PP'
16 15 14 13 0C 06 Field Start Offsets /* Second Row */
00 00 80 0D 02 E1 Extra Bytes
00 00 00 00 04 22 System Column #1
00 00 00 00 09 2B System Column #2
80 00 00 00 2D 00 84 System Column #3
51 Field1 'Q'
51 Field2 'Q'
51 Field3 'Q'
94 94 14 13 0C 06 Field Start Offsets /* Third Row */
00 00 88 0D 00 74 Extra Bytes
00 00 00 00 04 23 System Column #1
00 00 00 00 09 2C System Column #2
80 00 00 00 2D 00 84 System Column #3
52 Field1 'R'
```
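The layout described above can be checked against this dump. The following is a minimal Python sketch, not InnoDB code: the helper names `decode_offsets`, `field_lengths`, and `decode_extra_bytes` are hypothetical, the bit positions simply follow the Extra Bytes table, and reading the top bit of a 1-byte offset as a NULL marker is an assumption taken from the internals manual's discussion of the third row.

```python
def decode_offsets(raw):
    """Decode 1-byte Field Start Offsets into (end_offset, is_null) pairs, in field order."""
    # Entries are stored in reverse order: the last byte belongs to the first field.
    # Assumption: the top bit (0x80) of a 1-byte offset flags an SQL NULL.
    return [(b & 0x7F, bool(b & 0x80)) for b in reversed(raw)]


def field_lengths(offsets):
    """Turn end offsets into per-field byte lengths."""
    lengths, start = [], 0
    for end, _is_null in offsets:
        lengths.append(end - start)
        start = end
    return lengths


def decode_extra_bytes(raw):
    """Split the 6 Extra Bytes into the bit fields listed in the table."""
    if len(raw) != 6:
        raise ValueError("Extra Bytes must be exactly 6 bytes")
    v = int.from_bytes(raw, "big")  # 48 bits, most significant bit first
    return {
        "deleted_flag": (v >> 45) & 0x1,
        "min_rec_flag": (v >> 44) & 0x1,
        "n_owned": (v >> 40) & 0xF,
        "heap_no": (v >> 27) & 0x1FFF,
        "n_fields": (v >> 17) & 0x3FF,
        "1byte_offs_flag": (v >> 16) & 0x1,
        "next": v & 0xFFFF,
    }


row1 = decode_offsets(bytes.fromhex("19 17 15 13 0C 06"))
row3 = decode_offsets(bytes.fromhex("94 94 14 13 0C 06"))
print(field_lengths(row1))  # [6, 6, 7, 2, 2, 2]
print(field_lengths(row3))  # [6, 6, 7, 1, 0, 0]
print(decode_extra_bytes(bytes.fromhex("00 00 78 0D 02 BF"))["n_fields"])  # 6
```

The first three lengths (6, 6, 7) are the three system columns; the rest are the user fields, and the two zero-length entries in the third row correspond to the NULL values.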
#### 4.1.2. [Physical Page Structure](https://dev.mysql.com/doc/internals/en/innodb-page-structure.html)
Pages store records. Each page has a fixed size, 16KB by default, and is laid out as follows:
* Fil Header
* Page Header
* Infimum + Supremum Records
* User Records
* Free Space
* Page Directory
* Fil Trailer
1. Fil Header
**FIL_PAGE_PREV** and **FIL_PAGE_NEXT** act as the sibling pointers of the B+tree: they hold the page numbers of the previous and next page in key order.
| Name | Size (bytes) | Remarks |
| --- | --- | --- |
| FIL_PAGE_SPACE | 4 |ID of the space the page is in |
| FIL_PAGE_OFFSET | 4 |ordinal page number from start of space |
| **FIL_PAGE_PREV** | 4 |offset of previous page in key order |
| **FIL_PAGE_NEXT** | 4 |offset of next page in key order |
| FIL_PAGE_LSN | 8|log serial number of page's latest log record |
| FIL_PAGE_TYPE | 2 |current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST |
| FIL_PAGE_FILE_FLUSH_LSN | 8 |"the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file |
| FIL_PAGE_ARCH_LOG_NO | 4 |the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log) |
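To make the table concrete, here is a minimal Python sketch that reads these eight fields from the start of a page. It assumes the fields appear in exactly the order and sizes listed above (38 bytes in total) and are stored big-endian; `FilHeader` and `parse_fil_header` are hypothetical names, and the sample values are made up. Real data files may differ between MySQL versions, so treat this as an illustration of the table rather than a file parser.

```python
import struct
from collections import namedtuple

# Big-endian, no padding: 4+4+4+4+8+2+8+4 = 38 bytes, in the table's order.
FIL_HEADER = struct.Struct(">IIIIQHQI")

FilHeader = namedtuple("FilHeader", [
    "space", "offset", "prev", "next",
    "lsn", "page_type", "file_flush_lsn", "arch_log_no",
])


def parse_fil_header(page):
    """Interpret the first 38 bytes of a page as a Fil Header."""
    if len(page) < FIL_HEADER.size:
        raise ValueError("page is shorter than a Fil Header")
    return FilHeader._make(FIL_HEADER.unpack_from(page, 0))


# Build a fake 16KB page to exercise the parser (all values are invented).
fake_page = FIL_HEADER.pack(0, 5, 4, 6, 123456, 1, 0, 0).ljust(16 * 1024, b"\x00")
header = parse_fil_header(fake_page)
print(header.offset, header.prev, header.next)  # 5 4 6
```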
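#### 4.1.3. B+Tree Structure
InnoDB stores data in a B-tree whose nodes are pages.
On top of the plain [B-tree](http://www.btechsmartclass.com/data_structures/b-trees.html) structure, the **FIL_PAGE_PREV** and **FIL_PAGE_NEXT** pointers let InnoDB move from one leaf node directly to a neighboring leaf without going back to the root each time. This is why InnoDB's index structure is more accurately called a B+tree.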
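The benefit of the sibling pointers shows up in range scans. The sketch below is a toy model in Python, not InnoDB's implementation: the `leaves` dictionary, the page numbers, and the `range_scan` helper are all hypothetical, the `FIL_NULL` sentinel for "no neighbor" is an assumption, and the descent from the root to the starting leaf is assumed to have happened already.

```python
FIL_NULL = 0xFFFFFFFF  # assumed sentinel meaning "no such page"

# Hypothetical leaf pages keyed by page number. Each holds sorted keys
# plus the page numbers stored in FIL_PAGE_PREV / FIL_PAGE_NEXT.
leaves = {
    3: {"keys": [1, 4, 7], "prev": FIL_NULL, "next": 5},
    5: {"keys": [9, 12], "prev": 3, "next": 8},
    8: {"keys": [15, 20, 22], "prev": 5, "next": FIL_NULL},
}


def range_scan(leaves, start_page, low, high):
    """Collect keys in [low, high] by following next pointers from start_page."""
    result = []
    page_no = start_page
    while page_no != FIL_NULL:
        page = leaves[page_no]
        for key in page["keys"]:
            if key > high:
                return result  # keys are ordered, so nothing further can match
            if key >= low:
                result.append(key)
        page_no = page["next"]  # hop to the sibling leaf, never back to the root
    return result


print(range_scan(leaves, 3, 4, 15))  # [4, 7, 9, 12, 15]
```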
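### 4.2. In-Memory Caching
**Buffer Pool**
The buffer pool is an area of main memory in which InnoDB caches table and index data as it is accessed. Because frequently used data can be processed directly from memory, processing is faster.
The buffer pool is implemented as a list of pages managed with a variation of the LRU (least recently used) algorithm, so data that is rarely used is the first to be evicted.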
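The eviction idea can be sketched with a plain LRU cache in Python. This is a simplification under stated assumptions: InnoDB's buffer pool uses a variant of LRU rather than the textbook version shown here, and `LruPageCache`, `fake_read`, and the page numbers are hypothetical names invented for the illustration.

```python
from collections import OrderedDict


class LruPageCache:
    """Plain LRU cache of pages; a simplified stand-in for a buffer pool."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # callable that loads a page from "disk"
        self.pages = OrderedDict()   # page_no -> page, least recently used first

    def get(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)   # hit: mark as most recently used
            return self.pages[page_no]
        page = self.read_page(page_no)        # miss: load from "disk"
        self.pages[page_no] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        return page


disk_reads = []


def fake_read(page_no):
    disk_reads.append(page_no)
    return f"page-{page_no}"


cache = LruPageCache(capacity=2, read_page=fake_read)
for page_no in (1, 2, 1, 3, 2):
    cache.get(page_no)
print(disk_reads)  # [1, 2, 3, 2]
```

The third access to page 1 is served from memory, while page 2 has to be read again because it was the least recently used page when page 3 arrived.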

### 4.3. Other Built-in Storage Engines
```
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                        | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables      | NO           | NO   | NO         |
| MRG_MYISAM         | YES     | Collection of identical MyISAM tables                          | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                             | NO           | NO   | NO         |
| FEDERATED          | NO      | Federated MySQL storage engine                                 | NULL         | NULL | NULL       |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                             | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                          | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, and foreign keys     | YES          | YES  | YES        |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears) | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                         | NO           | NO   | NO         |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
```
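### 4.4. Transactions and Locks in InnoDB
#### 4.4.1. [Transactions](https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/glossary.html#glos_transaction)
**Transactions** are atomic units of work that can be committed (`commit`) or rolled back (`rollback`). When a transaction makes multiple changes to the database, either all of the changes succeed when the transaction is committed, or all of them are undone when it is rolled back.
Of the built-in storage engines listed above, only InnoDB supports transactions. It provides the **ACID** properties: atomicity, consistency, isolation, and durability.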
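The all-or-nothing behavior of commit and rollback can be tried without a MySQL server. The sketch below deliberately swaps in Python's built-in `sqlite3` module as a stand-in, so it illustrates the general idea of an atomic transaction and says nothing about InnoDB internals; the `account` table and the `transfer` and `balances` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` between accounts as one atomic unit; return True on commit."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()      # both updates take effect together
        return True
    except ValueError:
        conn.rollback()    # the first update is undone as well
        return False


def balances(conn):
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]


print(transfer(conn, 1, 2, 30), balances(conn))   # True [70, 30]
print(transfer(conn, 1, 2, 500), balances(conn))  # False [70, 30]
```

The second call fails halfway through, and the rollback leaves both balances exactly as they were before it started.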
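#### 4.4.2. [Row-level locking](https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/glossary.html#glos_row_lock)
With row-level locking, locks are taken on individual rows. Multiple transactions can modify the same table concurrently, but only one transaction at a time can modify a given row; any other transaction must wait until the first one completes and releases its row locks.
Internally, InnoDB also uses **rw-locks** to control access to shared in-memory data structures. An rw-lock comes in three types: shared, exclusive, and shared-exclusive.
- Shared lock (`s-lock`): provides read access to a common resource
- Exclusive lock (`x-lock`): provides write access to a common resource and does not permit inconsistent reads by other threads
- Shared-exclusive lock (`sx-lock`): provides write access to a common resource while still permitting inconsistent reads by other threads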
| | S | SX | X |
| --- | --- | --- | --- |
| S | Compatible | Compatible | Conflict |
| SX | Compatible | Conflict | Conflict |
| X | Conflict | Conflict | Conflict |
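The compatibility rules can be written down as a small lookup. This Python sketch encodes the rw-lock matrix as given in the MySQL glossary (S is compatible with S and SX, every other pairing conflicts); `COMPATIBLE` and `can_grant` are hypothetical names, and the code only models the decision, not any real locking.

```python
# (held, requested) -> True if the request can be granted while `held` is in place.
COMPATIBLE = {
    ("S", "S"): True,   ("S", "SX"): True,   ("S", "X"): False,
    ("SX", "S"): True,  ("SX", "SX"): False, ("SX", "X"): False,
    ("X", "S"): False,  ("X", "SX"): False,  ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode currently held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "SX"]))  # True
print(can_grant("SX", ["SX"]))      # False
print(can_grant("X", []))           # True
```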