5---information_schema Is Not the InnoDB Data Dictionary · Thinking About the MySQL Kernel, Beginner Series

Last time we saw that some pages in the InnoDB buffer pool were already in use, and that some of those were used by the data dictionary. So what is the data dictionary? bingxi and alex keep thinking.

## 1) information_schema is not the InnoDB data dictionary

bingxi: "alex, I don't think information_schema stores the data dictionary. To be precise, let me rephrase: information_schema is not the InnoDB data dictionary."

alex: "Right. InnoDB has always had the concept of a data dictionary, whereas information_schema only appeared in MySQL 5. So information_schema is not the InnoDB data dictionary."

bingxi: "alex, that argument is a bit of a stretch. Let's start with an example. The manual contains this passage:

23.4. The INFORMATION_SCHEMA STATISTICS Table
The STATISTICS table provides information about table indexes.

In other words, information_schema.statistics stores information about table indexes. Let's create a table t1 in the test database with an index on id:

~~~
create table test.t1
(
id int,
name varchar(20),
key it1id(id)
)engine=innodb;
~~~

Next we query the statistics table for t1's index information:

~~~
mysql> select * from information_schema.statistics where table_name='t1' \G;
*************************** 1. row ***************************
TABLE_CATALOG: NULL
 TABLE_SCHEMA: test
   TABLE_NAME: t1
   NON_UNIQUE: 1
 INDEX_SCHEMA: test
   INDEX_NAME: it1id
 SEQ_IN_INDEX: 1
  COLUMN_NAME: id
    COLLATION: A
  CARDINALITY: 0
     SUB_PART: NULL
       PACKED: NULL
     NULLABLE: YES
   INDEX_TYPE: BTREE
      COMMENT:
1 row in set (0.02 sec)

ERROR:
No query specified
~~~

This shows us the index information. Does t1 really have only one index? Heh, I'll keep you in suspense; we'll come back to that when we discuss the InnoDB data dictionary. For now, let's focus on the it1id index. The output does tell us something about the index, but this is not a data dictionary table: it only lets users inspect things from the outside, and the MySQL kernel cannot work from it. For example, where in the data file is the index stored? Without the root page information there is no way to use the index. Now look at what the real InnoDB data dictionary contains (see the file D:/mysql-5.1.7-beta/storage/innobase/include/dict0mem.h):

~~~
/* Data structure for an index */
struct dict_index_struct{
  ……
  dict_table_t*   table;      // points to the dictionary entry of the owning table
  ulint           space;      // space in which the index resides
  ……
  dict_tree_t*    tree;       // index tree structure
  ……
};

/* Data structure for an index tree */
struct dict_tree_struct{
  ……
  ulint           space;      // space in which the index resides
  ulint           page;       // page number of the index root node
  ……
};
~~~

With space and page we can actually reach the index."

alex: "+1, exactly. show create also tells us that these tables are temporary tables.

~~~
mysql> show create table information_schema.tables \G;
*************************** 1. row ***************************
       Table: TABLES
Create Table: CREATE TEMPORARY TABLE `TABLES` (
  `TABLE_CATALOG` varchar(512) default NULL,
  ……
) ENGINE=MEMORY DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

ERROR:
No query specified
~~~

"

bingxi: "Yes."

## 2) Analyzing the contents of information_schema

alex: "bingxi, even though information_schema is not InnoDB's data dictionary, let's explore the code behind it anyway. The main source files are:

D:/mysql-5.1.7-beta/sql/sql_show.h
D:/mysql-5.1.7-beta/sql/sql_show.cpp
"

bingxi: "alex, the file name contains 'show'. Are show status, show variables, show processlist and so on executed in this file as well?"

alex: "Yes, exactly. Let's begin with two data structures. First, the schema_tables array.

~~~
ST_SCHEMA_TABLE schema_tables[]=
{
  {"CHARACTER_SETS", charsets_fields_info, create_schema_table,
   fill_schema_charsets, make_character_sets_old_format, 0, -1, -1, 0},
  ……
  {"STATUS", variables_fields_info, create_schema_table, fill_status,
   make_old_format, 0, -1, -1, 1},
  {"TABLES", tables_fields_info, create_schema_table,
   get_all_tables, make_old_format, get_schema_tables_record, 1, 2, 0},
  {"TABLE_CONSTRAINTS", table_constraints_fields_info, create_schema_table,
   get_all_tables, 0, get_schema_constraints_record, 3, 4, 0},
  ……
};
~~~

The array has 26 elements, whereas information_schema in 5.1.7 has only 22 tables. That is understandable: the array also contains entries such as STATUS and VARIABLES, which do not appear as tables under information_schema (the STATUS entry above ends in 1, which is the hidden flag of the struct shown next). We reach them through show status and show variables instead. Back to the array: each element is an instance of a struct, defined as follows:

~~~
typedef struct st_schema_table
{
  const char* table_name;
  ST_FIELD_INFO *fields_info;
  TABLE *(*create_table)  (THD *thd, struct st_table_list *table_list);
  int (*fill_table) (THD *thd, struct st_table_list *tables, COND *cond);
  int (*old_format) (THD *thd, struct st_schema_table *schema_table);
  int (*process_table) (THD *thd, struct st_table_list *tables,
                        TABLE *table, bool res, const char *base_name,
                        const char *file_name);
  int idx_field1, idx_field2;
  bool hidden;
} ST_SCHEMA_TABLE;
~~~

Take the TABLES table as an example:

~~~
  {"TABLES", tables_fields_info, create_schema_table,
   get_all_tables, make_old_format, get_schema_tables_record, 1, 2, 0},
~~~

Its tables_fields_info is the following array:

~~~
ST_FIELD_INFO tables_fields_info[]=
{
  {"TABLE_CATALOG", FN_REFLEN, MYSQL_TYPE_STRING, 0, 1, 0},
  {"TABLE_SCHEMA",NAME_LEN, MYSQL_TYPE_STRING, 0, 0, 0},
  {"TABLE_NAME", NAME_LEN, MYSQL_TYPE_STRING, 0, 0, "Name"},
  {"TABLE_TYPE", NAME_LEN, MYSQL_TYPE_STRING, 0, 0, 0},
  {"ENGINE", NAME_LEN, MYSQL_TYPE_STRING, 0, 1, "Engine"},
  {"VERSION", 21 , MYSQL_TYPE_LONG, 0, 1, "Version"},
  {"ROW_FORMAT", 10, MYSQL_TYPE_STRING, 0, 1, "Row_format"},
  {"TABLE_ROWS", 21 , MYSQL_TYPE_LONG, 0, 1, "Rows"},
  {"AVG_ROW_LENGTH", 21 , MYSQL_TYPE_LONG, 0, 1, "Avg_row_length"},
  {"DATA_LENGTH", 21 , MYSQL_TYPE_LONG, 0, 1, "Data_length"},
  {"MAX_DATA_LENGTH", 21 , MYSQL_TYPE_LONG, 0, 1, "Max_data_length"},
  {"INDEX_LENGTH", 21 , MYSQL_TYPE_LONG, 0, 1, "Index_length"},
  {"DATA_FREE", 21 , MYSQL_TYPE_LONG, 0, 1, "Data_free"},
  {"AUTO_INCREMENT", 21 , MYSQL_TYPE_LONG, 0, 1, "Auto_increment"},
  {"CREATE_TIME", 0, MYSQL_TYPE_TIMESTAMP, 0, 1, "Create_time"},
  {"UPDATE_TIME", 0, MYSQL_TYPE_TIMESTAMP, 0, 1, "Update_time"},
  {"CHECK_TIME", 0, MYSQL_TYPE_TIMESTAMP, 0, 1, "Check_time"},
  {"TABLE_COLLATION", 64, MYSQL_TYPE_STRING, 0, 1, "Collation"},
  {"CHECKSUM", 21 , MYSQL_TYPE_LONG, 0, 1, "Checksum"},
  {"CREATE_OPTIONS", 255, MYSQL_TYPE_STRING, 0, 1, "Create_options"},
  {"TABLE_COMMENT", 80, MYSQL_TYPE_STRING, 0, 0, "Comment"},
  {0, 0, MYSQL_TYPE_STRING, 0, 0, 0}
};
~~~

This describes the columns of the tables table. Ignoring the terminating row '{0, 0, MYSQL_TYPE_STRING, 0, 0, 0}', compare it with the output of desc tables; the two match."

bingxi: "+1. Let's walk through an example, using show status.

~~~
  {"STATUS", variables_fields_info, create_schema_table, fill_status,
   make_old_format, 0, -1, -1, 1},
// Comparing this entry with the struct, we can tell that:
// create_schema_table fills the slot: TABLE *(*create_table)
// fill_status fills the slot: int (*fill_table)
// make_old_format fills the slot: int (*old_format); no need to debug this one for now
~~~

First look at the function mysql_schema_table, which calls create_schema_table.

~~~
int mysql_schema_table(THD *thd, LEX *lex, TABLE_LIST *table_list)
{
  ……
  // table_list->schema_table points to an st_schema_table
  // whose value is: {"STATUS", variables_fields_info, create_schema_table, fill_status,
  //   make_old_format, 0, -1, -1, 1},
  // so calling create_table here means calling create_schema_table
  if (!(table= table_list->schema_table->create_table(thd, table_list)))
  {
    DBUG_RETURN(1);
  }
  ……
}
~~~

What does create_schema_table do? As the name suggests, it creates a table, here the temporary table for status. The table has two columns: Variable_name and Value. See the code below.

~~~
TABLE *create_schema_table(THD *thd, TABLE_LIST *table_list)
{
  ……
  List<Item> field_list;
  ST_SCHEMA_TABLE *schema_table= table_list->schema_table;
  ST_FIELD_INFO *fields_info= schema_table->fields_info;
  ……
  // fields_info is schema_table->fields_info, which records the columns to query
  // The first fields_info->field_name is 'Variable_name'
  // An item instance is created from it and put into field_list
  // The second fields_info->field_name is 'Value'
  // Another item is created from it and also put into field_list
  // field_list thus describes the columns of the temporary table
  for (; fields_info->field_name; fields_info++)
  {
    ……
    // (the handling that differs by fields_info->field_type is left out here)
    item->max_length= fields_info->field_length * cs->mbmaxlen;
    item->set_name(fields_info->field_name,
                   strlen(fields_info->field_name), cs);
    ……
    field_list.push_back(item);
    item->maybe_null= fields_info->maybe_null;
    field_count++;
  }
  TMP_TABLE_PARAM *tmp_table_param =
    (TMP_TABLE_PARAM*) (thd->calloc(sizeof(TMP_TABLE_PARAM)));
  tmp_table_param->init();
  tmp_table_param->table_charset= cs;
  tmp_table_param->field_count= field_count;
  tmp_table_param->schema_table= 1;
  SELECT_LEX *select_lex= thd->lex->current_select;
  // Call create_tmp_table
  // field_list is among the arguments, so the column list is available
  // table_list->alias is STATUS
  // and so the temporary table gets created
  if (!(table= create_tmp_table(thd, tmp_table_param,
                                field_list, (ORDER*) 0, 0, 0,
                                (select_lex->options | thd->options |
                                 TMP_TABLE_ALL_COLUMNS),
                                HA_POS_ERROR, table_list->alias)))
  ……
}
~~~

The temporary table now exists, but an empty table is not enough, so it has to be filled with values when the query executes.

~~~
void JOIN::exec()
{
  ……
  if ((curr_join->select_lex->options & OPTION_SCHEMA_TABLE) &&
      get_schema_tables_result(curr_join))
  {
    DBUG_VOID_RETURN;
  }
  ……
}
~~~

get_schema_tables_result is where fill_status gets called:

~~~
bool get_schema_tables_result(JOIN *join)
{
  ……
  for (JOIN_TAB *tab= join->join_tab; tab < tmp_join_tab; tab++)
  {
    ……
    // table_list->schema_table points to an st_schema_table
    // whose value is: {"STATUS", variables_fields_info, create_schema_table, fill_status,
    //   make_old_format, 0, -1, -1, 1},
    // so calling fill_table here means calling fill_status
    if (table_list->schema_table->fill_table(thd, table_list,
                                             tab->select_cond))
      result= 1;
    table_list->is_schema_table_processed= TRUE;
    ……
  }
  ……
}
~~~

fill_status then runs and fills in the data.

~~~
int fill_status(THD *thd, TABLE_LIST *tables, COND *cond)
{
  DBUG_ENTER("fill_status");
  LEX *lex= thd->lex;
  const char *wild= lex->wild ? lex->wild->ptr() : NullS;
  int res= 0;
  STATUS_VAR tmp;
  pthread_mutex_lock(&LOCK_status);
  // For show global, calc_sum_of_all_status must be called to sum the values up.
  if (lex->option_type == OPT_GLOBAL)
    calc_sum_of_all_status(&tmp);
  // Insert the data
  res= show_status_array(thd, wild,
                         (SHOW_VAR *)all_status_vars.buffer,
                         OPT_GLOBAL,
                         (lex->option_type == OPT_GLOBAL ?
                          &tmp: &thd->status_var), "",tables->table);
  pthread_mutex_unlock(&LOCK_status);
  DBUG_RETURN(res);
}
~~~

To understand it better, let's also look at show_status_array.

~~~
static bool show_status_array(THD *thd, const char *wild,
                              SHOW_VAR *variables,
                              enum enum_var_type value_type,
                              struct system_status_var *status_var,
                              const char *prefix, TABLE *table)
{
  // The variables argument is the global: (SHOW_VAR *)all_status_vars.buffer
  // so we loop over the variables
  for (; variables->name; variables++)
  {
    ……
    restore_record(table, s->default_values);
    table->field[0]->store(name_buffer, strlen(name_buffer),
                           system_charset_info);
    table->field[1]->store(pos, (uint32) (end - pos), system_charset_info);
    // Insert the record into the table
    if (schema_table_store_record(thd, table))
      DBUG_RETURN(TRUE);
    ……
  }
  ……
}
~~~

At this point the status table holds all the data. Execution then continues and simply displays it."

alex: "I see. The others work in a similar way, though there are differences; tables, for instance, has to scan the data directory, heh."

bingxi: "Yes, they are all much the same."

alex: "My suggestion is to set breakpoints on all the functions in that cpp file and then run each statement, for example select * from information_schema.tables \G. Go through all 22 tables of that schema this way, and also try the show statements: show processlist, show variables, show create table test.t1 and so on."

bingxi: "Yes."

alex: "It's already midnight. Let's get some rest. Good night."

bingxi: "Good night."
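
To recap the mechanism traced in section 2, here is a minimal C++ sketch of the same table-driven dispatch: a descriptor array whose entries carry a create callback and a fill callback, looked up by name and run in two phases. It is only an illustration under simplifying assumptions, not the server code. FieldInfo, TmpTable, SchemaTable, Counter, find_schema_table, materialize and the two counter values are hypothetical stand-ins; only the names create_schema_table, fill_status, variables_fields_info and schema_tables are borrowed from the listings above.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-ins for the server's ST_FIELD_INFO and TABLE types.
struct FieldInfo {
    const char* field_name;   // a null name terminates the array
    unsigned    field_length;
};

struct TmpTable {
    std::vector<std::string> columns;
    std::vector<std::vector<std::string>> rows;
};

// One descriptor per schema table: a name, its columns, and two callbacks,
// loosely mirroring the create_table / fill_table members of ST_SCHEMA_TABLE.
struct SchemaTable {
    const char*      table_name;
    const FieldInfo* fields_info;
    TmpTable (*create_table)(const SchemaTable&);
    int      (*fill_table)(TmpTable&);
    bool             hidden;  // true: not listed as an information_schema table
};

// Generic creator: walk fields_info up to the sentinel, one column per entry.
TmpTable create_schema_table(const SchemaTable& st) {
    TmpTable t;
    for (const FieldInfo* f = st.fields_info; f->field_name; ++f)
        t.columns.push_back(f->field_name);
    return t;
}

// Table-specific filler: made-up counters, one row per counter.
struct Counter { const char* name; long value; };

int fill_status(TmpTable& t) {
    assert(t.columns.size() == 2);  // expects Variable_name and Value
    const Counter counters[] = {{"Questions", 42}, {"Uptime", 3600}};
    for (const Counter& c : counters)
        t.rows.push_back(std::vector<std::string>{
            std::string(c.name), std::to_string(c.value)});
    return 0;  // 0 signals success, as in the listings above
}

static const FieldInfo variables_fields_info[] = {
    {"Variable_name", 80}, {"Value", 255}, {nullptr, 0}};

static const SchemaTable schema_tables[] = {
    {"STATUS", variables_fields_info, create_schema_table, fill_status, true},
    {nullptr, nullptr, nullptr, nullptr, false}};

// Find a descriptor by name; returns nullptr when there is no such table.
const SchemaTable* find_schema_table(const char* name) {
    for (const SchemaTable* st = schema_tables; st->table_name; ++st)
        if (std::strcmp(st->table_name, name) == 0) return st;
    return nullptr;
}

// Two-phase execution: create the empty temporary table, then fill it.
bool materialize(const char* name, TmpTable& out) {
    const SchemaTable* st = find_schema_table(name);
    if (!st) return false;
    out = st->create_table(*st);
    return st->fill_table(out) == 0;
}
```

Calling materialize("STATUS", t) on an empty TmpTable t leaves the two columns Variable_name and Value in t, plus one row per made-up counter. An unknown name returns false without invoking any callback. Adding another table to the sketch only requires a new array entry and, if needed, a new filler, which is why one generic creator can serve many tables.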