MySQL · Bug Hunting · Crash Caused by InnoDB Incompatibility Between 5.6 and 5.5 · Database Kernel Monthly Report

## Bug background

RDS uses Percona-XtraBackup (PXB below) as its backup tool. The package ships two important tools, innobackupex and xtrabackup. The latter is a binary compiled from C that backs up InnoDB data. The former is a Perl script that wraps the latter and also backs up non-InnoDB data. The xtrabackup binary embeds the InnoDB engine, which is why it handles InnoDB data well.

Before version 2.2, PXB built a separate xtrabackup binary against each MySQL source version (5.1/5.5/5.6) so that data from each MySQL version could be backed up. Starting with 2.2, the PXB team decided that 5.6 was compatible enough with 5.1 and 5.5, and built a single xtrabackup binary against the 5.6.22 source only. For details on this change, see the official blueprint [#single binary](https://blueprints.launchpad.net/percona-xtrabackup/+spec/single-binary). That is where this story begins.

## Bug description

When we restored a backup set taken from a 5.5 instance, xtrabackup crashed with the following error:

~~~
2015-05-12 19:03:08 7fd9d8258720 InnoDB: Assertion failure in thread 140573610968864 in file pars0pars.cc line 865
InnoDB: Failing assertion: sym_node->table != NULL
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
11:03:08 UTC - xtrabackup got signal 6 ;
This could be because you hit a bug or data is corrupted.
This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Thread pointer: 0x176aeb0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x10000
xtrabackup(my_print_stacktrace+0x35) [0x9f5331]
xtrabackup(handle_fatal_signal+0x2bb) [0x7f801b]
/lib64/libpthread.so.0() [0x3530c0f500]
/lib64/libc.so.6(gsignal+0x35) [0x35f40328a5]
/lib64/libc.so.6(abort+0x175) [0x35f4034085]
xtrabackup() [0x76bfb0]
xtrabackup(pars_update_statement(upd_node_t*, sym_node_t*, void*)+0x30) [0x76c8d8]
xtrabackup(yyparse()+0xcb1) [0xa5ef27]
xtrabackup(pars_sql(pars_info_t*, char const*)+0xaf) [0x76e06d]
xtrabackup(que_eval_sql(pars_info_t*, char const*, unsigned long, trx_t*)+0x85) [0x78eeb2]
xtrabackup(row_drop_table_for_mysql(char const*, trx_t*, bool, bool)+0xa98) [0x720a0c]
xtrabackup(row_mysql_drop_temp_tables()+0x24c) [0x721503]
xtrabackup(recv_recovery_rollback_active()+0x2c) [0x753ebe]
xtrabackup(innobase_start_or_create_for_mysql()+0x17aa) [0x7293c4]
xtrabackup() [0x607a00]
xtrabackup() [0x610204]
xtrabackup(main+0x8b8) [0x611674]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x35f401ecdd]
xtrabackup() [0x604369]
~~~

## Bug analysis

The crash message `InnoDB: Assertion failure in thread 140573610968864 in file pars0pars.cc line 865` pinpoints the code that crashed:

~~~
862| sym_node->table = dict_table_open_on_name(
863|         sym_node->name, TRUE, FALSE, DICT_ERR_IGNORE_NONE);
864|
865| ut_a(sym_node->table != NULL);
~~~

InnoDB tries to open a table, the open fails, and the assertion fires. The call stack shows how we got there: when xtrabackup restores data, it starts the embedded InnoDB engine, and while rolling back active transactions (`recv_recovery_rollback_active`) it drops every temporary table that existed at backup time (`row_mysql_drop_temp_tables`). Dropping a table means removing not only the table itself but also its records in the InnoDB system tables. Those records are deleted by executing internal SQL through `que_eval_sql`, and that SQL contains the following fragment:

~~~
"DELETE FROM SYS_TABLESPACES\n"
"WHERE SPACE = space_id;\n"
"DELETE FROM SYS_DATAFILES\n"
"WHERE SPACE = space_id;\n"
~~~

The system tables SYS_TABLESPACES and SYS_DATAFILES were introduced in 5.6 and do not exist in 5.5. When the statement is parsed against 5.5 data, `dict_table_open_on_name` cannot open SYS_TABLESPACES and returns NULL, so the assertion fails and xtrabackup crashes.

## Bug fix

Here is a simple fix: before calling `que_eval_sql` to delete the records, check whether the data being restored is from 5.6. If it is, pass the original SQL. If it is not, pass SQL with the SYS_TABLESPACES and SYS_DATAFILES statements removed.

The PXB team has confirmed this bug as [#1399471](https://bugs.launchpad.net/percona-xtrabackup/+bug/1399471). It has not been fixed yet and should be fixed in the next release. If you cannot wait, there are two options:

1. Code fix: apply the method described here.
2. Version rollback: when restoring, temporarily replace xtrabackup with the xtrabackup_55 binary from PXB 2.1.

## Bug impact

From the analysis above, the bug is triggered only when all of the following conditions hold:

1. The PXB version is 2.2 or later.
2. The MySQL version is 5.5 or 5.1.
3. At backup time, there is a session holding a temporary table.

When all of these conditions are met, restoring the data crashes.

The .ibd files copied by PXB are not consistent to a single point in time, so the restore has to apply redo logs to bring the data to a consistent point. In effect this is the same as MySQL shutting down abnormally with buffer pool data not yet flushed to disk, then restarting and applying redo for crash recovery. That is why this scenario mostly shows up with PXB. A normally operated mysqld generally does not hit it: during an upgrade we shut mysqld down cleanly, at which point all temporary tables have already been cleaned up, and only then start the new mysqld. The new mysqld therefore has no temporary tables to drop at startup and does not crash.
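The fix proposed in the bug fix section (pass the 5.6-only statements to `que_eval_sql` only when the data is from 5.6) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual PXB patch: `build_drop_table_sql`, `common_sql`, and the `data_is_56` flag are hypothetical names, and how xtrabackup decides whether the data comes from 5.6 is left open here.

```cpp
#include <cassert>
#include <string>

// The two statements quoted in the analysis; they only make sense when the
// SYS_TABLESPACES and SYS_DATAFILES system tables exist (5.6 data).
static const char* const k56OnlyCleanupSql =
    "DELETE FROM SYS_TABLESPACES\n"
    "WHERE SPACE = space_id;\n"
    "DELETE FROM SYS_DATAFILES\n"
    "WHERE SPACE = space_id;\n";

// Hypothetical helper, not part of InnoDB or PXB.
// common_sql: the cleanup statements that are valid on every version.
// data_is_56: assumed flag, true when the restored data was written by 5.6.
std::string build_drop_table_sql(const std::string& common_sql,
                                 bool data_is_56) {
    std::string sql = common_sql;
    if (data_is_56) {
        // Only 5.6 data has the two extra system tables to clean up.
        sql += k56OnlyCleanupSql;
    }
    // Defensive check in the spirit of ut_a(): SQL built for 5.1/5.5 data
    // must never reference the 5.6-only system tables.
    assert(data_is_56 || sql.find("SYS_TABLESPACES") == std::string::npos);
    assert(data_is_56 || sql.find("SYS_DATAFILES") == std::string::npos);
    return sql;
}
```

The caller would then hand the returned string to `que_eval_sql` in place of the fixed SQL text. Keeping the version check in one helper means the 5.6 path stays byte-for-byte the same as before, while 5.1/5.5 data never reaches the parser with a reference to a table it cannot open.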