最近项目需要做mysql数据迁移,迁移过程中还算顺利,迁移完后启动数据库一切正常,运行到下午时,有业务反馈查询特别慢,数据无法正常展示,通过Navicat连接到数据库,双击打开表正常,但是通过sql查询却一直报错,提示无法连接。
登录到mysql服务器,查看日志发现以下错误:
2023-08-13 15:36:55 0 [ERROR] InnoDB: Corruption of an index tree: table `be_base`.`test_table` index `idx_name_unique_key`, father ptr page no 56124, child page no 51899
PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: SQL NULL;
1: len 11; hex 7878663130363937363435; asc xxf10697645;;
2: len 30; hex 000000000000000000000000000000000000000000000000000000000000; asc ; (total 55 bytes);
2023-08-13 15:36:55 0 [Note] InnoDB: n_owned: 0; heap_no: 144; next rec: 129
PHYSICAL RECORD: n_fields 4; compact format; info bits 16
0: len 18; hex 303030303030303030303030313130303136; asc 000000000000110016;;
1: len 4; hex 38323030; asc 8200;;
2: len 11; hex 7878663130343230333332; asc xxf10420332;;
3: len 4; hex 0000db3c; asc <;;
2023-08-13 15:36:55 0 [Note] InnoDB: n_owned: 0; heap_no: 2; next rec: 175
2023-08-13 15:36:55 0 [ERROR] [FATAL] InnoDB: You should dump + drop + reimport the table to fix the corruption. If the crash happens at database startup. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery. Then dump + drop + reimport.
230813 15:36:55 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.6.4-MariaDB-log
key_buffer_size=536870912
read_buffer_size=2097152
max_used_connections=25
max_threads=8021
thread_count=26
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 82868098 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f309c000a98
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f30b2ffcbc8 thread_stack 0x30000
mysys/stacktrace.c:213(my_print_stacktrace)[0x55807c6d599e]
sql/signal_handler.cc:225(handle_fatal_signal)[0x55807c0a71d7]
sigaction.c:0(__restore_rt)[0x7f3301a28630]
2023-08-13 15:36:56 29 [Note] Start binlog_dump to slave_server(118), pos(maria-bin.002609, 390), using_gtid(0), gtid('')
:0(__GI_raise)[0x7f3300e74387]
:0(__GI_abort)[0x7f3300e75a78]
ut/ut0vec.cc:38(ib_vector_create(ib_alloc_t*, unsigned long, unsigned long))[0x55807c4f3170]
btr/btr0btr.cc:812(page_align)[0x55807c4f8711]
btr/btr0btr.cc:3429(btr_compress(btr_cur_t*, unsigned long, mtr_t*))[0x55807c4fe735]
btr/btr0cur.cc:5457(btr_cur_compress_if_useful(btr_cur_t*, unsigned long, mtr_t*))[0x55807c5130b5]
btr/btr0cur.cc:5900(btr_cur_pessimistic_delete(dberr_t*, unsigned long, btr_cur_t*, unsigned long, bool, mtr_t*))[0x55807c51e0c6]
row/row0purge.cc:455(row_purge_remove_sec_if_poss_tree(purge_node_t*, dict_index_t*, dtuple_t const*))[0x55807c48d5b6]
row/row0purge.cc:666(row_purge_remove_sec_if_poss)[0x55807c48e7b6]
row/row0purge.cc:1227(row_purge)[0x55807c48f485]
que/que0que.cc:653(que_thr_step)[0x55807c44c167]
trx/trx0purge.cc:1221(trx_purge_wait_for_workers_to_complete)[0x55807c4bf582]
srv/srv0srv.cc:1788(srv_do_purge)[0x55807c4aec07]
tpool/task_group.cc:57(tpool::task_group::execute(tpool::task*))[0x55807c602396]
tpool/tpool_generic.cc:550(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x55807c600d41]
??:0(std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >))[0x7f33015c3070]
pthread_create.c:0(start_thread)[0x7f3301a20ea5]
??:0(__clone)[0x7f3300f3c8dd]
根据错误提示:
Corruption of an index tree: table be_base.test_table index idx_name_unique_key, father ptr page no 56124, child page no 51899
大致判断应该是mysql索引损坏导致的,于是根据错误提示尝试删除问题索引,并重启mysql,重新添加索引,问题解决。
后续
本以为问题已彻底解决,结果第二天发现又出现相同的问题,只不过表名跟索引不同,继续尝试删除问题表中的索引,然后尝试添加索引时无法添加,添加索引的语句一直在执行中,无奈只能使用kill pid强制终止,然后尝试重启数据库后,再次添加索引成功。
mysql表检测
由于后续不确定有多少表索引损坏,尝试从网上查询资料发现可以使用如下命令进行检测:
mariadb-check -c 库名 -S /tmp/mariadb13306.sock -uroot -p
如果所有表都OK则一切正常,如果中间有出现报错或未成功执行完毕,则需要去检查mysql日志,里面会有具体错误原因