【Repost】Who wrote the very long entries in the MySQL error log?

Reposted from: https://www.cnblogs.com/DataArt/p/10260994.html

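【Problem】

While checking MySQL error log files recently, I found that on many servers the log contained a large number of entries like the one below. Each entry is very long (around 200 KB), yet nothing in its content points to an obvious error.

A test server showed the same problem. Why were these entries written, and who wrote them? The analysis took a fairly roundabout path.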

Status information:

Current dir:
Running threads: 2452 Stack size: 262144
Current locks:
lock: 0x7f783f5233f0:

Key caches:
default
Buffer_size: 8388608
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 10
not flushed: 0
w_requests: 6619
writes: 1
r_requests: 275574
reads: 1235

handler status:
read_key: 32241480828
read_next: 451035381896
read_rnd 149361175
read_first: 1090473
write: 4838429521
delete 12155820
update: 3331297842

【Analysis】

1. The official documentation states that when the mysqld process receives a SIGHUP signal, it writes output of this kind:

On Unix, signals can be sent to processes. mysqld responds to signals sent to it as follows:

SIGHUP causes the server to reload the grant tables and to flush tables, logs, the thread cache, and the host cache. These actions are like various forms of the FLUSH statement. The server also writes a status report to the error log that has this format:

https://dev.mysql.com/doc/refman/5.6/en/server-signal-response.html

2. Was some other program sending signals to the mysqld process? I monitored the kill system call with a SystemTap script:

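# Per-thread state carried from the kill() entry probe to the return probe
global target
global signal

probe begin
{
  printf("%-6s %-12s %-5s %-6s %6s\n", "FROM", "COMMAND", "SIG", "TO", "RESULT");
}

# On entry to kill(): remember the target PID and the signal number
probe nd_syscall.kill
{
  target[tid()] = uint_arg(1);
  signal[tid()] = uint_arg(2);
}

# On return from kill(): print one row per call, then clear the state
probe nd_syscall.kill.return
{
  if (target[tid()] != 0) {
    printf("%-6d %-12s %-5d %-6d %6d\n", pid(), execname(),
           signal[tid()], target[tid()], int_arg(1));
    delete target[tid()];
    delete signal[tid()];
  }
}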

Testing with the following command confirmed that it does cause this output to be written to the error log:

kill -SIGHUP 12455

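The SystemTap output shows that process 12455, which is the mysqld process, was the target of the kill call, and that the signal number was 1, which corresponds to SIGHUP.

However, when the problem later recurred in the test environment, no SIGHUP was captured.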

FROM   COMMAND      SIG   TO     RESULT
17010  who          0     12153  1340429600
36681  bash         1     12455     642
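When the trace runs for a long time, picking out the rows that matter by eye gets tedious. The following is a minimal Python sketch for filtering output in the format shown above. The names `KillEvent`, `parse_kill_events`, and `sighup_senders` are hypothetical helpers, not part of SystemTap, and the parser assumes that command names contain no spaces.

```python
from typing import List, NamedTuple

SIGHUP = 1  # signal number 1, as seen in the SIG column above


class KillEvent(NamedTuple):
    sender_pid: int   # FROM column
    command: str      # COMMAND column
    sig: int          # SIG column
    target_pid: int   # TO column


def parse_kill_events(text: str) -> List[KillEvent]:
    """Parse rows of the FROM/COMMAND/SIG/TO/RESULT output into events."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything malformed.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        events.append(
            KillEvent(int(fields[0]), fields[1], int(fields[2]), int(fields[3]))
        )
    return events


def sighup_senders(text: str, target_pid: int) -> List[KillEvent]:
    """Return the events in which SIGHUP was sent to target_pid."""
    return [
        e for e in parse_kill_events(text)
        if e.target_pid == target_pid and e.sig == SIGHUP
    ]
```

Calling `sighup_senders(output, 12455)` on the rows above would return only the `bash` row, which is the manual `kill -SIGHUP` test. An empty result while new status reports keep appearing suggests that signals are not the trigger.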

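3. So kill was apparently not the cause. Next I attached gdb to the mysqld process and set breakpoints on the three entry functions for the error log: sql_print_error, sql_print_warning, and sql_print_information.

But when the problem recurred, the process did not stop at any of the breakpoints.

4. Is there another code path that writes to the error log? The source code gave the answer: the mysql_print_status function writes to the error log directly with printf, bypassing the sql_print_* functions.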

void mysql_print_status()
{
  char current_dir[FN_REFLEN];
  STATUS_VAR current_global_status_var;

  printf("\nStatus information:\n\n");
  (void) my_getwd(current_dir, sizeof(current_dir),MYF(0));
  printf("Current dir: %s\n", current_dir);
  printf("Running threads: %u Stack size: %ld\n",
         Global_THD_manager::get_instance()->get_thd_count(),
         (long) my_thread_stack_size);
  puts("");
  fflush(stdout);
}

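5. I attached gdb to the mysqld process again and set a breakpoint on mysql_print_status. When the problem recurred, the thread stopped at the breakpoint. Comparing ps output captured on several of these occasions showed that mysql_print_status was being called while the pt-stalk tool was running.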
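The comparison of ps snapshots boils down to a set intersection: whatever is running every time the breakpoint fires, and is not part of the normal background, is a suspect. A minimal Python sketch of that idea follows. The function name `common_commands` and the process names in the usage example are illustrative assumptions, not data from the original investigation.

```python
from typing import Iterable, Set


def common_commands(
    snapshots: Iterable[Iterable[str]],
    baseline: Iterable[str] = (),
) -> Set[str]:
    """Return commands present in every snapshot but absent from baseline.

    Each snapshot is the collection of command names seen in one ps capture
    taken while the breakpoint was hit. baseline holds commands that are
    always running and can therefore be ignored.
    """
    sets = [set(s) for s in snapshots]
    if not sets:
        return set()
    return set.intersection(*sets) - set(baseline)
```

For example, `common_commands([["mysqld", "sshd", "pt-stalk"], ["mysqld", "cron", "pt-stalk"]], baseline=["mysqld", "sshd"])` leaves only `pt-stalk`. More snapshots shrink the candidate set further.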

6. The stack trace shows that dispatch_command called mysql_print_status. The relevant logic is shown below: when command is COM_DEBUG, execution reaches mysql_print_status.

case COM_DEBUG:
  thd->status_var.com_other++;
  if (check_global_access(thd, SUPER_ACL))
    break; /* purecov: inspected */
  mysql_print_status();
  query_logger.general_log_print(thd, command, NullS);
  my_eof(thd);
  break;

7. Looking at the pt-stalk code:

if [ "$mysql_error_log" -a ! "$OPT_MYSQL_ONLY" ]; then
   log "The MySQL error log seems to be $mysql_error_log"
   tail -f "$mysql_error_log" >"$d/$p-log_error" &
   tail_error_log_pid=$!
   $CMD_MYSQLADMIN $EXT_ARGV debug
else
   log "Could not find the MySQL error log"
fi

The mysqladmin call uses the debug command, which the mysqladmin help text describes as follows:

debug Instruct server to write debug information to log

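8. A search of the Percona site turned up a matching bug report. At the time of writing the bug was not yet fixed; the fix was scheduled for the next release, 3.0.13.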

https://jira.percona.com/browse/PT-1340

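【Solution】

Once the cause was located, the fix itself was simple: remove the debug argument from the line $CMD_MYSQLADMIN $EXT_ARGV debug in the pt-stalk script. Testing confirmed that the change works.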
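Because the problem appeared on many servers, it can help to check which copies of the script still contain the call. Below is a minimal Python sketch that scans shell script text for lines invoking mysqladmin with debug as the last argument. The function name `find_mysqladmin_debug` and the regular expression are assumptions made for illustration. The check is a heuristic and will miss invocations split across lines or built from other variables.

```python
import re
from typing import List, Tuple

# A command line whose first word contains "mysqladmin" (for example the
# variable $CMD_MYSQLADMIN) and whose last argument is "debug".
MYSQLADMIN_DEBUG = re.compile(r"^\s*\S*mysqladmin\b.*\sdebug\s*$", re.IGNORECASE)


def find_mysqladmin_debug(script_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for mysqladmin ... debug calls."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        # Ignore shell comments.
        if line.lstrip().startswith("#"):
            continue
        if MYSQLADMIN_DEBUG.search(line):
            hits.append((lineno, line.strip()))
    return hits
```

Run against the pt-stalk fragment shown in step 7, this reports the single line `$CMD_MYSQLADMIN $EXT_ARGV debug`. After the fix is applied the result should be empty.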

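Summary:

(1) The status report is written directly to the error log by the mysql_print_status function.

(2) Running mysqladmin debug triggers it.

(3) Other situations, such as resource pressure or killing sessions, can also be involved (see also: https://dev.mysql.com/doc/refman/5.7/en/server-signal-response.html).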

Status information:

Current dir: /data/mysql/mysql3306/data/
Running threads: 7 Stack size: 262144
Current locks:
lock: 0x7fdcb0a44780:

lock: 0x7fdcaf0ea980:

lock: 0x1edb5a0:

..........

..........


Key caches:
default
Buffer_size: 8388608
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 9
not flushed: 0
w_requests: 0
writes: 0
r_requests: 82
reads: 13


handler status:
read_key: 16981474
read_next: 33963080
read_rnd 6
read_first: 192
write: 21270
delete 0
update: 16981221

Table status:
Opened tables: 956
Open tables: 206
Open files: 13
Open streams: 0

Memory status:
<malloc version="1">
<heap nr="0">
<sizes>
<unsorted from="140586808432240" to="140585778669336" total="0" count="140585778669312"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<system type="current" size="0"/>
<system type="max" size="0"/>
<aspace type="total" size="0"/>
<aspace type="mprotect" size="0"/>
</heap>
<total type="fast" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="mmap" count="0" size="0"/>
<system type="current" size="0"/>
<system type="max" size="0"/>
<aspace type="total" size="0"/>
<aspace type="mprotect" size="0"/>
</malloc>

Events status:
LLA = Last Locked At LUA = Last Unlocked At
WOC = Waiting On Condition DL = Data Locked

Event scheduler status:
State : INITIALIZED
Thread id : 0
LLA : n/a:0
LUA : n/a:0
WOC : NO
Workers : 0
Executed : 0
Data locked: NO

Event queue status:
Element count : 0
Data locked : NO
Attempting lock : NO
LLA : init_queue:96
LUA : init_queue:104
WOC : NO
Next activation : never
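To see how often these reports occur and how much space they take, an error log can be scanned for the "Status information:" header. The following is a minimal Python sketch. It assumes that a report runs from the header up to the next timestamped log line (or the next header, or the end of the file), and that ordinary entries begin with a `YYYY-MM-DD` timestamp; both are assumptions about the log layout that should be checked against the actual file. The name `find_status_reports` is hypothetical.

```python
import re
from typing import List, Tuple

STATUS_HEADER = "Status information:"
# Assumed shape of the timestamp that starts an ordinary error log entry.
TIMESTAMPED = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def find_status_reports(log_text: str) -> List[Tuple[int, int]]:
    """Return (start_line, size_in_bytes) for each status report found."""
    reports = []
    start = None  # line number of the report being read, if any
    size = 0
    for lineno, line in enumerate(log_text.splitlines(keepends=True), start=1):
        is_header = line.strip() == STATUS_HEADER
        # A new header or a timestamped entry closes the current report.
        if start is not None and (is_header or TIMESTAMPED.match(line)):
            reports.append((start, size))
            start = None
        if is_header:
            start, size = lineno, 0
        if start is not None:
            size += len(line.encode("utf-8"))
    if start is not None:
        reports.append((start, size))
    return reports
```

Summing the sizes gives a rough measure of how much of the error log consists of status reports, and the start lines make it easy to correlate them with pt-stalk collection times.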
