Reading the osquery Source Code: Analyzing shell_history
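The previous two articles covered how to use osquery; this one turns to its source code. It focuses on two tables, shell_history and process_open_sockets. Analyzing how these tables are implemented shows how osquery exposes system information through SQL queries, and it also deepens our understanding of Linux.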
Table overview
shell_history exposes the shell command history, while process_open_sockets records the host's current network activity. Example usage:
shell_history
osquery> select * from shell_history limit 3;
+------+------+--------------------+------------------------------+
| uid  | time | command            | history_file                 |
+------+------+--------------------+------------------------------+
| 1000 | 0    | pwd                | /home/username/.bash_history |
| 1000 | 0    | ps -ef             | /home/username/.bash_history |
| 1000 | 0    | ps -ef | grep java | /home/username/.bash_history |
+------+------+--------------------+------------------------------+
process_open_sockets
The row below shows a reverse shell connection.
osquery> select * from process_open_sockets order by pid desc limit 1;
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+
| pid    | fd | socket   | family | protocol | local_address | remote_address | local_port | remote_port | path | state      | net_namespace |
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+
| 115567 | 3  | 16467630 | 2      | 6        | 192.168.2.142 | 192.168.2.143  | 46368      | 8888        |      | ESTABLISH  | 0             |
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+
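The family and protocol columns in this row are plain integers. As a reading aid, the sketch below maps them back to the socket constants from the system headers, assuming (as is the case on Linux) that the table reports the AF_* and IPPROTO_* values unchanged. The helper names familyName and protocolName are hypothetical and are not part of osquery.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>

#include <string>

// Hypothetical helper: name the address family stored in the `family` column.
std::string familyName(int family) {
  switch (family) {
    case AF_UNIX:
      return "AF_UNIX";
    case AF_INET:
      return "AF_INET";
    case AF_INET6:
      return "AF_INET6";
    default:
      return "unknown";
  }
}

// Hypothetical helper: name the transport protocol stored in the `protocol` column.
std::string protocolName(int protocol) {
  switch (protocol) {
    case IPPROTO_TCP:
      return "IPPROTO_TCP";
    case IPPROTO_UDP:
      return "IPPROTO_UDP";
    default:
      return "unknown";
  }
}
```

With these helpers, the row above reads as an established IPv4 TCP connection from 192.168.2.142:46368 to 192.168.2.143:8888.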
The overall code structure of osquery is very clear. All table definitions live under specs, and all table implementations live under osquery/tables.
Taking shell_history as an example, its table definition is in specs/posix/shell_history.table.
table_name("shell_history")
description("A line-delimited (command) table of per-user .*_history data.")
schema([
    Column("uid", BIGINT, "Shell history owner", additional=True),
    Column("time", INTEGER, "Entry timestamp. It could be absent, default value is 0."),
    Column("command", TEXT, "Unparsed date/line/command history line"),
    Column("history_file", TEXT, "Path to the .*_history for this user"),
    ForeignKey(column="uid", table="users"),
])
attributes(user_data=True, no_pkey=True)
implementation("shell_history@genShellHistory")
examples([
    "select * from users join shell_history using (uid)",
])
fuzz_paths([
    "/home",
    "/Users",
])
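shell_history.table already defines everything relevant. The implementation line, shell_history@genShellHistory, names the entry point: the genShellHistory() function in shell_history.cpp. The file even provides an example SQL statement, select * from users join shell_history using (uid). shell_history.cpp is located at osquery/tables/system/posix/shell_history.cpp. Similarly, the table definition of process_open_sockets is in specs/process_open_sockets.table, and its implementation is in osquery/tables/networking/[linux|freebsd|windows]/process_open_sockets.cpp. Because process_open_sockets is available on several platforms, there is a separate process_open_sockets.cpp for each of linux, freebsd, and windows. This article uses linux as the example.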
Implementation of shell_history
Background
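Before the analysis, here are a few basic Linux concepts. We often encounter different Unix shells such as bash, zsh, tcsh, and sh. bash is the most common one today and ships with almost every Unix-like operating system, while zsh adds more features on top of what bash offers. Whenever we type commands in a terminal, we are using one of these shells.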
Running ls -al in a user's home directory reveals a .bash_history file, which records the commands we have entered in the terminal. Likewise, if we use zsh, a .zsh_history file records our commands.
The user's home directory may also contain a .bash_sessions directory. According to this article:
A new folder (~/.bash_sessions/) is used to store HISTFILE’s and .session files that are unique to sessions. If $BASH_SESSION or $TERM_SESSION_ID is set upon launching the shell (i.e. if Terminal is resuming from a saved state), the associated HISTFILE is merged into the current one, and the .session file is ran. Session saving is facilitated by means of an EXIT trap being set for a function bash_update_session_state.
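.bash_sessions stores the HISTFILEs and .session files that belong to specific sessions. If $BASH_SESSION or $TERM_SESSION_ID is set when the shell starts, the shell uses it to restore the previous state of that session. This also means that the .bash_sessions directory contains *.history files that record the command history of specific sessions.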
Analysis
We start with genShellHistory(), the entry function of shell_history.cpp:
QueryData genShellHistory(QueryContext& context) {
  QueryData results;

  // Iterate over each user
  QueryData users = usersFromContext(context);
  for (const auto& row : users) {
    auto uid = row.find("uid");
    auto gid = row.find("gid");
    auto dir = row.find("directory");
    if (uid != row.end() && gid != row.end() && dir != row.end()) {
      genShellHistoryForUser(uid->second, gid->second, dir->second, results);
      genShellHistoryFromBashSessions(uid->second, dir->second, results);
    }
  }

  return results;
}
The function iterates over all users and obtains each user's uid, gid, and directory. It then calls genShellHistoryForUser() to collect that user's shell history; genShellHistoryFromBashSessions() serves a similar purpose.
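To make the "uid, gid, directory" triple concrete, here is a minimal sketch that enumerates accounts with the POSIX getpwent() interface. This is only a stand-in for illustration: usersFromContext() is an osquery helper, and the UserInfo struct and listLocalUsers() function below are hypothetical names, not osquery code.

```cpp
#include <pwd.h>
#include <sys/types.h>

#include <string>
#include <vector>

// Hypothetical record holding the fields that genShellHistory() looks up.
struct UserInfo {
  std::string name;
  uid_t uid;
  gid_t gid;
  std::string directory;  // home directory, where history files are searched
};

// Enumerate the entries of the user database (typically /etc/passwd).
std::vector<UserInfo> listLocalUsers() {
  std::vector<UserInfo> users;
  setpwent();  // rewind to the first entry
  while (struct passwd* pw = getpwent()) {
    std::string name = pw->pw_name ? pw->pw_name : "";
    std::string dir = pw->pw_dir ? pw->pw_dir : "";
    users.push_back({name, pw->pw_uid, pw->pw_gid, dir});
  }
  endpwent();  // release the database handle
  return users;
}
```

Each returned entry corresponds to one iteration of the loop in genShellHistory(), with directory serving as the base path for the history files.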
genShellHistoryForUser():
void genShellHistoryForUser(const std::string& uid,
                            const std::string& gid,
                            const std::string& directory,
                            QueryData& results) {
  auto dropper = DropPrivileges::get();
  if (!dropper->dropTo(uid, gid)) {
    VLOG(1) << "Cannot drop privileges to UID " << uid;
    return;
  }

  for (const auto& hfile : kShellHistoryFiles) {
    boost::filesystem::path history_file = directory;
    history_file /= hfile;
    genShellHistoryFromFile(uid, history_file, results);
  }
}
Before doing any work, the function calls:
auto dropper = DropPrivileges::get();
if (!dropper->dropTo(uid, gid)) {
  VLOG(1) << "Cannot drop privileges to UID " << uid;
  return;
}
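This drops the process's privileges to the given uid and gid. Why is that necessary? I later asked about it online and received a very detailed answer: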
Think about a scenario where you are a malicious user and you spotted a vulnerability(buffer overflow) which none of us has. In the code (osquery which is running usually with root permission) you also know that history files(controlled by you) are being read by code(osquery). Now you stored a shell code (a code which is capable of destroying anything in the system)such a way that it would overwrite the saved rip. So once the function returns program control is with the injected code(shell code) with root privilege. With dropping privilege you reduce the chance of putting entire system into danger. There are other mitigation techniques (e.g. stack guard) to avoid above scenario but multiple defenses are required
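In short, osquery usually runs with root privileges. If the code that reads history files contained a memory-corruption bug, an attacker who planted malicious shellcode in .bash_history could gain root privileges once osquery read the file. Dropping privileges before reading the file greatly limits the damage such an attack could cause.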
/**
 * @brief The privilege/permissions dropper deconstructor will restore
 * effective permissions.
 *
 * There should only be a single drop of privilege/permission active.
 */
virtual ~DropPrivileges();
As this comment shows, when the DropPrivileges object is destructed, the process's original effective permissions are restored.
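The drop-then-restore pattern described above can be sketched as a small RAII guard built on the POSIX seteuid() and setegid() calls. This is not osquery's DropPrivileges implementation; the class name ScopedEffectiveIds is hypothetical, and a production version would also need to handle supplementary groups, which this sketch deliberately leaves out.

```cpp
#include <sys/types.h>
#include <unistd.h>

#include <cstdlib>

// Hypothetical RAII guard: lowers the effective UID/GID in the constructor
// and restores the previous values in the destructor.
class ScopedEffectiveIds {
 public:
  ScopedEffectiveIds(uid_t uid, gid_t gid)
      : saved_uid_(geteuid()), saved_gid_(getegid()), dropped_(false) {
    // Change the group first: once the effective UID is no longer root,
    // the process may not be allowed to change its group any more.
    if (setegid(gid) != 0) {
      return;
    }
    if (seteuid(uid) != 0) {
      // Roll back the partial drop; give up if even that fails.
      if (setegid(saved_gid_) != 0) {
        std::abort();
      }
      return;
    }
    dropped_ = true;
  }

  ~ScopedEffectiveIds() {
    if (!dropped_) {
      return;
    }
    // Restore in reverse order: regain the UID first, then the group.
    if (seteuid(saved_uid_) != 0 || setegid(saved_gid_) != 0) {
      std::abort();  // continuing with unknown privileges would be unsafe
    }
  }

  ScopedEffectiveIds(const ScopedEffectiveIds&) = delete;
  ScopedEffectiveIds& operator=(const ScopedEffectiveIds&) = delete;

  // True if both IDs were changed successfully.
  bool dropped() const { return dropped_; }

 private:
  uid_t saved_uid_;
  gid_t saved_gid_;
  bool dropped_;
};
```

Only the effective IDs change, so a root process keeps the ability to switch back. Tying the restore to the destructor means every return path out of the reading function, including early returns, leaves the process with its original permissions.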
Next, the function iterates over kShellHistoryFiles and calls genShellHistoryFromFile() for each entry. kShellHistoryFiles is defined earlier in the file:
const std::vector<std::string> kShellHistoryFiles = {
    ".bash_history",
    ".zsh_history",
    ".zhistory",
    ".history",
    ".sh_history",
};
kShellHistoryFiles simply lists the history file names used by common shells, not only bash. Finally, genShellHistoryFromFile() reads each history file and parses its contents.
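The loop over kShellHistoryFiles amounts to joining the user's home directory with each known file name. A minimal sketch using plain strings instead of boost::filesystem follows; the names kHistoryFileNames and candidateHistoryPaths are hypothetical and only mirror the original list for illustration.

```cpp
#include <string>
#include <vector>

// Same file names as kShellHistoryFiles, under a hypothetical name.
const std::vector<std::string> kHistoryFileNames = {
    ".bash_history",
    ".zsh_history",
    ".zhistory",
    ".history",
    ".sh_history",
};

// Build the list of history files to try for one home directory.
std::vector<std::string> candidateHistoryPaths(const std::string& home) {
  std::vector<std::string> paths;
  if (home.empty()) {
    return paths;  // no home directory, nothing to look for
  }
  for (const auto& name : kHistoryFileNames) {
    std::string path = home;
    if (path.back() != '/') {
      path += '/';  // add exactly one separator
    }
    path += name;
    paths.push_back(path);
  }
  return paths;
}
```

Files that do not exist are harmless in the original code as well: the read simply fails and no rows are produced for that path.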
void genShellHistoryFromFile(const std::string& uid,
                             const boost::filesystem::path& history_file,
                             QueryData& results) {
  std::string history_content;
  if (forensicReadFile(history_file, history_content).ok()) {
    auto bash_timestamp_rx = xp::sregex::compile("^#(?P<timestamp>[0-9]+)$");
    auto zsh_timestamp_rx = xp::sregex::compile(
        "^: {0,10}(?P<timestamp>[0-9]{1,11}):[0-9]+;(?P<command>.*)$");

    std::string prev_bash_timestamp;
    for (const auto& line : split(history_content, "\n")) {
      xp::smatch bash_timestamp_matches;
      xp::smatch zsh_timestamp_matches;

      if (prev_bash_timestamp.empty() &&
          xp::regex_search(line, bash_timestamp_matches, bash_timestamp_rx)) {
        prev_bash_timestamp = bash_timestamp_matches["timestamp"];
        continue;
      }

      Row r;

      if (!prev_bash_timestamp.empty()) {
        r["time"] = INTEGER(prev_bash_timestamp);
        r["command"] = line;
        prev_bash_timestamp.clear();
      } else if (xp::regex_search(
                     line, zsh_timestamp_matches, zsh_timestamp_rx)) {
        std::string timestamp = zsh_timestamp_matches["timestamp"];
        r["time"] = INTEGER(timestamp);
        r["command"] = zsh_timestamp_matches["command"];
      } else {
        r["time"] = INTEGER(0);
        r["command"] = line;
      }

      r["uid"] = uid;
      r["history_file"] = history_file.string();
      results.push_back(r);
    }
  }
}
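The logic of the code is straightforward.
- forensicReadFile(history_file, history_content) reads the file contents.
- bash_timestamp_rx and zsh_timestamp_rx are regular expressions used to parse the contents of the corresponding history files.
- for (const auto& line : split(history_content, "\n")) walks through the file line by line and matches each line against bash_timestamp_rx and zsh_timestamp_rx. A bash timestamp line (# followed by digits) is remembered and attached to the command on the next line; a zsh line carries its timestamp and command together; any other line gets a time of 0.
- Row r; ...; r["history_file"] = history_file.string(); results.push_back(r); writes the parsed fields into a Row and appends it to the results.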
This completes the parsing for shell_history. Running select * from shell_history follows the flow above and returns all recorded history commands.
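The per-line decision logic can be tried in isolation with the standard library. The sketch below is a simplified re-implementation rather than osquery's code: it uses std::regex with numbered groups instead of Boost.Xpressive named groups, splits on newlines with std::getline, skips empty lines (an assumption about how split() behaves), and maps an out-of-range timestamp to 0. HistoryEntry, toTime, and parseHistory are hypothetical names.

```cpp
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: one parsed history line.
struct HistoryEntry {
  long long time;       // 0 when the file carries no timestamp
  std::string command;
};

// Convert a digit string to a number; overly long values become 0.
long long toTime(const std::string& digits) {
  try {
    return std::stoll(digits);
  } catch (const std::out_of_range&) {
    return 0;
  }
}

std::vector<HistoryEntry> parseHistory(const std::string& content) {
  // bash with HISTTIMEFORMAT set: "#<epoch>" on the line before the command.
  static const std::regex bash_rx("^#([0-9]+)$");
  // zsh extended history: ": <epoch>:<duration>;<command>".
  static const std::regex zsh_rx("^: {0,10}([0-9]{1,11}):[0-9]+;(.*)$");

  std::vector<HistoryEntry> entries;
  std::string pending_bash_time;  // timestamp waiting for its command line
  std::istringstream lines(content);
  std::string line;
  while (std::getline(lines, line)) {
    if (line.empty()) {
      continue;  // assumption: empty lines produce no rows
    }
    std::smatch m;
    if (pending_bash_time.empty() && std::regex_match(line, m, bash_rx)) {
      pending_bash_time = m[1].str();
      continue;  // the command follows on the next line
    }
    HistoryEntry entry{0, line};
    if (!pending_bash_time.empty()) {
      entry.time = toTime(pending_bash_time);
      pending_bash_time.clear();
    } else if (std::regex_match(line, m, zsh_rx)) {
      entry.time = toTime(m[1].str());
      entry.command = m[2].str();
    }
    entries.push_back(entry);
  }
  return entries;
}
```

For example, the input "#1546300800\nls -la\n: 1546300900:0;cd /tmp\npwd\n" yields three entries: ls -la with time 1546300800, cd /tmp with time 1546300900, and pwd with time 0. The last case matches the sample query output earlier in the article, where every row has a time of 0 because the history file stores no timestamps.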
As for the genShellHistoryFromBashSessions() function:
void genShellHistoryFromBashSessions(const std::string& uid,
                                     const std::string& directory,
                                     QueryData& results) {
  boost::filesystem::path bash_sessions = directory;
  bash_sessions /= ".bash_sessions";

  if (pathExists(bash_sessions)) {
    bash_sessions /= "*.history";
    std::vector<std::string> session_hist_files;
    resolveFilePattern(bash_sessions, session_hist_files);

    for (const auto& hfile : session_hist_files) {
      boost::filesystem::path history_file = hfile;
      genShellHistoryFromFile(uid, history_file, results);
    }
  }
}
genShellHistoryFromBashSessions() obtains the history commands in a simple way:
- It resolves all files matching .bash_sessions/*.history.
- It calls genShellHistoryFromFile(uid, history_file, results) on each of them.
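The pattern-resolution step can be approximated with the POSIX glob() function. This sketch stands in for osquery's resolveFilePattern(), whose internals are not shown in this article; the function name sessionHistoryFiles is hypothetical.

```cpp
#include <glob.h>

#include <cstddef>
#include <string>
#include <vector>

// Return every file matching <home>/.bash_sessions/*.history, sorted by name.
// Note: glob metacharacters inside `home` itself would also be expanded.
std::vector<std::string> sessionHistoryFiles(const std::string& home) {
  std::vector<std::string> files;
  const std::string pattern = home + "/.bash_sessions/*.history";

  glob_t matches{};  // zero-initialized so globfree() is always safe
  if (glob(pattern.c_str(), 0, nullptr, &matches) == 0) {
    for (std::size_t i = 0; i < matches.gl_pathc; ++i) {
      files.emplace_back(matches.gl_pathv[i]);
    }
  }
  globfree(&matches);
  return files;
}
```

A missing .bash_sessions directory simply produces an empty list, which plays the same role as the pathExists() check in the original function.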
Summary
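Reading the code of well-written open source software teaches not only the relevant technical knowledge but also some of the design philosophy behind it.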
A white hat who learns quickly cannot afford to have short boards, only plenty of standard boards and a few long ones.
That's all.