
Selected Notes from the Official Memcached Documentation

Memcached's name can be read as "Memory Cache", and caching is exactly what it does. Why not call it a cache database? Because it is very lightweight and stores only plain key-value pairs. Compared with Redis, which offers far more (distribution, persistence, a rich set of data structures), Memcached is simpler: at its core it is just an LRU cache.

Hardware

1. If the machine has 4 GB of RAM and your server app uses 2 GB, allocating about 1.5 GB to Memcached is enough.
2. Do not deploy Memcached alongside your database; give the database the larger share of memory instead.
3. Deploy it on a dedicated machine where possible, e.g. one with 64 GB of RAM. That way you can grow cache capacity by adding memory rather than adding many servers (in other words, try not to co-locate it with your web application).
Algorithm

Memcached clients use consistent hashing, a scheme that distributes keys more stably when servers are added or removed. With an ordinary hash, changing the number of servers remaps many keys to different servers, causing a flood of cache misses. Consistent hashing describes a way of mapping keys onto a list of servers such that adding or removing a server changes only a small fraction of the key-to-server assignments. With an ordinary hash function, adding an eleventh server could suddenly point more than 40% of the keys at different servers; with consistent hashing, adding an eleventh server remaps fewer than 10% of the keys. In practice the numbers vary a little.
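The behaviour described above is easy to demonstrate. Below is a minimal consistent-hash ring sketch in Python; the server addresses, virtual-node count, and helper names are illustrative, not part of any real client library:

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    # First 8 hex digits of MD5 give a well-spread 32-bit ring position.
    return int(hashlib.md5(value.encode()).hexdigest()[:8], 16)

class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, servers, vnodes=100):
        points = []
        for server in servers:
            for i in range(vnodes):
                points.append((ring_hash("%s#%d" % (server, i)), server))
        points.sort()
        self._points = [p for p, _ in points]
        self._servers = [s for _, s in points]

    def get_server(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around at the end of the ring.
        idx = bisect.bisect_left(self._points, ring_hash(key)) % len(self._points)
        return self._servers[idx]

servers = ["10.0.0.%d:11211" % i for i in range(10)]
old_ring = HashRing(servers)
new_ring = HashRing(servers + ["10.0.0.10:11211"])  # add an 11th server

keys = ["user:%d" % n for n in range(10000)]
moved = sum(old_ring.get_server(k) != new_ring.get_server(k) for k in keys)
print("remapped: %.1f%%" % (100.0 * moved / len(keys)))
```

With 10 servers and 100 virtual nodes each, adding an 11th typically remaps on the order of 1/11 ≈ 9% of keys, whereas a naive `hash(key) % n` scheme would move roughly 90% of them.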

The memcached wiki even includes a story-style tutorial to show off what memcached can do:
https://github.com/memcached/memcached/wiki/TutorialCachingStory

Commands

Storage commands

set

The most commonly used command. Stores this data, possibly overwriting any existing data. New items are at the top of the LRU.

add

Stores this data, but only if it does not already exist. New items are at the top of the LRU. If the item already exists and the add fails, it is promoted to the top of the LRU.

replace

Stores this data, but only if the data already exists. Almost never used; it exists for protocol completeness (set, add, replace, etc.).
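The three storage commands differ only in their precondition. A telnet-style transcript over the ASCII protocol illustrates this; the fields after the key are flags, expiration in seconds, and byte count, and server responses appear on their own lines:

```
set greeting 0 300 5
hello
STORED
add greeting 0 300 5
hello
NOT_STORED
replace greeting 0 300 3
hey
STORED
```

add fails because the key already exists after the set; replace succeeds for the same reason.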

append

Adds this data after the last byte of an existing item. Does not allow the item to exceed the size limit. Useful for managing lists.

prepend

Same as append, but adds the new data before the existing data.

cas

Check And Set (or Compare And Swap). Stores the data, but only if no one else has updated it since you last read it. Useful for resolving race conditions when updating cached data.

Retrieval commands

get

The command for retrieving data. Takes one or more keys and returns all items it finds.

gets

A get variant used together with cas. Returns the item's CAS identifier (a unique 64-bit number); pass this value back with the cas command. If the item's CAS value has changed since you fetched it, the new value will not be stored.
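A gets/cas round trip over the ASCII protocol looks like the transcript below; the CAS token (12 here) is assigned by the server and will differ in practice:

```
gets counter
VALUE counter 0 1 12
1
END
cas counter 0 0 1 12
2
STORED
cas counter 0 0 1 12
3
EXISTS
```

The second cas fails with EXISTS because the first one already bumped the item's CAS value.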

Deletion

delete

Removes the item from the cache, if it exists.

Increment/Decrement

incr/decr

Increment and decrement. Values are treated as unsigned integers, so they cannot go negative; incr/decr fail if the value does not already exist.
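A short ASCII-protocol transcript; note that decr clamps at zero rather than going negative, and both commands return NOT_FOUND for a missing key:

```
set visits 0 0 2
10
STORED
incr visits 5
15
decr visits 20
0
incr missing 1
NOT_FOUND
```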

Touch

touch

Updates the expiration time of an existing item without fetching it.

touch <key> <exptime> [noreply]

Get And Touch

gat/gats

Updates the expiration time of an existing item and fetches it at the same time.

gat <exptime> <key>*\r\n
gats <exptime> <key>*\r\n

Slabs Reassign

slabs reassign

Reassigns memory between slab classes at runtime.

slabs reassign <source class> <dest class>\r\n

slabs automove

Lets a background thread decide whether to move slab pages (i.e. reassign memory) on its own.

slabs automove <0|1>

Miscellaneous

An example from the official wiki, which recommends it as a good pattern:

```
# Don't load little bobby tables
sql = "SELECT * FROM user WHERE user_id = ?"
key = 'SQL:' . user_id . ':' . md5sum(sql)
# We check if the value is 'defined', since '0' or 'FALSE'
# can be legitimate values!
if (defined result = memcli:get(key)) {
	return result
} else {
	handler = run_sql(sql, user_id)
	# Often what you get back when executing SQL is a special handler
	# object. You can't directly cache this. Stick to strings, arrays,
	# and hashes/dictionaries/tables
	rows_array = handler:turn_into_an_array
	# Cache it for five minutes
	memcli:set(key, rows_array, 5 * 60)
	return rows_array
}
```
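The same pattern is runnable as plain Python. The sketch below swaps in a dict-backed stand-in for the memcached client and a hypothetical `run_sql` helper, so it works without a server; neither is a real driver or client API:

```python
import hashlib

class DictClient:
    """Dict-backed stand-in for a memcached client (expiry is ignored)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, exptime):
        self.store[key] = value

def run_sql(sql, user_id):
    # Hypothetical database call; stands in for a real driver.
    return [{"user_id": user_id, "name": "user-%d" % user_id}]

memcli = DictClient()

def get_user(user_id):
    sql = "SELECT * FROM user WHERE user_id = ?"
    # Hash the query so the key stays short; namespace it with the user id.
    key = "SQL:%d:%s" % (user_id, hashlib.md5(sql.encode()).hexdigest())
    result = memcli.get(key)
    if result is not None:   # test against None: '' or 0 are legitimate values
        return result
    rows = run_sql(sql, user_id)
    memcli.set(key, rows, 5 * 60)  # cache for five minutes
    return rows
```

The `is not None` check mirrors the pseudocode's `defined` test: a falsy cached value such as `0` or `""` must still count as a cache hit.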

Suggested key scheme:

```
key = 'SQL' . query_id . ':' . md5sum("SELECT blah blah blah")
```

Key Usage
Thinking about your keys can save you a lot of time and memory. Memcached is a hash, but it also remembers the full key internally. The longer your keys are, the more bytes memcached has to hash to look up your value, and the more memory it wastes storing a full copy of your key.
On the other hand, it should be easy to figure out exactly where in your code a key came from. Otherwise many laborious hours of debugging await you.
Avoid User Input
It’s very easy to compromise memcached if you use arbitrary user input for keys. The ASCII protocol uses spaces and newlines. Ensure that neither shows up in your keys, live long and prosper. The binary protocol does not have this issue.
Short Keys
64-bit UIDs are clever ways to identify a user, but suck when printed out: 18446744073709551616. 20 characters! Using base64 encoding, or even just hexadecimal, you can cut that down by quite a bit.
With the binary protocol, it’s possible to store anything, so you can directly pack 4 bytes into the key. This makes it impossible to read back via the ASCII protocol, and you should have tools available to simply determine what a key is.
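The size difference is easy to verify in Python; the UID value below is just the 64-bit maximum, chosen for illustration:

```python
import base64
import struct

uid = 2**64 - 1                  # the largest 64-bit UID
decimal_key = str(uid)           # '18446744073709551615': 20 characters
hex_key = format(uid, "x")       # 'ffffffffffffffff': 16 characters
packed = struct.pack(">Q", uid)  # raw 8 bytes (binary protocol only)
b64_key = base64.b64encode(packed).rstrip(b"=").decode()  # 11 characters

print(len(decimal_key), len(hex_key), len(packed), len(b64_key))  # -> 20 16 8 11
```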
Informative Keys
```
key = 'SQL' . md5sum("SELECT blah blah blah")
```
… might be clever, but if you’re looking at this key via tcpdump, strace, etc. You won’t have any clue where it’s coming from.
In this particular example, you may put your SQL queries into an outside file with the md5sum next to them. Or, more simply, appending a unique query ID into the key.
```
key = 'SQL' . query_id . ':' . md5sum("SELECT blah blah blah")
```

Caching sessions in memcached is not recommended. Memcached exists mainly to cut database I/O; if a machine goes down, its cached data is gone and users will notice. Sessions are better kept in a database or in Redis.

All memcached operations are atomic internally.

Performance

Expected throughput

On a fast machine with very high speed networking, memcached can easily handle 200,000+ requests per second. With heavy tuning or even faster hardware it can go many times that. Hitting it a few hundred times per second, even on a slow machine, usually isn’t cause for concern.

Responses are fast; even after accounting for network jitter and OS/CPU scheduling delays, few responses take more than 1-2 milliseconds:

On a good day memcached can serve requests in less than a millisecond. After accounting for outliers due to OS jitter, CPU scheduling, or network jitter, very few commands should take more than a millisecond or two to complete.

In theory there is no limit on the number of connected clients:

Since memcached uses an event based architecture, a high number of clients will not generally slow it down. Users have hundreds of thousands of connected clients, working just fine.

But hardware does impose limits:

Each connected client uses some TCP memory. You can only connect as many clients as you have spare RAM

If you have many connections, consider persistent TCP connections or UDP.
TCP tuning can also help; it involves OS and NIC parameters:

High connection churn requires OS tuning. You will run out of local ports, TIME_WAIT buckets, and similar. Do research on how to properly tune the TCP stack for your OS.

Maximum cluster size

From the client's point of view, the client computes the server hash table once at startup, not on every request.
Without persistent TCP connections, a large number of clients each opening connections to every server wastes RAM (plus the cost of all those three-way handshakes), and the more servers there are, the worse it gets.

Timeouts

First check listen_disabled_num: when the connection count reaches the maximum (maxconns), new connections are delayed and this counter increases.
Check whether the OS is swapping cached data out to disk.
Check CPU utilization.
Strongly prefer 64-bit machines; a 32-bit process can only address 4 GB of memory.
Use the tool below to check each instance:

http://www.memcached.org/files/mc_conn_tester.pl
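Assuming a server on localhost:11211 and the `nc` utility, the relevant counters can also be pulled from the stats output like this (the values shown are illustrative):

```
$ printf 'stats\r\nquit\r\n' | nc localhost 11211 | grep -E 'max_connections|listen_disabled_num'
STAT max_connections 1024
STAT listen_disabled_num 0
```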

Avoid putting a firewall in front of memcached if you can.