HBase in Practice: Fetching Data with a Scanner
阿新 • Published: 2018-11-17
1. Java API overview
1.1 getScanner()
The getScanner method has three overloads, listed below:
getScanner(Scan scan)
/**
 * Returns a scanner on the current table as specified by the {@link Scan}
 * object.
 *
 * Note that the passed {@link Scan}'s start row and caching properties
 * maybe changed.
 * [Author's note: what does this mean? Presumably the client may modify the
 * Scan object passed in, so its state should not be relied on after the call.]
 * @param scan A configured {@link Scan} object.
 * @return A scanner.
 * @throws IOException if a remote or network exception occurs.
 * @since 0.20.0
 */
ResultScanner getScanner(Scan scan) throws IOException;
getScanner(byte[] family)
/**
 * Gets a scanner on the current table for the given family.
 * @param family The column family to scan.
 * @return A scanner.
 * @throws IOException if a remote or network exception occurs.
 * @since 0.20.0
 */
ResultScanner getScanner(byte[] family) throws IOException;
getScanner(byte[] family, byte[] qualifier)
/**
 * Gets a scanner on the current table for the given family and qualifier.
 *
 * @param family The column family to scan.
 * @param qualifier The column qualifier to scan.
 * @return A scanner.
 * @throws IOException if a remote or network exception occurs.
 * @since 0.20.0
 */
ResultScanner getScanner(byte[] family, byte[] qualifier) throws IOException;
2. Hands-on code
2.1 Testing each of the APIs above. Before running the tests, here is the data in the tsdb-uid table:
\x00 column=id:metrics, timestamp=1541500656882, value=\x00\x00\x00\x00\x00\x00\x00\x05
\x00 column=id:tagk, timestamp=1535982247222, value=\x00\x00\x00\x00\x00\x00\x00\x03
\x00 column=id:tagv, timestamp=1541425665699, value=\x00\x00\x00\x00\x00\x00\x00\x08
\x00\x00\x01 column=name:metrics, timestamp=1531479245132, value=mytest.cpu
\x00\x00\x01 column=name:tagk, timestamp=1531479245162, value=host
\x00\x00\x01 column=name:tagv, timestamp=1531479245189, value=server4
\x00\x00\x02 column=name:metrics, timestamp=1535891521172, value=metric-t
\x00\x00\x02 column=name:tagk, timestamp=1535891521198, value=chl
\x00\x00\x02 column=name:tagv, timestamp=1531479264404, value=server5
\x00\x00\x03 column=name:metrics, timestamp=1535982247205, value=csdn
\x00\x00\x03 column=name:tagk, timestamp=1535982247230, value=accessNumber
\x00\x00\x03 column=name:tagv, timestamp=1531485413194, value=s485276
\x00\x00\x04 column=name:metrics, timestamp=1541426336083, value=test
\x00\x00\x04 column=name:tagv, timestamp=1535891521217, value=hqdApp
\x00\x00\x05 column=name:metrics, timestamp=1541500656917, value=test_meta
\x00\x00\x05 column=name:tagv, timestamp=1535982247253, value=cs
\x00\x00\x06 column=name:tagv, timestamp=1537103490275, value=Firminal
\x00\x00\x07 column=name:tagv, timestamp=1541425665353, value=lawson
\x00\x00\x08 column=name:tagv, timestamp=1541425665725, value=firminal
Firminal column=id:tagv, timestamp=1537103490289, value=\x00\x00\x06
accessNumber column=id:tagk, timestamp=1535982247235, value=\x00\x00\x03
chl column=id:tagk, timestamp=1535891521203, value=\x00\x00\x02
cs column=id:tagv, timestamp=1535982247259, value=\x00\x00\x05
csdn column=id:metrics, timestamp=1535982247213, value=\x00\x00\x03
firminal column=id:tagv, timestamp=1541425665756, value=\x00\x00\x08
host column=id:tagk, timestamp=1531479245177, value=\x00\x00\x01
hqdApp column=id:tagv, timestamp=1535891521224, value=\x00\x00\x04
lawson column=id:tagv, timestamp=1541425665366, value=\x00\x00\x07
metric-t column=id:metrics, timestamp=1535891521182, value=\x00\x00\x02
mytest.cpu column=id:metrics, timestamp=1531479245145, value=\x00\x00\x01
s485276 column=id:tagv, timestamp=1531485413204, value=\x00\x00\x03
server4 column=id:tagv, timestamp=1531479245192, value=\x00\x00\x01
server5 column=id:tagv, timestamp=1531479264407, value=\x00\x00\x02
test column=id:metrics, timestamp=1541426336086, value=\x00\x00\x04
test_meta column=id:metrics, timestamp=1541500656927, value=\x00\x00\x05
25 row(s) in 0.7650 seconds
- Using columnFamily as the parameter
public static void getRowByScan(String tableName, String columnFamily) {
    // try-with-resources closes the scanner and the table when the loop finishes
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(Bytes.toBytes(columnFamily))) {
        // Each Result holds the cells of one row within the given column family
        for (Result res : resultScanner) {
            System.out.println(res);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The output is as follows:
keyvalues={\x00\x00\x01/name:metrics/1531479245132/Put/vlen=10/seqid=0, \x00\x00\x01/name:tagk/1531479245162/Put/vlen=4/seqid=0, \x00\x00\x01/name:tagv/1531479245189/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x02/name:metrics/1535891521172/Put/vlen=8/seqid=0, \x00\x00\x02/name:tagk/1535891521198/Put/vlen=3/seqid=0, \x00\x00\x02/name:tagv/1531479264404/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x03/name:metrics/1535982247205/Put/vlen=4/seqid=0, \x00\x00\x03/name:tagk/1535982247230/Put/vlen=12/seqid=0, \x00\x00\x03/name:tagv/1531485413194/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x04/name:metrics/1541426336083/Put/vlen=4/seqid=0, \x00\x00\x04/name:tagv/1535891521217/Put/vlen=6/seqid=0}
keyvalues={\x00\x00\x05/name:metrics/1541500656917/Put/vlen=9/seqid=0, \x00\x00\x05/name:tagv/1535982247253/Put/vlen=2/seqid=0}
keyvalues={\x00\x00\x06/name:tagv/1537103490275/Put/vlen=8/seqid=0}
keyvalues={\x00\x00\x07/name:tagv/1541425665353/Put/vlen=6/seqid=0}
keyvalues={\x00\x00\x08/name:tagv/1541425665725/Put/vlen=8/seqid=0}
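As you can see, each res in the code prints as one keyvalues={...} set, which holds all the cells of a single row in the name column family. Rows differ in how many cells they contain, and 8 rows have data in this family, so 8 results come back in total.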
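To make the grouping concrete, here is a minimal sketch in plain Java with no HBase dependency. It only models the idea that a family scan returns one result per row, however many cells that row has in the family. The Cell class and the scanFamily method are hypothetical names invented for this illustration, not part of the HBase API, and the row keys "001", "004" and "006" are simplified stand-ins for the real byte-array keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FamilyScanSketch {
    // Hypothetical, simplified cell: just a row key, a family and a qualifier
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;

        Cell(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    // Keeps only the cells of one family and groups them by row key,
    // so each map entry plays the role of one Result
    static Map<String, List<String>> scanFamily(List<Cell> cells, String family) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (Cell c : cells) {
            if (c.family.equals(family)) {
                rows.computeIfAbsent(c.row, r -> new ArrayList<>())
                    .add(c.family + ":" + c.qualifier);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("001", "name", "metrics"),
            new Cell("001", "name", "tagk"),
            new Cell("001", "name", "tagv"),
            new Cell("004", "name", "metrics"),
            new Cell("004", "name", "tagv"),
            new Cell("006", "name", "tagv"),
            new Cell("host", "id", "tagk"));
        // One line per row, regardless of how many cells the row has in the family
        scanFamily(cells, "name")
            .forEach((row, cols) -> System.out.println(row + " -> " + cols));
    }
}
```

With this sample data the "name" family yields three entries (rows 001, 004 and 006), while the "host" row is skipped because its only cell belongs to the id family.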
- Using Scan as the parameter
public static void getRowByScan(String tableName) {
    // Start at row "server4" (inclusive) and scan to the end of the table
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes("server4"));
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(scan)) {
        for (Result res : resultScanner) {
            System.out.println(res);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The output is as follows:
keyvalues={server4/id:tagv/1531479245192/Put/vlen=3/seqid=0}
keyvalues={server5/id:tagv/1531479264407/Put/vlen=3/seqid=0}
keyvalues={test/id:metrics/1541426336086/Put/vlen=3/seqid=0}
keyvalues={test_meta/id:metrics/1541500656927/Put/vlen=3/seqid=0}
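Only four rows come back because HBase keeps row keys sorted in byte-wise lexicographic order, and a scan with a start row returns that row and everything after it. The sketch below reproduces this ordering in plain Java, without HBase. StartRowSketch, ROW_ORDER and scanFrom are hypothetical names used for illustration only, and the key list is a subset of the table shown earlier.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class StartRowSketch {
    // Unsigned, byte-by-byte lexicographic comparison of row keys
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // A key that is a prefix of another sorts first
        return Integer.compare(a.length, b.length);
    };

    // Returns every row key >= startRow in sorted order (start row inclusive)
    static List<String> scanFrom(List<String> rowKeys, String startRow) {
        TreeSet<byte[]> sorted = new TreeSet<>(ROW_ORDER);
        for (String key : rowKeys) {
            sorted.add(key.getBytes(StandardCharsets.ISO_8859_1));
        }
        List<String> out = new ArrayList<>();
        byte[] start = startRow.getBytes(StandardCharsets.ISO_8859_1);
        for (byte[] key : sorted.tailSet(start, true)) {
            out.add(new String(key, StandardCharsets.ISO_8859_1));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "\0\0\1", "Firminal", "accessNumber", "host", "mytest.cpu",
            "s485276", "server4", "server5", "test", "test_meta");
        System.out.println(scanFrom(keys, "server4"));
        // prints [server4, server5, test, test_meta]
    }
}
```

Note that "s485276" is excluded even though it also starts with "s": its second byte '4' sorts before the 'e' in "server4".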
- Using columnFamily and qualifier as the parameters
public static void getRowByScanThree(String tableName, String family, String qualifier) {
    // Scan a single column (family:qualifier); the resources are closed automatically
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner =
                 table.getScanner(Bytes.toBytes(family), Bytes.toBytes(qualifier))) {
        for (Result res : resultScanner) {
            System.out.println(res);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The output is as follows:
keyvalues={\x00\x00\x01/name:metrics/1531479245132/Put/vlen=10/seqid=0}
keyvalues={\x00\x00\x02/name:metrics/1535891521172/Put/vlen=8/seqid=0}
keyvalues={\x00\x00\x03/name:metrics/1535982247205/Put/vlen=4/seqid=0}
keyvalues={\x00\x00\x04/name:metrics/1541426336083/Put/vlen=4/seqid=0}
keyvalues={\x00\x00\x05/name:metrics/1541500656917/Put/vlen=9/seqid=0}
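2.2 Printing the fields of a KeyValue
The output above prints each row as a whole, as one set of KeyValue objects. How do we pull individual fields out of a KeyValue, for example the rowKey, the value, and the timestamp? The code is as follows: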
public static void getRowValue(String tableName, String family, String qualifier) {
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner =
                 table.getScanner(Bytes.toBytes(family), Bytes.toBytes(qualifier))) {
        for (Result res : resultScanner) {
            // raw(), getRow() and getValue() are the older KeyValue-based accessors;
            // newer HBase clients deprecate them in favor of rawCells() and CellUtil
            for (KeyValue kv : res.raw()) {
                byte[] rowKey = kv.getRow();
                System.out.print("rowKey: ");
                // The row key is raw bytes, so print each byte as a number
                for (byte b : rowKey) {
                    System.out.print(b);
                }
                System.out.println(", value: " + Bytes.toString(kv.getValue())
                        + ", timestamp: " + kv.getTimestamp());
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The output is as follows:
rowKey: 001, value: mytest.cpu, timestamp: 1531479245132
rowKey: 002, value: metric-t, timestamp: 1535891521172
rowKey: 003, value: csdn, timestamp: 1535982247205
rowKey: 004, value: test, timestamp: 1541426336083
rowKey: 005, value: test_meta, timestamp: 1541500656917
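Because the rowKey in the tsdb-uid table is a raw byte array rather than printable text, it cannot be meaningfully converted to a String directly, so the code above prints the rowKey byte by byte in a for loop.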
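Printing each byte as a decimal number works, but the result ("001") no longer resembles the \x00\x00\x01 form seen in the table listing. HBase's Bytes class offers toStringBinary for this purpose. The plain-Java sketch below is a stand-in that shows the idea, not that method's implementation. RowKeyFormatSketch, toEscaped and toIsoUtc are hypothetical names, and the timestamp helper assumes the cell timestamps are milliseconds since the Unix epoch, which is what the values above look like.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;

public class RowKeyFormatSketch {
    // Printable ASCII bytes are shown as characters; every other byte as \xNN
    static String toEscaped(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xff; // treat the byte as unsigned
            if (v >= 0x20 && v <= 0x7e && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    // Assumption: the timestamp is milliseconds since the Unix epoch
    static String toIsoUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // Binary row key from the name family
        System.out.println(toEscaped(new byte[] {0, 0, 1}));
        // prints \x00\x00\x01

        // Textual row key from the id family
        System.out.println(toEscaped("server4".getBytes(StandardCharsets.US_ASCII)));
        // prints server4

        // Timestamp of the mytest.cpu cell
        System.out.println(toIsoUtc(1531479245132L));
        // prints 2018-07-13T10:54:05.132Z
    }
}
```

A formatter like this handles both kinds of row key in tsdb-uid: the binary IDs and the plain-text names.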