Massive-data testing: quickly constructing test data by querying and copying inside the database
阿新 • Published: 2018-12-23
This is another problem OneCoder ran into during data testing. The solution may not be widely applicable, but it might solve your problem too.
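In massive-data testing, importing the data is usually a very time-consuming step. OneCoder is facing an import of roughly 2 TB with limited time, and it has been a real headache. The original plan was to import pre-generated data files into HBase, which involves two slow steps: putting the files onto HDFS and importing them into HBase. In yesterday's test, importing 5 GB into HBase took a full 20 minutes. Extrapolated to the full data set, that is unbearable.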
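To make that extrapolation concrete, here is a rough back-of-the-envelope sketch. It assumes import time scales linearly with data volume and treats 2 TB as 2048 GB; both are simplifying assumptions, and the class and method names are invented for illustration.

```java
/** Rough linear extrapolation of import time; names are illustrative only. */
final class ImportEstimate {

    /**
     * Estimates the minutes needed to import totalGb, assuming the rate
     * observed for a sample (sampleGb imported in sampleMinutes) stays constant.
     */
    static double estimateMinutes(double totalGb, double sampleGb, double sampleMinutes) {
        return totalGb / sampleGb * sampleMinutes;
    }

    public static void main(String[] args) {
        // 2 TB is taken as 2048 GB here (an assumption); 5 GB took 20 minutes in the test.
        double minutes = estimateMinutes(2048, 5, 20);
        System.out.printf("about %.0f minutes, or %.1f days%n", minutes, minutes / (60 * 24));
    }
}
```

Under these assumptions the import alone would run for more than five days, and that is before counting the time to put the files onto HDFS.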
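While discussing this with others, I learned of an alternative: import only part of the data first, query that data back out of the database, modify a field such as the timestamp, and merge the result back into the table. This way, one day of data can be grown into 2, 4, 8, 16 days of data, which is far more efficient.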
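The doubling idea can be sketched as follows: in each round, every existing row is copied with its timestamp shifted by the number of days the table already covers, so the shift doubles from round to round. This is only an illustration of the arithmetic; `DoublingPlan` and its methods are hypothetical names, and it assumes timestamps are stored as epoch milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates the timestamp shifts used when doubling one day of data. */
final class DoublingPlan {

    static final long ONE_DAY_MS = 24L * 60 * 60 * 1000;

    /** Returns the shift in milliseconds applied in each round: 1 day, 2 days, 4 days, ... */
    static List<Long> shiftsMillis(int rounds) {
        List<Long> shifts = new ArrayList<>();
        long days = 1;
        for (int i = 0; i < rounds; i++) {
            shifts.add(days * ONE_DAY_MS);
            days *= 2; // the table now covers twice as many days
        }
        return shifts;
    }

    /** Days of data in the table after the given number of rounds, starting from one day. */
    static long daysCovered(int rounds) {
        return 1L << rounds;
    }

    public static void main(String[] args) {
        for (int rounds = 1; rounds <= 4; rounds++) {
            System.out.println(rounds + " round(s): " + daysCovered(rounds) + " days of data");
        }
    }
}
```

Because each round copies the whole table, the number of rounds grows only logarithmically with the target volume.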
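Of course, if your tests rely on real business data, this approach is not for you. If your data is synthetic, it is worth a try. OneCoder wrote an automated program against MySQL as a reference. Real massive-data setups are more likely built on Hive; in that case you should only need to change the connection string and perhaps a few SQL statements.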
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Doubles the data in a MySQL table by copying it with shifted timestamps.
 *
 * @author OneCoder(OneCoder)
 * @date 2012-11-27 3:28:16 PM
 * @Blog http://www.coderli.com
 */
public class MySQLDataCopy {

    // Number of doubling rounds; each round doubles the row count of metric.
    private static final int COPY_COUNT = 9;
    private static final long ONE_DAY_MILLISECOND = 3600 * 24 * 1000L;

    // metric_t1 is a snapshot of the current content of metric.
    private static final String SQL_CREATE_TABLE_ONE = "CREATE TABLE metric_t1 AS SELECT * FROM metric;";
    // metric_t2 is created from a single row just to copy the structure; the row is deleted right after.
    private static final String SQL_CREATE_TABLE_TWO = "CREATE TABLE metric_t2 AS SELECT * FROM metric LIMIT 1;";
    private static final String SQL_DELETE_DATA_TABLE_TWO = "DELETE FROM metric_t2;";
    // Pieces of the statement that fills metric_t2 with time-shifted copies of metric.
    private static final String SQL_INSERT_DATA_TABLE_TWO_PREFIX = "INSERT INTO metric_t2 SELECT cpu, id, recordtime + ";
    private static final String SQL_INSERT_DATA_TABLE_TWO_MIDDLE = " AS recordtime, CONCAT(id,\"_\",recordtime + ";
    private static final String SQL_INSERT_DATA_TABLE_TWO_PRSTFIX = ") AS rowkey FROM metric;";
    private static final String SQL_DELETE_DATA = "DELETE FROM metric;";
    // Merges the snapshot and the shifted copy back into metric.
    private static final String SQL_INSERT_DATA = "INSERT INTO metric SELECT * FROM (SELECT * FROM metric_t1 UNION ALL SELECT * FROM metric_t2) tmp;";
    private static final String SQL_DROP_TABLE_ONE = "DROP TABLE metric_t1;";
    private static final String SQL_DROP_TABLE_TWO = "DROP TABLE metric_t2;";

    static {
        try {
            // Legacy driver class name; Connector/J 8.0 and later use com.mysql.cj.jdbc.Driver,
            // and JDBC 4 drivers on the classpath are registered automatically.
            Class.forName("com.mysql.jdbc.Driver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    /**
     * @param args
     * @author OneCoder(OneCoder)
     * @throws SQLException
     * @date 2012-11-27 3:28:16 PM
     */
    public static void main(String[] args) throws SQLException {
        // try-with-resources closes the connection and the statement even if a step fails.
        try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test", "root", "root");
                Statement stmt = conn.createStatement()) {
            // Time shift in days for the current round; doubles after every round.
            int times = 1;
            for (int i = 1; i <= COPY_COUNT; i++) {
                System.out.println("Begin to copy: " + i + " time");
                stmt.execute(SQL_CREATE_TABLE_ONE);
                System.out.println("Create table t1 finished");
                stmt.execute(SQL_CREATE_TABLE_TWO);
                System.out.println("Create table t2 finished");
                stmt.execute(SQL_DELETE_DATA_TABLE_TWO);
                System.out.println("Delete table t2 data");
                stmt.execute(createInsertDataToTableTwoSQL(times));
                System.out.println("Insert into table t2 new data.");
                stmt.execute(SQL_DELETE_DATA);
                System.out.println("Delete table metric data");
                stmt.execute(SQL_INSERT_DATA);
                System.out.println("Insert new data into metric");
                stmt.execute(SQL_DROP_TABLE_ONE);
                System.out.println("Drop table t1");
                stmt.execute(SQL_DROP_TABLE_TWO);
                System.out.println("Drop table t2");
                times *= 2;
            }
        }
    }

    /** Builds the INSERT that copies metric into metric_t2 with recordtime shifted by count days. */
    private static String createInsertDataToTableTwoSQL(int count) {
        long addNum = count * ONE_DAY_MILLISECOND;
        StringBuilder sbuilder = new StringBuilder();
        sbuilder.append(SQL_INSERT_DATA_TABLE_TWO_PREFIX).append(addNum)
                .append(SQL_INSERT_DATA_TABLE_TWO_MIDDLE).append(addNum)
                .append(SQL_INSERT_DATA_TABLE_TWO_PRSTFIX);
        return sbuilder.toString();
    }
}
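As a possible simplification, a doubling round might also be written as a single INSERT ... SELECT on metric itself, since MySQL allows inserting into a table while selecting from it (it buffers the selected rows in an internal temporary table). The sketch below only builds the SQL string and does not touch a database. The class name `DoublingSql` is hypothetical, the column order (cpu, id, recordtime, rowkey) is assumed from the query in the program above, and this variant has not been compared against the two-table version for speed.

```java
/** Builds a single-statement doubling round; a sketch, not the original author's method. */
final class DoublingSql {

    private static final long ONE_DAY_MILLISECOND = 24L * 3600 * 1000;

    /**
     * Returns an INSERT ... SELECT that appends a copy of every row in metric,
     * with recordtime (and the rowkey derived from it) shifted by dayOffset days.
     */
    static String buildRoundSql(int dayOffset) {
        long shift = dayOffset * ONE_DAY_MILLISECOND;
        return "INSERT INTO metric SELECT cpu, id, recordtime + " + shift
                + ", CONCAT(id, '_', recordtime + " + shift + ") FROM metric";
    }

    public static void main(String[] args) {
        // Offsets double each round, matching the 'times' variable in the program above.
        for (int offset = 1; offset <= 4; offset *= 2) {
            System.out.println(buildRoundSql(offset));
        }
    }
}
```

Each generated statement would be executed once per round in place of the create, delete, insert, and drop sequence, keeping the same doubling of the day offset.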
I am still a beginner, so please go easy on me; corrections and advice are welcome :)