A Horizontal Table-Sharding Practice (sharding-jdbc)

阿新 • Published: 2019-10-10

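Summary

The example in this article shards a table horizontally by month. It has the following two shortcomings:

The primary key of the sharded table was not designed well. This article uses an auto-increment id and does not embed the time in the primary key, which rules out the scenario of querying by primary key alone;
The table has no redundant column dedicated to sharding. The sharding column is coupled to a business column, which causes problems in the details. For example, create_time in this article carries milliseconds, and some date arithmetic drops the milliseconds, so the data cannot be found.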
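
To illustrate the first shortcoming, here is a minimal sketch of an id that carries its month, so the target table can be derived from the primary key alone. The class name, the method names, and the id layout (a yyyyMM prefix followed by a zero-padded sequence) are assumptions made for illustration and are not part of the original project.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical helper: ids look like "201912" followed by a 12-digit sequence. */
final class TimePrefixedId {
    private static final DateTimeFormatter YYYYMM = DateTimeFormatter.ofPattern("yyyyMM");
    // Months before this one live in the default (logic) table
    private static final YearMonth FIRST_SHARDED_MONTH = YearMonth.of(2019, 11);

    private TimePrefixedId() {
    }

    /** Builds an id from the record's month and a sequence number obtained elsewhere. */
    static String newId(YearMonth month, long sequence) {
        return month.format(YYYYMM) + String.format(Locale.ROOT, "%012d", sequence);
    }

    /** Recovers the physical table name from the id alone, without create_time. */
    static String tableFor(String logicTable, String id) {
        YearMonth month = YearMonth.parse(id.substring(0, 6), YYYYMM);
        return month.isBefore(FIRST_SHARDED_MONTH)
                ? logicTable
                : logicTable + "_" + month.format(YYYYMM);
    }
}
```

With an id like this, a lookup by primary key could be routed to a single monthly table instead of needing a time range.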

Given the size of the team, read/write splitting was not implemented.

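Practice

Background

The transaction record table of our payment order center currently holds 24 million rows (a single MySQL table). Queries are very slow, and the table grows by more than 200,000 rows a day. Given this volume (about 6 million rows per month), we decided to shard by month, so that each table holds a little over 6 million rows, a size that queries handle well.

Design approach

All data from before November 2019 stays in the default table (imass_order_record). The advantage is that no historical data has to be migrated. Data after that goes into one table per month. For example, data from 11 November 2019 goes into imass_order_record_201911, and data from 11 December 2019 is written to imass_order_record_201912.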
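
Each monthly table has to exist before the first row of that month arrives. The following is a minimal sketch of generating the DDL for it, assuming the monthly tables share the structure of the default table and that MySQL's CREATE TABLE ... LIKE is acceptable. The class and method names are hypothetical, and how the statement gets executed (a scheduled job, a migration tool, by hand) is left open.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that builds the DDL for one monthly table. */
final class MonthlyTableDdl {
    private static final DateTimeFormatter YYYYMM = DateTimeFormatter.ofPattern("yyyyMM");

    private MonthlyTableDdl() {
    }

    /** For example imass_order_record + 2019-12 gives imass_order_record_201912. */
    static String tableName(String logicTable, YearMonth month) {
        return logicTable + "_" + month.format(YYYYMM);
    }

    /** Copies the column and index definitions of the default table. */
    static String createStatement(String logicTable, YearMonth month) {
        return "CREATE TABLE IF NOT EXISTS " + tableName(logicTable, month)
                + " LIKE " + logicTable;
    }
}
```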

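Queries need some care around the "month boundary" problem: a time range that crosses the end of a month touches more than one table.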
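
The month-boundary concern can be made concrete: a query from 2019-11-20 to 2019-12-10 spans less than a month but still has to read two tables. Below is a minimal standalone sketch using only the JDK (it is not wired into ShardingSphere, and the class and method names are hypothetical) that enumerates the tables by calendar month.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of which tables a time range touches. */
final class MonthCut {
    private static final DateTimeFormatter YYYYMM = DateTimeFormatter.ofPattern("yyyyMM");
    // Months before this one live in the default (logic) table
    private static final YearMonth FIRST_SHARDED_MONTH = YearMonth.of(2019, 11);

    private MonthCut() {
    }

    static List<String> tablesFor(String logicTable, LocalDateTime from, LocalDateTime to) {
        Set<String> tables = new LinkedHashSet<>();
        YearMonth last = YearMonth.from(to);
        // Step through calendar months, so the month of the upper bound is always included
        for (YearMonth m = YearMonth.from(from); !m.isAfter(last); m = m.plusMonths(1)) {
            tables.add(m.isBefore(FIRST_SHARDED_MONTH)
                    ? logicTable
                    : logicTable + "_" + m.format(YYYYMM));
        }
        return new ArrayList<>(tables);
    }
}
```

Stepping by calendar month rather than adding whole months to the start date matters here: 2019-11-20 plus one month is 2019-12-20, which is already past 2019-12-10, so a loop of that shape would stop before reaching the December table.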

Sharding strategy

Precise sharding strategy

package com.imassbank.unionpay.sharding;

import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.Date;
import java.util.Locale;

import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;

import lombok.extern.slf4j.Slf4j;

/**
 * @author Michael Feng
 * @date 2019-09-19
 * @description Routes a single create_time value to its monthly table
 */
@Slf4j
public class DatePreciseShardingAlgorithm implements PreciseShardingAlgorithm<Date> {
	private static final DateTimeFormatter sdf = DateTimeFormatter.ofPattern("yyyyMM", Locale.CHINA);
	private static final String SEPERATOR = "_";// table-name separator
	// Start of 2019-11 in the system time zone; earlier data stays in the default table.
	// Built without parsing, so it can never be left null by a parse failure.
	private static final Date lowwerDate = Date.from(
			LocalDate.of(2019, 11, 1).atStartOfDay(ZoneId.systemDefault()).toInstant());

	@Override
	public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Date> shardingValue) {
		String logicTableName = shardingValue.getLogicTableName();
		Date createTime = shardingValue.getValue();
		if(createTime == null || createTime.before(lowwerDate) ){
			log.info("create_time is null or earlier than 2019-11 ({}), using the default table",createTime);
			return logicTableName;
		}
		try{
			String yyyyMM = SEPERATOR + createTime.toInstant().atZone(ZoneId.systemDefault()).toLocalDate().format(sdf);
			log.info("Routing to table: {}",logicTableName+yyyyMM);
			return logicTableName+yyyyMM;
		}catch(Exception e){
			log.error("Failed to derive the month from create_time, falling back to the default table",e);
		}
		return logicTableName;
	}

}

Range query strategy

package com.imassbank.unionpay.sharding;

import java.time.YearMonth;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.Date;
import java.util.LinkedHashSet;
import java.util.Locale;

import org.apache.shardingsphere.api.sharding.standard.RangeShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.RangeShardingValue;

import com.google.common.collect.Range;

/**
 * @author Michael Feng
 * @date 2019-09-19
 * @description Returns every table that a create_time range touches
 */
public class DateRangeShardingAlgorithm implements RangeShardingAlgorithm<Date> {
	private static final DateTimeFormatter sdf = DateTimeFormatter.ofPattern("yyyyMM", Locale.CHINA);
	private static final String SEPERATOR = "_";// table-name separator
	// First month that has its own table; earlier data stays in the default table
	private static final YearMonth lowwerDate = YearMonth.of(2019, 11);

	@Override
	public Collection<String> doSharding(Collection<String> availableTargetNames,
			RangeShardingValue<Date> shardingValue) {
		Collection<String> tableSet = new LinkedHashSet<>();
		String logicTableName = shardingValue.getLogicTableName();
		// Assumes a range bounded at both ends (BETWEEN ... AND ...);
		// lowerEndpoint()/upperEndpoint() throw IllegalStateException on an open-ended range.
		Range<Date> dates = shardingValue.getValueRange();
		YearMonth month = toYearMonth(dates.lowerEndpoint());
		YearMonth lastMonth = toYearMonth(dates.upperEndpoint());
		// Step through calendar months. Adding whole months to the lower date instead would
		// skip the last table when the range ends earlier in the month than it starts
		// (for example 2019-11-20 to 2019-12-10).
		for (; !month.isAfter(lastMonth); month = month.plusMonths(1)) {
			if(month.isBefore(lowwerDate)){// anything before the start month is read from the default table
				tableSet.add(logicTableName);
			}else{
				tableSet.add(logicTableName+SEPERATOR+month.format(sdf));
			}
		}
		return tableSet;
	}

	private static YearMonth toYearMonth(Date date) {
		return YearMonth.from(date.toInstant().atZone(ZoneId.systemDefault()));
	}

}

Sharding configuration

# Data source
spring.shardingsphere.datasource.names=imassunionpay

# Default data source
spring.shardingsphere.sharding.default-data-source-name=imassunionpay

# Show SQL
spring.shardingsphere.props.sql.show=true

# imassunionpay data source configuration
spring.shardingsphere.datasource.imassunionpay.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.imassunionpay.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.imassunionpay.url=jdbc:mysql://****:3306/imass_union_pay?useUnicode=true&characterEncoding=utf8&autoReconnect=true&allowMultiQueries=true&serverTimezone=Asia/Shanghai
spring.shardingsphere.datasource.imassunionpay.username=root
spring.shardingsphere.datasource.imassunionpay.password=**

# Horizontal sharding by range (table strategy)
spring.shardingsphere.sharding.tables.imass_order_record.table-strategy.standard.sharding-column=create_time
spring.shardingsphere.sharding.tables.imass_order_record.table-strategy.standard.precise-algorithm-class-name=com.imassbank.unionpay.sharding.DatePreciseShardingAlgorithm
spring.shardingsphere.sharding.tables.imass_order_record.table-strategy.standard.range-algorithm-class-name=com.imassbank.unionpay.sharding.DateRangeShardingAlgorithm

# druidDataSource
spring.shardingsphere.datasource.druid.initialSize=5
spring.shardingsphere.datasource.druid.minIdle=5
spring.shardingsphere.datasource.druid.maxActive=20
spring.shardingsphere.datasource.druid.maxWait=60000
spring.shardingsphere.datasource.druid.timeBetweenEvictionRunsMillis=60000
spring.shardingsphere.datasource.druid.minEvictableIdleTimeMillis=300000
spring.shardingsphere.datasource.druid.validationQuery=SELECT 1 FROM DUAL
spring.shardingsphere.datasource.druid.testWhileIdle=true
spring.shardingsphere.datasource.druid.testOnBorrow=false
spring.shardingsphere.datasource.druid.testOnReturn=false
spring.shardingsphere.datasource.druid.poolPreparedStatements=true
spring.shardingsphere.datasource.druid.maxPoolPreparedStatementPerConnectionSize=20
spring.shardingsphere.datasource.druid.filters=stat,wall,cat

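CRUD operations

Insert

Inserting is simple: just include the sharding key create_time.

Delete, update, and query

All three operations must include the sharding key create_time. A few scenarios:

The sharding key is present. Sometimes it is available directly, for example when freshly inserted data is updated right afterwards and the sharding key can simply be passed along. More often it is a time-range query, which goes through the range query strategy.
Querying by business primary key (the better approach is to embed the time in the business primary key).
Querying by business data that does not carry the sharding key. If the business data can be tied to a time, use that time (with a widened range) as the sharding key. If the business data has no time attribute at all, make a trade-off based on the business characteristics and restrict the time range. For example: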

	/**
	 * Only data from the last month can be queried.
	 */
	@Override
	public List<ImassOrderRecord> queryOrderRecordByOrderId(String orderId) {
		if(StringUtils.isEmpty(orderId)){
			logger.info("Payment order id is empty");
			return null;
		}
		// Upper bound: now. Lower bound: midnight of the same day one month ago.
		Date endCreateTime = new Date();
		Date startCreateTime = DateUtils.truncate(DateUtils.addMonths(endCreateTime, -1),Calendar.DAY_OF_MONTH);
		// Passing both bounds lets the range query strategy pick the tables to read
		List<ImassOrderRecord> recordList = orderRecordExtendMapper.queryOrderRecordByOrderId(orderId,startCreateTime,endCreateTime);
		SensitiveProcessor.decryptList(recordList);
		return recordList;
	}

Depending on the business scenario, the query can cover a longer time span.
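
Widening the range also helps with the millisecond problem mentioned in the summary: when the timestamp used for the lookup has been through date arithmetic or another system, its milliseconds may no longer match the stored create_time. One possible way to handle this is to search a small window around the value instead of matching it exactly. The sketch below is only an illustration; the class name, the method names, and the one-second tolerance in the usage note are assumptions, not part of the original project.

```java
import java.util.Date;

/** Hypothetical closed time window [start, end] around an approximate create_time. */
final class CreateTimeWindow {
    final Date start;
    final Date end;

    private CreateTimeWindow(Date start, Date end) {
        this.start = start;
        this.end = end;
    }

    /** Widens an approximate timestamp by toleranceMillis on both sides. */
    static CreateTimeWindow around(Date approx, long toleranceMillis) {
        return new CreateTimeWindow(
                new Date(approx.getTime() - toleranceMillis),
                new Date(approx.getTime() + toleranceMillis));
    }

    /** True when d lies inside the window, both ends included. */
    boolean contains(Date d) {
        return !d.before(start) && !d.after(end);
    }
}
```

A call such as CreateTimeWindow.around(approxCreateTime, 1000) would give two bounds that can be passed as the start and end of a time-range query, so a stored value of ...00.123 is still found when the caller only knows ...00.000.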

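When business volume is large, reads and writes are usually separated. Data is written to the sharded database for persistence. At the same time, the data that needs to be queried is also written to a search engine such as es (Elasticsearch), where it can be queried freely.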

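Pitfalls

Cannot support multiple schemas in one SQL

The sharding-jdbc project has stated that multiple schemas are not supported. Looking at the source code, the schemas of the tables are compared while the tables in a SQL statement are parsed, and this exception is thrown if they differ. In fact, when a query has nothing to do with the sharded tables, multiple schemas should be supportable. Once I understand the source code better, I will see whether the forced-routing (hint) approach can be borrowed to let the application choose whether SQL parsing happens at all.

Range query SQL must use between and, not create_time > * and create_time <

Statements written with > and < do not invoke the range query strategy.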
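
When such a condition is rewritten, note that BETWEEN includes both ends, so a half-open condition like create_time >= start AND create_time < end needs its upper bound pulled in slightly. Below is a minimal sketch, assuming create_time is stored with millisecond precision (as noted in the summary). The class name, the SQL text, and the order_id column are illustrative assumptions rather than the project's actual mapper.

```java
import java.util.Date;

/** Hypothetical helper for turning a half-open time range into BETWEEN bounds. */
final class BetweenBounds {
    /** Illustrative query text; parameters are order id, lower bound, inclusive upper bound. */
    static final String SQL =
            "SELECT * FROM imass_order_record WHERE order_id = ? AND create_time BETWEEN ? AND ?";

    private BetweenBounds() {
    }

    /** Converts an exclusive upper bound into the last included millisecond. */
    static Date inclusiveEnd(Date exclusiveEnd) {
        return new Date(exclusiveEnd.getTime() - 1);
    }
}
```

For a query over all of November 2019, the bounds would be 2019-11-01 00:00:00.000 and inclusiveEnd(2019-12-01 00:00:00.000), which keeps the range inside a single monthly table.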

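There were a few other pitfalls, but I have forgotten the details.

Afterword

To see which operations sharding-jdbc supports, take a look at this blog post: Sharding-Sphere資料分庫分表實踐(垂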
