
Using HBase for near-real-time responses over hundreds of millions of records



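The project collects app behavior logs and, with user authorization, address books, call records, SMS messages, and contact information. Over time the data volume has grown into the hundreds of millions of rows. Adding indexes to speed up queries, an optimization that works at the tens-of-millions scale, may no longer help at this size. To serve real-time query requests during the credit review stage, we introduced HBase to fix the slow responses.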
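HBase serves lookups fastest when a query maps to a row key or a row key prefix, so the key layout largely decides whether the real-time lookups described above stay fast. The post does not say how its row keys are built. The sketch below is only an illustration of one common pattern, assuming records are looked up per user: a short hash prefix to spread users across regions, then the user ID, then a reversed timestamp so that a user's newest records sort first. The names `make_row_key` and `user_scan_prefix`, the 2-byte prefix, and the zero-byte separator are all hypothetical choices, not taken from the project.

```python
import hashlib
import struct

# Mirrors Java's Long.MAX_VALUE; subtracting the timestamp from it is a
# common way to make newer events sort before older ones.
LONG_MAX = 2**63 - 1


def user_scan_prefix(user_id: str) -> bytes:
    """Hypothetical prefix shared by all row keys of one user.

    Layout (an assumption for illustration):
    2-byte MD5 prefix + UTF-8 user id + a zero-byte separator.
    """
    uid = user_id.encode("utf-8")
    salt = hashlib.md5(uid).digest()[:2]  # spreads users across the key space
    return salt + uid + b"\x00"           # separator keeps "u1" from matching "u10"


def make_row_key(user_id: str, event_ts_ms: int) -> bytes:
    """Build a row key: user prefix + 8-byte big-endian reversed timestamp."""
    reversed_ts = struct.pack(">q", LONG_MAX - event_ts_ms)
    return user_scan_prefix(user_id) + reversed_ts
```

With a layout like this, fetching one user's records becomes a prefix scan starting at `user_scan_prefix(user_id)` rather than a search over the whole table.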


When Should I Use HBase?
HBase isn’t suitable for every problem.


First, make sure you have enough data. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS might be a better choice due to the fact that all of your data might wind up on a single node (or two) and the rest of the cluster may be sitting idle.


Second, make sure you can live without all the extra features that an RDBMS provides (e.g., typed columns, secondary indexes, transactions, advanced query languages, etc.) An application built against an RDBMS cannot be "ported" to HBase by simply changing a JDBC driver, for example. Consider moving from an RDBMS to HBase as a complete redesign as opposed to a port.


Third, make sure you have enough hardware. Even HDFS doesn’t do well with anything less than 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.


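In short, HBase is not the answer to every problem. First, you need enough data. Second, the business must be able to run without relational database features (typed columns, secondary indexes, transactions, a powerful query language, and so on). Third, you need enough hardware; in particular, HDFS does not work well with fewer than 5 DataNodes plus a NameNode.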


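The project adds a big data platform to handle high-traffic, high-concurrency, low-latency requests. Data is read from and written to HBase on one side and, on the other, flows into the Kafka data bus, which connects the data stream to the data center.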
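The two paths described above can be pictured with a small sketch. It uses in-memory stand-ins rather than real HBase or Kafka clients, and every name in it (`InMemoryStore`, `InMemoryBus`, `ingest`, the `user_id` and `ts` fields) is hypothetical. The post does not say whether the two writes happen in sequence, in one service, or through separate pipelines, so the ordering shown here is an assumption.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


class InMemoryStore:
    """Stand-in for an HBase table: maps a row key to a record."""

    def __init__(self) -> None:
        self.rows: Dict[str, Record] = {}

    def put(self, row_key: str, record: Record) -> None:
        self.rows[row_key] = record


class InMemoryBus:
    """Stand-in for a Kafka topic: an append-only list of (key, value) pairs."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, Record]] = []

    def send(self, key: str, value: Record) -> None:
        self.messages.append((key, value))


def ingest(record: Record, store: InMemoryStore, bus: InMemoryBus) -> str:
    """Write one record to the serving store, then publish it to the bus."""
    # Simplified, illustrative row key: user id plus timestamp.
    row_key = f"{record['user_id']}:{record['ts']}"
    store.put(row_key, record)                 # path 1: low-latency lookups
    bus.send(str(record["user_id"]), record)   # path 2: downstream data flow
    return row_key
```

In a real deployment the two stand-ins would be replaced by actual clients, and the design would also need to decide what happens when one of the two writes fails.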


 

深圳逆時針