Design and Analysis of Algorithms (B-Trees)

B-Trees

B-Trees are tree data structures that store sorted data. B-Trees can be seen as a generalization of Binary Search Trees where nodes can have more than one key/value and more than two children. Similar to BSTs, they support search, insertion, and deletion in logarithmic time.

1 Properties

A B-tree has a parameter called the minimum degree or branching factor. For the purposes of our discussion let the branching factor be B.

  • For any non-leaf node, the number of children is one greater than the number of keys in that node.
  • Every non-root node contains at least B − 1 keys. Consequently, all internal (non-leaf and non-root) nodes have at least B children.
  • Every node contains at most 2B − 1 keys. Consequently, all nodes have at most 2B children.
  • All the leaves are at the same depth.
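The four properties above can be checked mechanically. The following is a minimal sketch, not a reference implementation: the names `BTreeNode` and `check_properties` are hypothetical, keys are assumed to be integers, and a leaf is represented as a node with an empty `children` list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    """Hypothetical node layout: sorted keys plus child pointers."""
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf

    @property
    def is_leaf(self) -> bool:
        return not self.children


def check_properties(root: BTreeNode, B: int) -> bool:
    """Return True if the tree satisfies the four structural properties."""
    leaf_depths = set()

    def visit(node: BTreeNode, depth: int, is_root: bool) -> bool:
        n = len(node.keys)
        if n > 2 * B - 1:                  # at most 2B - 1 keys in any node
            return False
        if not is_root and n < B - 1:      # at least B - 1 keys in a non-root node
            return False
        if node.is_leaf:
            leaf_depths.add(depth)         # remember the depth of every leaf
            return True
        if len(node.children) != n + 1:    # children = keys + 1 for non-leaf nodes
            return False
        return all(visit(c, depth + 1, False) for c in node.children)

    # All leaves share one depth exactly when a single depth was recorded.
    return visit(root, 0, True) and len(leaf_depths) == 1


# Usage with B = 2: a root with one key and two leaf children.
good = BTreeNode([10], [BTreeNode([1, 5]), BTreeNode([20])])
print(check_properties(good, B=2))
```

This sketch checks only the shape of the tree (key counts, child counts, leaf depth); it does not check that the keys are in sorted order.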

The keys in a B-tree are sorted in a similar fashion to a BST. Consider a node x with n keys and therefore n + 1 children. Let's say that x has keys k_1 < k_2 < ... < k_n. For ease of notation, we define k_0 = −∞ and k_{n+1} = +∞. If a key K belongs to the i-th (1 ≤ i ≤ n + 1) sub-tree of x, then k_{i−1} ≤ K ≤ k_i.
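The ordering rule is what makes search work: in each node, locate the first key that is not smaller than the target, and if it is not a match, descend into the sub-tree at that position. Below is a minimal sketch under stated assumptions: `Node` and `search` are hypothetical names, keys are integers, and indices are 0-based, so `children[i]` holds the keys lying between `keys[i-1]` and `keys[i]`.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical node layout: sorted keys plus child pointers."""
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def search(node: Node, key: int) -> Optional[Tuple[Node, int]]:
    """Return (node, index) where `key` is stored, or None if it is absent."""
    while True:
        # Index of the first key >= `key`; binary search within the node.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.children:      # reached a leaf without finding the key
            return None
        node = node.children[i]    # the only sub-tree that can contain `key`


# Usage: a two-level tree with root keys 10 and 20.
left, mid, right = Node([1, 5]), Node([12, 15]), Node([25, 30])
root = Node([10, 20], [left, mid, right])
print(search(root, 15) is not None)   # 15 is stored in the middle child
print(search(root, 11))               # 11 is not stored anywhere
```

The loop visits one node per level, which is where the logarithmic search time comes from.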

  • Search time is O(log n).
  • Insert/Delete time is O(log n) if B = O(1).

2 Why B-Trees

  • Caches read whole blocks of data at a time, so ideally the entire block is useful.
  • Set the parameter B to match the block size, so that one node occupies one block.
  • Search, Insert, and Delete then each need O(log_B(n)) block reads.
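To get a feel for the O(log_B(n)) bound, the sketch below computes the largest height a B-tree with n keys can have. It assumes the root of a non-empty tree holds at least one key (the properties above do not state this) and that every other node holds the minimum of B − 1 keys; a tree of height h then stores at least 2·B^h − 1 keys. The function name `max_height` is hypothetical, and height is counted in edges from the root to a leaf.

```python
def max_height(n: int, B: int) -> int:
    """Largest height h (in edges) such that 2 * B**h - 1 <= n.

    Assumes n >= 1 keys and branching factor B >= 2.
    """
    if n < 1 or B < 2:
        raise ValueError("need n >= 1 and B >= 2")
    h = 0
    # Grow h while a minimally filled tree of height h + 1 still fits in n keys.
    while 2 * B ** (h + 1) - 1 <= n:
        h += 1
    return h


# A root-to-leaf search touches at most max_height + 1 nodes (blocks).
for B in (2, 1024):
    h = max_height(10**9, B)
    print(f"B={B}: height <= {h}, block reads <= {h + 1}")
```

For one billion keys, B = 1024 gives a height of at most 2 (three block reads per search), compared with a height of at most 28 for B = 2.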

B-Trees are used by most databases and filesystems:

- Databases: Sleepycat/Berkeley DB, MySQL, SQLite

- Filesystems: MacOS HFS/HFS+, ReiserFS, Windows NTFS, Linux ext3, shmfs