Red-Black Trees & B-Trees: Structure and Operations
Red-Black Tree Fundamentals
A Red-Black Tree is a self-balancing binary search tree (BST) used in computer science for efficient data storage and retrieval. Each node in a Red-Black Tree has an extra property, its color (either red or black), which helps maintain the tree’s balance properties.
Properties of Red-Black Trees
- Each node is either red or black.
- The root and all NIL (leaf) nodes are black.
- The standard BST property holds: left child < node < right child.
- No red node has a red child (no two consecutive red nodes on a path).
- All paths from any given node to its descendant NIL nodes contain the same number of black nodes (the black-height property).
Why Use Red-Black Trees?
- Red-Black Trees maintain efficient insertion, deletion, and lookup operations with a time complexity of O(log n) due to their self-balancing nature.
- They are ideal for scenarios requiring frequent insertions and deletions, such as in databases and file systems.
Red-Black Tree Node Structure
Each node typically contains: a value, a color (RED or BLACK), a pointer to the left child, a pointer to the right child, and a pointer to the parent node.
Red-Black Tree Insertion Algorithm
- Insert as in a regular BST: Add the new node following standard BST insertion rules, initially coloring it red.
- Fix Red-Black Violations: After insertion, check if any Red-Black properties are violated. If the new node’s parent is black, no action is needed. If the parent is red (violating the “no red node has a red child” property), perform the following fixes based on the uncle’s color:
- Case 1: Uncle is Red
- Recolor the parent and uncle to black.
- Recolor the grandparent to red.
- Move up to the grandparent and repeat the fixing process if necessary.
- Case 2: Uncle is Black (or NIL)
- If the new node forms a “zigzag” configuration (e.g., parent is left child, node is right child), perform a rotation (left rotation on the parent) to transform it into a straight “line” configuration (left-left).
- Perform a rotation at the grandparent (e.g., right rotation for left-left configuration) to balance the tree.
- Recolor the nodes involved (typically the original grandparent and its new parent) to maintain Red-Black properties.
- Case 1: Uncle is Red
- Ensure Root is Black: After all adjustments, ensure the root node is always black.
Complexity Analysis (Red-Black Tree)
- Insertion: O(log n) worst-case
- Search: O(log n) worst-case
- Deletion: O(log n) worst-case
Red-Black Tree Deletion Algorithm
Deletion involves standard BST deletion followed by fixing potential Red-Black property violations, often involving a concept called “Double Black” (DB) when a black node is removed or replaced.
- Case 1: Node to delete is a red leaf: Simply remove it.
- Case 2: DB is the root: Remove the DB property; the root remains black.
- Case 3: DB’s sibling is black, and both sibling’s children are black:
- Make the sibling red.
- Move the DB property up to the parent. If the parent was red, it becomes black, resolving DB. If the parent was black, it becomes DB, and the process repeats.
- Case 4: DB’s sibling is red:
- Swap the colors of the parent and the red sibling.
- Rotate the parent towards the DB node.
- This transforms the situation into Case 5 or 6, so re-evaluate.
- Case 5: DB’s sibling is black, its far child is black, and its near child is red:
- Swap the colors of the sibling and its red (near) child.
- Rotate the sibling away from the DB node (towards the near child).
- This transforms the situation into Case 6.
- Case 6: DB’s sibling is black, and its far child is red:
- Swap the colors of the parent and the sibling.
- Rotate the parent towards the DB node.
- Make the sibling’s red (far) child black.
- Remove the DB property (the violation is resolved).
B-Tree Essentials
A B-Tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. It’s optimized for systems that read and write large blocks of data, like disk-based storage.
Properties of B-Trees
Let ‘m’ be the maximum number of children a node can have (the order of the B-Tree).- Every node has at most m children.
- Every internal node (except the root) has at least ⌈m/2⌉ children.
- The root node has at least two children unless it is a leaf node (the only node in the tree).
- All leaves appear on the same level.
- A non-leaf node with k children contains k−1 keys, sorted in ascending order.
B-Tree Node Structure
A node typically contains: an array of keys, an array of pointers to child nodes, and a boolean flag indicating if it is a leaf (isLeaf
).
B-Tree Insertion Algorithm
- If Tree is Empty: Create a root node and insert the key.
- Find the Target Leaf Node: Traverse down the tree, following the appropriate child pointers based on key comparisons, until a leaf node where the key should be inserted is reached.
- Insert Key into Leaf:
- If Leaf Node is Not Full: Insert the key into the leaf node while maintaining sorted order.
- If Leaf Node is Full:
- Temporarily insert the new key into the full node in sorted order.
- Split the node: Find the median key.
- Promote the median key up to the parent node.
- Create two new nodes: one with keys less than the median (left child) and one with keys greater than the median (right child).
- If the parent node becomes full after receiving the median key, recursively split the parent. This process may propagate up to the root, potentially increasing the tree’s height.
B-Tree Degree and Node Size
The minimum degree ‘t’ (where t ≥ 2) is often used to define B-Tree properties:
- Minimum number of keys in a non-root node: t – 1
- Maximum number of keys in any node: 2t – 1
- Minimum number of children for a non-root internal node: t
- Maximum number of children for any node: 2t (Note: This relates to m, often m = 2t)
B-Tree Deletion Algorithm
- Locate the Key: Find the node containing the key to be deleted.
- Case I: Key is in a Leaf Node
- If the leaf has more than the minimum number of keys (t-1): Simply remove the key.
- If the leaf has the minimum number of keys: Try to borrow a key from an adjacent sibling (left or right) if the sibling has more than the minimum keys. This involves moving a key from the sibling to the parent and a key from the parent down to the deficient leaf.
- If neither sibling can spare a key: Merge the deficient leaf with one of its siblings. This involves moving a key down from the parent into the merged node. The parent node might become deficient, requiring recursive fixing up the tree.
- Case II: Key is in an Internal Node
- If the child preceding the key (left child) has at least ‘t’ keys: Replace the key with its in-order predecessor (the largest key in the left child’s subtree) and recursively delete the predecessor from its original leaf node.
- If the child succeeding the key (right child) has at least ‘t’ keys: Replace the key with its in-order successor (the smallest key in the right child’s subtree) and recursively delete the successor from its original leaf node.
- If both preceding and succeeding children have only the minimum number of keys (t-1): Merge the left child, the right child, and the key from the parent node into a single node. Then, recursively delete the key from this new merged node (which will now be in a leaf or handled by Case I/II recursively). This merge might cause the parent node to become deficient, requiring recursive fixing.
- Case III: Tree Height Shrinks
- If merging nodes causes the root node to have zero keys but one child, the child becomes the new root, decreasing the tree’s height by one.
Complexity Analysis (B-Tree)
- Search: O(log n)
- Insertion: O(log n)
- Deletion: O(log n)