Science Fair Project Encyclopedia
Scapegoat tree
A scapegoat tree is a self-balancing binary search tree which performs insertions and deletions in O(log(n)) amortized time by rebuilding the tree whenever it becomes too unbalanced. Each node in a scapegoat tree keeps track of the size and height of the subtree rooted at that node, as well as whether or not that node has been deleted. The tree also has to keep track of the total number of nodes deleted since the last rebuilding. It was invented by Igal Galperin and Ronald L. Rivest.
| Contents |
The insertion operation
Insertions are performed in the same manner as in a binary search tree. However after inserting a node v, each node along the path from the root to v must be updated to reflect its new height (h(v)) and size (|v|). If any node along that path should have h(v) > αlog(|v|) for some α > 1 then the subtree rooted at that node will be rebuilt. This node is referred to as the scapegoat, hence the name.
Sketch of proof for cost of insertion
Define the Imbalance of a node v to be the absolute value of the difference in size between it's left node and right node minus 1, or 0, whichever is greater. In other words:
I(v) = max(|left(v) - right(v)| - 1, 0)
Immediately after rebuilding a subtree rooted at v, I(v) = 0.
Lemma: Immediately before rebuilding the subtree rooted at v, I(v) = Ω(|v|)
Proof of lemma:
Let v0 be the root of a subtree immediately after rebuilding. h(v0) = log(|v0| + 1). If there are Ω(|v0|) degenerate insertions (that is, where each inserted node increases the height by 1), then I(v) = Ω(|v0|), h(v) = h(v0) + Ω(|v0|) and log(|v|) ≤ log(|v0| + 1) + 1.
Since I(v) = Ω(|v|) before rebuilding, there were Ω(|v|) insertions into the subtree rooted at v that did not result in rebuilding. Each of these insertions can be performed in O(log n) time. The final insertion that causes rebuilding costs O(|v|). Using aggregate analysis it becomes clear that the amortized cost of an insertion is O(log n):
The deletion operation
When a node is deleted, instead of actually being removed from the tree the node is flagged for deletion. When the number of deleted nodes is greater than or equal to the half the total number of nodes in the tree, the entire tree is rebuilt. Rebuilding the entire tree takes O(n) time.
Sketch of proof for cost of deletion
Suppose the scapegoat tree has n elements and has just been rebuilt (in other words, it's a complete binary tree). At most n/2 - 1 deletions can be performed before the tree must be rebuilt. Each of these deletions take O(log n) time (the amount of time to search for the element and flag it as deleted). The n/2 deletion causes the tree to be rebuilt and takes O(log n) + O(n) (or just O(n)) time. Using aggregate analysis it becomes clear that the amortized cost of a deletion is O(log n):
See also
External links
- The ACM Digital Library: the original publication describing scapegoat trees NB full access requires an ACM Web Account.
The contents of this article is licensed from www.wikipedia.org under the GNU Free Documentation License. Click here to see the transparent copy and copyright details


