Exercises 5: Search trees
- Due No due date
- Points None
Note: The weekly exercises are optional, but highly recommended. If you are uncertain about any problem or want to discuss, write on Slack (channel #exercise-session).
The exercises are divided into core and bonus. We recommend you try all the core exercises. Try some of the bonus exercises if you want to learn more.
Update: Solution suggestions.
Core exercises
-
Try the quiz.
-
Insert the values 2, 1, 4, 5, 9, 3, 6, 7 (in this order) into:
- A binary search tree
- A 2-3 tree
- An AVL tree
-
Insert the values 1, 2, 3, 4, 5, 6, 7, 8 (in this order) into:
- A binary search tree
- A 2-3 tree
- An AVL tree
-
Suppose we have a class
Set
which implements the Set ADT:class Set boolean add(Item item) ... boolean remove(Item item) ... boolean contains(Item item) ...
How would you add the following operations to the
Set
class?void union(Set other)
: After callings1.union(s2)
, s1 should contain the items that were previously in s1 ∪ s2.void intersection(Set other)
: After callings1.intersection(s2)
, s1 should contain the items that were previously in s1 ∩ s2.void difference(Set other)
: After callings1.intersection(s2)
, s1 should contain the items that were previously in s1 but not in s2.
Answer in code or pseudocode. You can answer this question without knowing how the Set is implemented, using only the
add
,remove
,contains
methods and the iterator.- Bonus part: Suppose that Set is implemented using a balanced BST. Show how to implement
intersection
anddifference
so that their runtime is O(n log(m)), where n is the size of the smaller of the two setsthis
andother
, and m is the size of the larger set. Hint: first check which of the sets is larger, then write code for that particular case.
-
Consider the following algorithm for sorting a list of keys:
- Add each key in turn to a BST.
- Do an in-order traversal of the BST to get the keys in ascending order.
-
To make this a correct sorting algorithm, how should we define the BST invariant and deal with duplicate keys?
-
What are the average-case and worst-case complexities of the algorithm?
- How can make it harder for an evil user to exploit the worst-case complexity if we have a source of randomness?
-
Design a data structure representing a multiset: a set where each key can appear multiple times. Think first about what operations you would like the data structure to support. Hint: use a map; you do not need to implement the map but can just use an existing implementation.
Also design a data structure representing a multimap: a symbol table (map) where each key can be associated with multiple values.
Bonus exercises
-
Design an algorithm that, given a BST and two keys x and y, finds all keys between x and y in the BST (for example, returning them in a queue). The runtime should be O(log(N) + M) for a balanced BST, where N is the size of the BST and M is the number of keys returned. Hint: combine ideas from BST search and in-order traversal.
-
Define a method
boolean isBST(Node node)
that takes a node and checks if the tree rooted at that node is a valid BST.The simplest way to define this method potentially visits each node many times, and takes quadratic time on unbalanced tree. Define a method
boolean isBSTBetween(Node node, Key min, Key max)
that checks that a tree is a BST, and that all keys in it are> min
and< max
. The argumentsmin
andmax
can benull
/None
, in which case that part of the check is ignored. The method should only visit each node in the tree once. Use this method to define a linear-timeisBST
method. -
Design a data structure representing a bidirectional map: a symbol table where there is a one-to-one relationship between key and values - each key is associated with one value, and each value is associated with one key. In addition to the usual symbol table operations it should provide a “reverse lookup” operation which takes a value and finds the corresponding key (efficiently). When inserting a key-value pair, any conflicting key-value pair should be removed. Hint: use two balanced BSTs. The code for adding a new key-value pair is a little tricky, so make sure to write down an invariant relating the two BSTs.
-
Describe an algorithm that finds the successor of a key in a BST: the key that comes directly after it in sorted order. The runtime should be proportional to the height of the BST. Hint: make the algorithm return null if the key is the largest one. The algorithm is recursive (similar to the search algorithm) but there are extra cases depending on whether each recursive call returns null or not.
-
Suppose that we have a binary search tree where each node also contains a reference to its parent node. Design an algorithm that performs an in-order traversal of the tree (e.g. printing out the nodes in ascending order), but using a constant amount of extra memory. Note that the normal recursive algorithm for in-order traversal requires O(height of tree) extra memory because each recursive call consumes memory.
-
Learn about how 2-3 tree deletion works by reading these notes: http://www.cs.princeton.edu/~dpw/courses/cos326-12/ass/2-3-trees.pdf.
-
Hard, very optional extra bonus question. A 2-3-4 tree consists of 2-nodes, 3-nodes and 4-nodes (a node with three values and four children). Insertion in 2-3-4 trees uses a more efficent insertion algorithm called top-down insertion. In top-down insertion, as you move down the tree looking for the insertion point, whenever you reach a 4-node you split it and absorb it into its parent (note that because we split all 4-nodes on the way down, the parent is always a 2- or 3-node so absorption is easy). When we reach the leaf, we split it and insert the appropriate value.
- Describe in more detail the algorithm for inserting into a 2-3-4 tree using top-down insertion.