Algorithm Design and Analysis: Greedy Methods, Divide & Conquer, and Recurrences
Greedy Method
The greedy method involves making locally optimal choices at each step, aiming for a globally optimal solution. It’s crucial to prove the correctness of a greedy algorithm.
Event Scheduling
Problem: Design an algorithm to select the maximum number of non-overlapping events, given n events with start and end times.
Greedy Choice: Pick the next available event with the earliest finish time.
Solution Format: List of events.
Constraints: No overlapping events.
Objective: Maximize the number of events.
Implementation:
- Initialize a queue S.
- Sort events by finish time (O(n log n)).
- Add the first event E1 to S.
- Set F = f1 (finish time of E1).
- For i = 2 to n: If si ≥ F (start time of Ei is greater than or equal to F): enqueue(Ei, S) and set F = fi.
- Return S (O(n)).
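The steps above can be sketched in Python; representing events as (start, finish) pairs is an assumption of this sketch:

```python
def schedule_events(events):
    """Greedy interval scheduling.
    events: list of (start, finish) pairs.
    Returns a maximum-size set of pairwise non-overlapping events."""
    S = []
    F = float("-inf")                                 # finish time of the last chosen event
    for s, f in sorted(events, key=lambda e: e[1]):   # sort by finish time, O(n log n)
        if s >= F:                                    # event starts after the last one ends
            S.append((s, f))
            F = f
    return S
```

The single pass after sorting is O(n), matching the O(n log n) total claimed above.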
Exchange Argument Template:
- State what we know: Definition of the greedy choice (g). The optimal solution (OS) meets constraints.
- Define OS’ from OS and g (usually by exchanging g with another choice).
- Prove that OS’ meets constraints. Use steps 1 and 2.
- Compare the value/cost of OS’ to OS. Use step 2, sometimes step 1.
Induction:
- Let g be the first greedy decision. Let I’ be the rest of the problem given g.
- Greedy Solution (GS) = g + GS(I’).
- OS is any legal solution.
- OS’ is defined from OS by the exchange argument (if OS does not include g).
- OS’ = g + some solution on I’.
- Induction: GS(I’) is at least as good as any solution on I’.
- GS is at least as good as OS’, which is at least as good as OS.
Greedy Stays Ahead
Consider an input I with n events E1, …, En. Let OS(I) = {J1, J2, …, Jk} be an arbitrary set of non-conflicting events. Let GS(I) = {G1, G2, …, Gℓ} be the outcome of the greedy strategy. We want to show that k ≤ ℓ.
Claim: GS “stays ahead” of OS: Finish(Gi) ≤ Finish(Ji) for all i ≥ 1.
Proof: (by induction on i)
Base Case: Finish(G1) ≤ Finish(J1) by the greedy choice.
Inductive Hypothesis: For some i ≥ 1, assume that Finish(Gi) ≤ Finish(Ji).
Inductive Step: We want to show that Finish(Gi+1) ≤ Finish(Ji+1). By the inductive hypothesis, Finish(Gi) ≤ Finish(Ji), and since OS is non-conflicting, Finish(Ji) ≤ Start(Ji+1). So Ji+1 is still available when the greedy strategy makes its (i+1)-st choice, and by the greedy choice Finish(Gi+1) ≤ Finish(Ji+1).
Conclusion: Suppose toward a contradiction that k > ℓ. Then Gℓ is the last greedy choice, so no remaining event starts at or after Finish(Gℓ). But Finish(Gℓ) ≤ Finish(Jℓ) ≤ Start(Jℓ+1), so the event Jℓ+1 starts at or after the last greedy choice finishes and was still available, which is impossible. Hence k ≤ ℓ.
Kruskal’s Algorithm Exchange Argument
Greedy Choice (g): The lightest edge.
Let OS be a spanning tree that does not include g. Create OS’ by adding g to OS (creating a cycle) and deleting the heaviest edge h in that cycle. OS’ is a spanning tree: it is a tree and spans all vertices. TotalWeight(OS’) ≤ TotalWeight(OS) since TW(OS’) = TW(OS) + w(g) – w(h).
Base Case: Trivial for a graph with n = 1 vertex.
Induction: Assume Kruskal's algorithm is optimal for graphs with n−1 vertices. For an n-vertex graph G, the exchange argument above shows that some MST contains the lightest edge g. Contracting g yields a graph with n−1 vertices, on which the remaining choices of Kruskal's algorithm behave exactly like a fresh run of Kruskal's; by the inductive hypothesis those choices are optimal. Combining g with that optimal solution gives a spanning tree of weight no greater than any spanning tree of G, proving optimality by induction.
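A minimal Python sketch of Kruskal's algorithm with union-find (the (weight, u, v) edge representation and vertex labels 0..n−1 are assumptions; union by rank is omitted for brevity):

```python
def kruskal(n, edges):
    """Kruskal's MST on vertices 0..n-1.
    edges: list of (weight, u, v) triples.
    Returns (total weight, list of chosen (u, v, weight) edges)."""
    parent = list(range(n))

    def find(x):                      # find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):     # greedy choice: lightest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding the edge creates no cycle
            parent[ru] = rv
            total += w
            tree.append((u, v, w))
    return total, tree
```

Sorting dominates the running time: O(m log m) for m edges.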
Divide and Conquer
Multiplying Binary Numbers
Write x = 2^(n/2)·xL + xR and y = 2^(n/2)·yL + yR. Then xy = 2^n·xL·yL + 2^(n/2)·(xL·yR + xR·yL) + xR·yR.
function multiply(x, y)
Input: n-bit integers x and y. Output: the product xy.
- If n = 1: return xy. O(1)
- Let xL, xR and yL, yR be the leftmost and rightmost n/2 bits of x and y. O(n)
- P1 = multiply(xL, yL); P2 = multiply(xL, yR); P3 = multiply(xR, yL); P4 = multiply(xR, yR). (P1–P4 each cost T(n/2).)
- Return P1·2^n + (P2 + P3)·2^(n/2) + P4. O(n)
Recurrence: T(n) = 4T(n/2) + O(n), so T(n) = O(n^2).
function multiplyKS(x, y) (Karatsuba)
Input: n-bit integers x and y. Output: the product xy.
- If n = 1: return xy. O(1)
- Let xL, xR and yL, yR be the leftmost and rightmost n/2 bits of x and y. O(n)
- R1 = multiplyKS(xL, yL); R2 = multiplyKS(xR, yR); R3 = multiplyKS(xL + xR, yL + yR). (Three recursive calls of cost T(n/2).)
- Return R1·2^n + (R3 − R1 − R2)·2^(n/2) + R2. O(n)
Recurrence: T(n) = 3T(n/2) + O(n), so T(n) = O(n^(log2 3)) ≈ O(n^1.585).
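The Karatsuba routine translates directly to Python. This sketch handles arbitrary non-negative integers by splitting at half the larger bit length rather than assuming n is a power of two:

```python
def multiply_ks(x, y):
    """Karatsuba multiplication of non-negative integers: 3 recursive calls."""
    if x < 2 or y < 2:                      # base case: single-bit operand
        return x * y
    n = max(x.bit_length(), y.bit_length())
    half = n // 2
    xL, xR = x >> half, x & ((1 << half) - 1)   # high and low halves of x
    yL, yR = y >> half, y & ((1 << half) - 1)   # high and low halves of y
    R1 = multiply_ks(xL, yL)
    R2 = multiply_ks(xR, yR)
    R3 = multiply_ks(xL + xR, yL + yR)
    # xL*yR + xR*yL is recovered as R3 - R1 - R2, saving one recursive call
    return (R1 << (2 * half)) + ((R3 - R1 - R2) << half) + R2
```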
Master Theorem: If T(n) = aT(n/b) + O(n^d) for some constants a > 0, b > 1, d ≥ 0, then:
1. T(n) = O(n^d) if a < b^d.
2. T(n) = O(n^d log n) if a = b^d.
3. T(n) = O(n^(log_b a)) if a > b^d.
Standard k^2 approach to multiplication (split into k parts naively): T(n) = k^2·T(n/k) + O(n) → O(n^2). Cook–Toom algorithm: T(n) = (2k−1)·T(n/k) + O(n) → O(n^(log_k(2k−1))) time. Best case: approaches O(n log n) in theory as k → ∞, but practical values are limited to small k due to increased overhead.
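As a quick sanity check, the three cases can be encoded in a small helper (the string output format here is my own choice, not from the notes):

```python
import math

def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + O(n^d) using the Master Theorem."""
    if a < b ** d:
        return f"O(n^{d})"                          # case 1: root work dominates
    if a == b ** d:
        return f"O(n^{d} log n)"                    # case 2: work balanced across levels
    return f"O(n^{round(math.log(a, b), 3)})"       # case 3: leaves dominate
```

For example, naive multiplication (a = 4, b = 2, d = 1) lands in case 3, while merge-sort-style recurrences (a = 2, b = 2, d = 1) land in case 2.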
QuickSelect Algorithm
Input: a list of integers and an integer k. Output: the kth smallest number in the list.
function QuickSelect(a[1…n], k)
- If n = 1: return a[1].
- Pick a random element v from the list.
- Split the list into SL (elements < v), Sv (elements = v), SR (elements > v).
- If k ≤ |SL|: return QuickSelect(SL, k).
- If k ≤ |SL| + |Sv|: return v.
- Else: return QuickSelect(SR, k − |SL| − |Sv|).
Expected case (the split roughly halves the list): T(n) = T(n/2) + O(n), which gives T(n) = O(n) by the Master Theorem. Worst case (the pivot is the max or min every time): T(n) = T(n−1) + O(n), leading to O(n^2).
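A direct Python rendering of the pseudocode (k is 1-indexed, as above):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (1-indexed) of the nonempty list a."""
    if len(a) == 1:
        return a[0]
    v = random.choice(a)                 # random pivot
    SL = [x for x in a if x < v]
    Sv = [x for x in a if x == v]
    SR = [x for x in a if x > v]
    if k <= len(SL):
        return quickselect(SL, k)        # answer lies among the smaller elements
    if k <= len(SL) + len(Sv):
        return v                         # answer is the pivot itself
    return quickselect(SR, k - len(SL) - len(Sv))
```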
Practice:
(a) [TRUE or FALSE] O(n^1.585) is the best runtime possible for multiplication of two n-bit integers.
Solution: False; as mentioned in class, the best known algorithm runs in O(n log n) time. It was designed in 2019.
(b) Suppose T(n) = 3T(n/2) + cn and S(n) = 4S(n/6) + cn with T(1) = S(1) = c. What can you say about T(n) and S(n)? Is T(n) = O(S(n))? Or is S(n) = O(T(n))?
Solution: By the Master Theorem: a_T = 3, b_T = 2, d_T = 1, so a_T > b_T^(d_T) and T(n) = Θ(n^(log2 3)) ≈ Θ(n^1.585). a_S = 4, b_S = 6, d_S = 1, so a_S < b_S^(d_S) and S(n) = Θ(n^(d_S)) = Θ(n). We conclude that S(n) = O(T(n)).
(c) Suppose T(n) = T(n/3) + O(√n). Use the Master Theorem to solve this recurrence.
Solution: a = 1, b = 3, d = 1/2, so a < b^d and T(n) = O(n^d) = O(√n).
(d) Suppose T(n) = 9T(n/3) + O(n^2). Use the Master Theorem to solve this recurrence.
Solution: a = 9, b = 3, d = 2, so a = b^d and T(n) = O(n^d log n) = O(n^2 log n).
(e) Suppose T(n) = 2T(√n) + O(log2(n)). Solve this recurrence. (Hint: substitute n = 2^m and use the Master Theorem.)
Solution: Substitute n = 2^m, so the recursion becomes T(2^m) = 2T(2^(m/2)) + O(m). Let S(m) = T(2^m). Then S(m) = 2S(m/2) + O(m), and by the Master Theorem with a = 2, b = 2, d = 1, we have a = b^d, so S(m) = O(m log m) = T(2^m). Substituting back m = log2(n) gives T(n) = O(log2(n) · log2(log2(n))).
Writing algorithms:
2. Suppose that you and your friends are going to hike the John Muir Trail this summer. You want to hike as much as possible per day, but you do not want to hike after dark. On a map there is a large set of good stopping points for camping, and you design your trip based on the following system: every time you come to a potential stopping point, determine whether you can make it to the next one before nightfall. If you can make it, keep hiking; otherwise, stop and camp. You claim that with this strategy you will minimize the number of camping stops you must make.
(a) Suppose the stopping points are located at distances x1, …, xn from the trailhead. Also assume that your group can hike d distance per day (independent of terrain, weather conditions, and so forth). Determine:
Instance: (x1, …, xn), the locations of the stopping points starting from the trailhead, and d, the maximum distance you can travel in a day.
Solution Format: [p1, …, pk], the list of locations you will camp at.
Constraints: pj+1 − pj ≤ d for each j.
Objective: Minimize k, the number of stops.
(b) Prove this strategy is optimal.
Exchange Argument: For some input I = (x1, …, xn; d), let g be the first greedy location. Let OS = [c1, …, ck] be some solution to I that does not include stopping at location g, so the first stop of OS is at some point c1 ≠ g. By the nature of the greedy choice, you cannot stop any farther than g on your first day, so c1 ≤ g. Define OS’ from OS by replacing c1 with g: OS’ = [g, c2, …, ck].
OS’ is valid: We must show that we can always travel between two consecutive stopping points of OS’ within one day. OS’ replaces c1 with g, so we must show that we can get from g to c2. By the validity of OS, we know that c2 − c1 ≤ d, and since c1 ≤ g, we have c2 − g ≤ c2 − c1 ≤ d. Also g ≤ d by the definition of the greedy choice, so the first day is feasible. OS’ has the same number of stops as OS, so it is at least as good.
Induction part:
Claim: For any input of any size n ≥ 1, the greedy solution is optimal.
Base Case: For n = 1, stop at the only stopping point, which is the end of the trail. This is optimal.
Inductive Hypothesis: Suppose that for some n > 1, the greedy strategy is optimal for all inputs of size k such that 1 ≤ k < n.
Inductive Step: Let g be the first greedy stop and let I’ be the remaining trail after g. Then GS(I) = g + GS(I’). Given any solution OS, the exchange argument yields OS’ = g + (some solution on I’) with |OS’| = |OS|. By the inductive hypothesis, GS(I’) has at most as many stops as any solution on I’, so |GS(I)| ≤ |OS’| = |OS|.
(c) Implement and determine the runtime. Scan through the locations until one of them exceeds distance d from the current starting point; camp at the last stopping point within distance d, then continue from that stopping point in the same manner. (Note that this implementation assumes that there is in fact a solution.)
procedure Camps(x1, …, xn; d)
  output = [ ]
  i = 1
  start = 0
  while i ≤ n:
    while i ≤ n and xi − start ≤ d: i = i + 1
    append x(i−1) to output
    start = x(i−1)
  return output
Runtime: O(n), since the index i only moves forward through the locations.
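A Python sketch of the greedy scan; it assumes the stopping points are sorted and that a valid trip exists (the first point is within d, and consecutive gaps never exceed d):

```python
def camps(xs, d):
    """Greedy camping stops.
    xs: sorted distances of stopping points from the trailhead.
    d: maximum distance hikeable per day.
    Returns the list of locations to camp at."""
    output = []
    start = 0   # where today's hike begins
    i = 0
    n = len(xs)
    while i < n:
        # advance past every point still reachable today
        while i < n and xs[i] - start <= d:
            i += 1
        output.append(xs[i - 1])   # camp at the last reachable point
        start = xs[i - 1]
    return output
```

Each location is examined a constant number of times, so the runtime is O(n).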
6. Consider the following algorithm, called 5-Sort, on an input list of n integers:
1. If the list has 4 or fewer elements, just bubble sort the list.
2. Otherwise, split the list into 5 equal parts (if 5 does not divide the size of the input, the last part may have a different size than the others), then:
- Recursively sort the first 3/5 of the parts.
- Recursively sort the last 3/5 of the parts.
- Recursively sort the first 3/5 of the parts again.
(a) Give a proof of correctness for this algorithm or show that it is not correct with a counterexample.
Solution: This algorithm is not correct. Consider the input (5, 4, 3, 2, 1). The algorithm performs the following operations:
(5, 4, 3, 2, 1) → ([sort(5, 4, 3)], 2, 1) → (3, 4, 5, 2, 1) → (3, 4, [sort(5, 2, 1)]) → (3, 4, 1, 2, 5) → ([sort(3, 4, 1)], 2, 5) → (1, 3, 4, 2, 5), which is not sorted.
(b) Write a recurrence relation for the runtime of this algorithm and solve it using the Master Theorem (assume that splitting the list takes O(n) time).
Solution: The algorithm makes 3 recursive calls, each on a list of length 3n/5, and the non-recursive part takes O(n) time, so T(n) = 3T(3n/5) + O(n). With a = 3, b = 5/3, d = 1, we have a > b^d (3 > (5/3)^1), so by the Master Theorem T(n) = O(n^(log_(5/3) 3)) ≈ O(n^2.15).
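A sketch of 5-Sort in Python confirms the counterexample. Taking "the first 3/5 of the parts" to mean the first ⌈3n/5⌉ elements is an assumption, and Python's built-in `sorted` stands in for bubble sort in the base case:

```python
def five_sort(a):
    """The (incorrect) 5-Sort from the problem: three recursive
    passes over overlapping 3/5 slices of the list."""
    if len(a) <= 4:
        return sorted(a)            # stands in for bubble sort
    m = (3 * len(a) + 4) // 5       # ceil(3n/5): size of "3/5 of the parts"
    a = five_sort(a[:m]) + a[m:]    # sort the first 3/5
    a = a[:-m] + five_sort(a[-m:])  # sort the last 3/5
    a = five_sort(a[:m]) + a[m:]    # sort the first 3/5 again
    return a
```

Running it on the counterexample reproduces the unsorted result (1, 3, 4, 2, 5) traced above.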