CPS 230

DESIGN AND ANALYSIS OF ALGORITHMS

Fall Semester of 2008

Instructor: Herbert Edelsbrunner
Teaching Assistant: Zhiqiang Gu

Table of Contents

1 Introduction

I DESIGN TECHNIQUES
  2 Divide-and-Conquer
  3 Prune-and-Search
  4 Dynamic Programming
  5 Greedy Algorithms
  First Homework Assignment

II SEARCHING
  6 Binary Search Trees
  7 Red-Black Trees
  8 Amortized Analysis
  9 Splay Trees
  Second Homework Assignment

III PRIORITIZING
  10 Heaps and Heapsort
  11 Fibonacci Heaps
  12 Solving Recurrence Relations
  Third Homework Assignment

IV GRAPH ALGORITHMS
  13 Graph Search
  14 Shortest Paths
  15 Minimum Spanning Trees
  16 Union-Find
  Fourth Homework Assignment

V TOPOLOGICAL ALGORITHMS
  17 Geometric Graphs
  18 Surfaces
  19 Homology
  Fifth Homework Assignment

VI GEOMETRIC ALGORITHMS
  20 Plane-Sweep
  21 Delaunay Triangulations
  22 Alpha Shapes
  Sixth Homework Assignment

VII NP-COMPLETENESS
  23 Easy and Hard Problems
  24 NP-Complete Problems
  25 Approximation Algorithms
  Seventh Homework Assignment

1 Introduction
Meetings. We meet twice a week, on Tuesdays and Thursdays, from 1:15 to 2:30pm, in room D106 LSRC.

Communication. The course material will be delivered in the two weekly lectures. A written record of the lectures will be available on the web, usually a day after the lecture. The web also contains other information, such as homework assignments, solutions, useful links, etc. The main supporting text is

  TARJAN. Data Structures and Network Algorithms. SIAM, 1983.

The book focuses on fundamental data structures and graph algorithms; additional topics covered in the course can be found in the lecture notes or in other texts on algorithms, such as

  KLEINBERG AND TARDOS. Algorithm Design. Pearson Education, 2006.

Overview. The main topics to be covered in this course are

  I    Design Techniques;
  II   Searching;
  III  Prioritizing;
  IV   Graph Algorithms;
  V    Topological Algorithms;
  VI   Geometric Algorithms;
  VII  NP-completeness.

The emphasis will be on algorithm design and on algorithm analysis. For the analysis, we frequently need basic mathematical tools. Think of analysis as the measurement of the quality of your design. Just like you use your sense of taste to check your cooking, you should get into the habit of using algorithm analysis to justify design decisions when you write an algorithm or a computer program. This is a necessary step to reach the next level in mastering the art of programming. I encourage you to implement new algorithms and to compare the experimental performance of your program with the theoretical prediction gained through analysis.

Examinations. There will be a final exam (covering the material of the entire semester) and a midterm (at the beginning of October). You may want to freshen up your math skills before going into this course. The weighting of exams and homework used to determine your grades is

  homework 35%, midterm 25%, final 40%.

Homework. We have seven homeworks scheduled throughout this semester, one per main topic covered in the course. The solutions to each homework are due one and a half weeks after the assignment. More precisely, they are due at the beginning of the third lecture after the assignment. The seventh homework may help you prepare for the final exam, and its solutions will not be collected.

Rule 1. The solution to any one homework question must fit on a single page (together with the statement of the problem).

Rule 2. The discussion of questions and solutions before the due date is not discouraged, but you must formulate your own solution.

Rule 3. The deadline for turning in solutions is 10 minutes after the beginning of the lecture on the due date.


I DESIGN TECHNIQUES

2 Divide-and-Conquer
3 Prune-and-Search
4 Dynamic Programming
5 Greedy Algorithms
First Homework Assignment

2 Divide-and-Conquer

We use quicksort as an example of an algorithm that follows the divide-and-conquer paradigm. It has the reputation of being the fastest comparison-based sorting algorithm. Indeed it is very fast on the average but can be slow for some inputs, unless precautions are taken.

The algorithm. Quicksort follows the general paradigm of divide-and-conquer, which means it divides the unsorted array into two, it recurses on the two pieces, and it finally combines the two sorted pieces to obtain the sorted array. An interesting feature of quicksort is that the divide step separates small from large items. As a consequence, combining the sorted pieces happens automatically without doing anything extra.

  void QUICKSORT(int ℓ, r)
    if ℓ < r then
      m = SPLIT(ℓ, r);
      QUICKSORT(ℓ, m − 1);
      QUICKSORT(m + 1, r)
    endif.

We assume the items are stored in A[0..n − 1]. The array is sorted by calling QUICKSORT(0, n − 1).

Splitting. The performance of quicksort depends heavily on the performance of the split operation. The effect of splitting from ℓ to r is:

  • x = A[ℓ] is moved to its correct location at A[m];
  • no item in A[ℓ..m − 1] is larger than x;
  • no item in A[m + 1..r] is smaller than x.

Figure 1 illustrates the process with an example. The nine items are split by moving a pointer i from left to right and another pointer j from right to left. The process stops when i and j cross. To get splitting right is a bit delicate, in particular in special cases. Make sure the algorithm is correct for (i) x is the smallest item, (ii) x is the largest item, (iii) all items are the same.

  int SPLIT(int ℓ, r)
    x = A[ℓ]; i = ℓ; j = r + 1;
    repeat
      repeat i++ until x ≤ A[i];
      repeat j-- until x ≥ A[j];
      if i < j then SWAP(i, j) endif
    until i ≥ j;
    SWAP(ℓ, j); return j.

Figure 1: First, i and j stop at items 9 and 1, which are then swapped. Second, i and j cross and the pivot, 7, is swapped with item 2.

Special cases (i) and (iii) are ok, but case (ii) requires a stopper at A[r + 1]. This stopper must be an item at least as large as x. If r < n − 1 this stopper is automatically given. For r = n − 1, we create such a stopper by setting A[n] = +∞.

Running time. The actions taken by quicksort can be expressed using a binary tree: each (internal) node represents a call and displays the length of the subarray; see Figure 2. The worst case occurs when A is already sorted.

Figure 2: The total amount of time is proportional to the sum of lengths, which are the numbers of nodes in the corresponding subtrees. In the displayed case this sum is 29.
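The pseudocode translates almost directly into C. The following is one possible rendering, not anything prescribed by the notes: the global array A, the swap helper, and the explicit sentinel slot A[N] are choices made for this sketch.

  #include <stdio.h>
  #include <limits.h>

  #define N 9
  int A[N + 1];                  /* A[N] is reserved for the stopper */

  void swap(int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

  int split(int l, int r) {      /* pivot x = A[l]; returns its final index */
      int x = A[l], i = l, j = r + 1;
      for (;;) {
          do { i++; } while (A[i] < x);   /* repeat i++ until x <= A[i] */
          do { j--; } while (A[j] > x);   /* repeat j-- until x >= A[j] */
          if (i < j) swap(i, j); else break;
      }
      swap(l, j);
      return j;
  }

  void quicksort(int l, int r) {
      if (l < r) {
          int m = split(l, r);
          quicksort(l, m - 1);
          quicksort(m + 1, r);
      }
  }

  int main(void) {
      int data[N] = {7, 3, 5, 4, 2, 9, 4, 2, 1};  /* the items of Figure 1 */
      for (int i = 0; i < N; i++) A[i] = data[i];
      A[N] = INT_MAX;                  /* stopper for special case (ii) */
      quicksort(0, N - 1);
      for (int i = 0; i < N; i++) printf("%d ", A[i]);
      printf("\n");
      return 0;
  }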

In this case the tree degenerates to a list without branching. The sum of lengths can be described by the following recurrence relation:

  T(n) = n + T(n − 1) = Σ_{i=1}^{n} i = (n + 1)n/2.

The running time in the worst case is therefore in O(n²). In the best case the tree is completely balanced and the sum of lengths is described by the recurrence relation

  T(n) = n + 2 · T((n − 1)/2).

If we assume n = 2^k − 1 we can rewrite the relation as

  U(k) = (2^k − 1) + 2 · U(k − 1)
       = (2^k − 1) + 2(2^{k−1} − 1) + ... + 2^{k−1}(2 − 1)
       = k · 2^k − Σ_{i=0}^{k−1} 2^i
       = k · 2^k − (2^k − 1)
       = (n + 1) · log2(n + 1) − n.

The running time in the best case is therefore in O(n log n).

Randomization. One of the drawbacks of quicksort, as described so far, is that it is slow on rather common almost sorted sequences. The reason is pivots that tend to create unbalanced splittings, and such pivots occur in practice more often than one might expect. Human and often also machine generated data is frequently biased towards certain distributions (in this case, permutations), and it has been said that 80% of the time or more, sorting is done on either already sorted or almost sorted files. Such situations can often be helped by transferring the algorithm's dependence on the input data to internally made random choices. In this particular case, we use randomization to make the choice of the pivot independent of the input data. Assume RANDOM(ℓ, r) returns an integer p ∈ [ℓ, r] with uniform probability:

  Prob[RANDOM(ℓ, r) = p] = 1/(r − ℓ + 1)

for each ℓ ≤ p ≤ r. In other words, each p ∈ [ℓ, r] is equally likely. The following algorithm splits the array with a random pivot:

  int RSPLIT(int ℓ, r)
    p = RANDOM(ℓ, r); SWAP(ℓ, p);
    return SPLIT(ℓ, r).

We get a randomized implementation by substituting RSPLIT for SPLIT. The behavior of this version of quicksort depends on p, which is produced by a random number generator.
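In C, the random choice of the pivot can be layered on top of the split function from the sketch above; rand() stands in for RANDOM here, an implementation choice whose slight modulo bias is irrelevant for this purpose.

  #include <stdlib.h>

  extern int A[];            /* the array from the quicksort sketch */
  int split(int l, int r);   /* deterministic split from the quicksort sketch */

  int rsplit(int l, int r) {                /* split around a random pivot */
      int p = l + rand() % (r - l + 1);     /* random index in [l, r] */
      int t = A[l]; A[l] = A[p]; A[p] = t;  /* move the pivot to the front */
      return split(l, r);
  }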

Average analysis. We assume that the items in A[0..n − 1] are pairwise different. The pivot splits A into A[0..m − 1], A[m], and A[m + 1..n − 1]. By assumption on function RSPLIT, the probability for each m ∈ [0, n − 1] is 1/n. Therefore the average sum of array lengths split by QUICKSORT is

  T(n) = n + (1/n) · Σ_{m=0}^{n−1} (T(m) + T(n − m − 1)).

To simplify, we multiply with n and obtain a second relation by substituting n − 1 for n:

  n · T(n) = n² + 2 · Σ_{i=0}^{n−1} T(i),                  (1)
  (n − 1) · T(n − 1) = (n − 1)² + 2 · Σ_{i=0}^{n−2} T(i).  (2)

Next we subtract (2) from (1), we divide by n(n + 1), we use repeated substitution to express T(n) as a sum, and finally split the sum in two:

  T(n)/(n + 1) = T(n − 1)/n + (2n − 1)/(n(n + 1))
               = T(n − 2)/(n − 1) + (2n − 3)/((n − 1)n) + (2n − 1)/(n(n + 1))
               = Σ_{i=1}^{n} (2i − 1)/(i(i + 1))
               = 2 · Σ_{i=1}^{n} 1/(i + 1) − Σ_{i=1}^{n} 1/(i(i + 1)).

Bounding the sums. The second sum is solved directly by transformation to a telescoping series:

  Σ_{i=1}^{n} 1/(i(i + 1)) = Σ_{i=1}^{n} (1/i − 1/(i + 1)) = 1 − 1/(n + 1).

The first sum is bounded from above by the integral of 1/x for x ranging from 1 to n + 1; see Figure 3. The sum of the 1/(i + 1) is the sum of areas of the shaded rectangles, and because all rectangles lie below the graph of 1/x we get a bound for the total rectangle area:

  Σ_{i=1}^{n} 1/(i + 1) < ∫_{1}^{n+1} dx/x = ln(n + 1).

Figure 3: The areas of the rectangles are the terms in the sum, and the total rectangle area is bounded by the integral from 1 through n + 1.

We plug this bound back into the expression for the average running time:

  T(n) < (n + 1) · Σ_{i=1}^{n} 2/(i + 1)
       < 2 · (n + 1) · ln(n + 1)
       = (2/log2 e) · (n + 1) · log2(n + 1).

In words, the running time of quicksort in the average case is only a factor of about 2/log2 e = 1.386... slower than in the best case. This also implies that the worst case cannot happen very often, for else the average performance would be slower.

Stack size. Another drawback of quicksort is the recursion stack, which can reach a size of Ω(n) entries. This can be improved by always sorting the smaller side first and simultaneously removing the tail-recursion:

  void QUICKSORT(int ℓ, r)
    i = ℓ; j = r;
    while i < j do
      m = RSPLIT(i, j);
      if m − i < j − m
        then QUICKSORT(i, m − 1); i = m + 1
        else QUICKSORT(m + 1, j); j = m − 1
      endif
    endwhile.

In each recursive call to QUICKSORT, the length of the array is at most half the length of the array in the preceding call. This implies that at any moment in time the stack contains no more than 1 + log2 n entries. Note that without removal of the tail-recursion, the stack can reach Ω(n) entries even if the smaller side is sorted first.
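A C version of this tail-recursion removal might look as follows, reusing rsplit from the sketch above; the function name is this sketch's own.

  int rsplit(int l, int r);   /* randomized split from the sketch above */

  /* Sort A[l..r], recursing only on the smaller side, so the explicit
     recursion depth stays within about log2 of the array length. */
  void quicksort2(int l, int r) {
      while (l < r) {
          int m = rsplit(l, r);
          if (m - l < r - m) {            /* left side is smaller */
              quicksort2(l, m - 1);
              l = m + 1;                  /* continue with the right side */
          } else {
              quicksort2(m + 1, r);
              r = m - 1;                  /* continue with the left side */
          }
      }
  }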

Summary. Quicksort incorporates two design techniques to efficiently sort n numbers: divide-and-conquer for reducing large to small problems and randomization for avoiding the sensitivity to worst-case inputs. The average running time of quicksort is in O(n log n) and the extra amount of memory it requires is in O(log n). For the deterministic version, the average is over all n! permutations of the input items. For the randomized version the average is the expected running time for every input sequence.


3 Prune-and-Search
We use two algorithms for selection as examples for the prune-and-search paradigm. The problem is to find the i-smallest item in an unsorted collection of n items. We could first sort the list and then return the item in the i-th position, but just finding the i-th item can be done faster than sorting the entire list. As a warm-up exercise, consider selecting the 1-st or smallest item in the unsorted array A[1..n].

  min = 1;
  for j = 2 to n do
    if A[j] < A[min] then min = j endif
  endfor.

The index of the smallest item is found in n − 1 comparisons, which is optimal. Indeed, there is an adversary argument, that is, with fewer than n − 1 comparisons we can change the minimum without changing the outcomes of the comparisons.

Randomized selection. We return to finding the i-smallest item for a fixed but arbitrary integer 1 ≤ i ≤ n, which we call the rank of that item. We can use the splitting function of quicksort also for selection. As in quicksort, we choose a random pivot and split the array, but we recurse only for one of the two sides. We invoke the function with the range of indices of the current subarray and the rank of the desired item, i. Initially, the range consists of all indices between ℓ = 1 and r = n, limits included.

  int RSELECT(int ℓ, r, i)
    q = RSPLIT(ℓ, r); m = q − ℓ + 1;
    if i < m then return RSELECT(ℓ, q − 1, i)
      elseif i = m then return q
      else return RSELECT(q + 1, r, i − m)
    endif.

For small sets, the algorithm is relatively inefficient, and its running time can be improved by switching over to sorting when the size drops below some constant threshold. On the other hand, each recursive step makes some progress so that termination is guaranteed even without special treatment of small sets.

Expected running time. For each 1 ≤ m ≤ n, the probability that the array is split into subarrays of sizes m − 1 and n − m is 1/n. For convenience we assume that n is even. The expected running time increases with the number of items, T(k) ≤ T(m) if k ≤ m. Hence,

  T(n) ≤ n + (1/n) · Σ_{m=1}^{n} max{T(m − 1), T(n − m)}
       ≤ n + (2/n) · Σ_{m=n/2+1}^{n} T(m − 1).

Assume inductively that T(m) ≤ cm for m < n and a sufficiently large positive constant c. Such a constant c can certainly be found for m = 1, since for that case the running time of the algorithm is only a constant. This establishes the basis of the induction. The case of n items reduces to cases of m < n items for which we can use the induction hypothesis. We thus get

  T(n) ≤ n + (2c/n) · Σ_{m=n/2+1}^{n} (m − 1)
       = n + c · (n − 1) − (c/n) · (n/2) · (n/2 + 1)
       = n + (3c/4) · n − (3c/2).

Assuming c ≥ 4 we thus have T(n) ≤ cn, as required. Note that we just proved that the expected running time of RSELECT is only a small constant times that of RSPLIT. More precisely, that constant factor is no larger than four.
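A C sketch of RSELECT, building on the rsplit function from the quicksort sketch in Section 2; as there, it assumes a stopper at least as large as any item sits just beyond the initial range.

  int rsplit(int l, int r);   /* randomized split from Section 2's sketch */

  /* Return the index of the i-smallest item of A[l..r], 1 <= i <= r - l + 1. */
  int rselect(int l, int r, int i) {
      if (l == r) return l;              /* a single item remains */
      int q = rsplit(l, r);
      int m = q - l + 1;                 /* rank of the pivot within A[l..r] */
      if (i < m) return rselect(l, q - 1, i);
      if (i == m) return q;
      return rselect(q + 1, r, i - m);
  }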

Deterministic selection. The randomized selection algorithm takes time proportional to n² in the worst case, for example if each split is as unbalanced as possible. It is however possible to select in O(n) time even in the worst case. The median of the set plays a special role in this algorithm. It is defined as the i-smallest item where i = (n + 1)/2 if n is odd and i = n/2 or (n + 2)/2 if n is even. The deterministic algorithm takes five steps to select:

  Step 1. Partition the n items into ⌈n/5⌉ groups of size at most 5 each.
  Step 2. Find the median in each group.
  Step 3. Find the median of the medians recursively.
  Step 4. Split the array using the median of the medians as the pivot.
  Step 5. Recurse on one side of the pivot.

It is convenient to define k = ⌈n/5⌉ and to partition such that each group consists of items that are multiples of k positions apart. This is what is shown in Figure 4, provided we arrange the items row by row in the array.


Figure 4: The 43 items are partitioned into seven groups of 5 and two groups of 4, all drawn vertically. The shaded items are the medians and the dark shaded item is the median of medians.

Implementation with insertion sort. We use insertion sort on each group to determine the medians. Specifically, we sort the items in positions ℓ, ℓ + k, ℓ + 2k, ℓ + 3k, ℓ + 4k of array A, for each ℓ.

  void ISORT(int ℓ, k, n)
    j = ℓ + k;
    while j ≤ n do
      i = j;
      while i > ℓ and A[i] > A[i − k] do
        SWAP(i, i − k); i = i − k
      endwhile;
      j = j + k
    endwhile.

Although insertion sort takes quadratic time in the worst case, it is very fast for small arrays, as in this application. We can now combine the various pieces and write the selection algorithm in pseudo-code. Starting with the code for the randomized algorithm, we first remove the randomization and second add code for Steps 1, 2, and 3. Recall that i is the rank of the desired item in A[ℓ..r]. After sorting the groups, we have their medians arranged in the middle fifth of the array, A[ℓ + 2k..ℓ + 3k − 1], and we compute the median of the medians by recursive application of the function.

  int SELECT(int ℓ, r, i)
    k = ⌈(r − ℓ + 1)/5⌉;
    for j = 0 to k − 1 do ISORT(ℓ + j, k, r) endfor;
    m′ = SELECT(ℓ + 2k, ℓ + 3k − 1, ⌊(k + 1)/2⌋);
    SWAP(ℓ, m′); q = SPLIT(ℓ, r); m = q − ℓ + 1;
    if i < m then return SELECT(ℓ, q − 1, i)
      elseif i = m then return q
      else return SELECT(q + 1, r, i − m)
    endif.

Observe that the algorithm makes progress as long as there are at least three items in the set, but we need special treatment of the cases of one or of two items. The role of the median of medians is to prevent an unbalanced split of the array, so we can safely use the deterministic version of splitting.

Worst-case running time. To simplify the analysis, we assume that n is a multiple of 5 and ignore ceiling and floor functions. We begin by arguing that the number of items less than or equal to the median of medians is at least 3n/10. These are the first three items in the sets with medians less than or equal to the median of medians. In Figure 4, these items are highlighted by the box to the left and above but containing the median of medians. Symmetrically, the number of items greater than or equal to the median of medians is at least 3n/10. The first recursion works on a set of n/5 medians, and the second recursion works on a set of at most 7n/10 items. We have

  T(n) ≤ n + T(n/5) + T(7n/10).

We prove T(n) = O(n) by induction, assuming T(m) ≤ c · m for m < n and c a large enough constant:

  T(n) ≤ n + (c/5) · n + (7c/10) · n
       = (1 + 9c/10) · n.

Assuming c ≥ 10 we have T(n) ≤ cn, as required. Again the running time is at most some constant times that of splitting the array. The constant is about two and a half times the one for the randomized selection algorithm.

A somewhat subtle issue is the presence of equal items in the input collection. Such occurrences make the function SPLIT unpredictable since they could occur on either side of the pivot. An easy way out of the dilemma is to make sure that the items that are equal to the pivot are treated as if they were smaller than the pivot if they occur in the first half of the array and they are treated as if they were larger than the pivot if they occur in the second half of the array.

Summary. The idea of prune-and-search is very similar to divide-and-conquer, which is perhaps the reason why some textbooks make no distinction between the two. The characteristic feature of prune-and-search is that the recursion covers only a constant fraction of the input set. As we have seen in the analysis, this difference implies a better running time. It is interesting to compare the randomized with the deterministic version of selection:


• the use of randomization leads to a simpler algorithm but it requires a source of randomness;

• upon repeating the algorithm for the same data, the deterministic version goes through the exact same steps while the randomized version does not;

• we analyze the worst-case running time of the deterministic version and the expected running time (for the worst-case input) of the randomized version.

All three differences are fairly universal and apply to other algorithms for which we have the choice between a deterministic and a randomized implementation.


4 Dynamic Programming
Sometimes, divide-and-conquer leads to overlapping subproblems and thus to redundant computations. It is not uncommon that the redundancies accumulate and cause an exponential amount of wasted time. We can avoid the waste using a simple idea: solve each subproblem only once. To be able to do that, we have to add a certain amount of book-keeping to remember subproblems we have already solved. The technical name for this design paradigm is dynamic programming.

Edit distance. We illustrate dynamic programming using the edit distance problem, which is motivated by questions in genetics. We assume a finite set of characters or letters, Σ, which we refer to as the alphabet, and we consider strings or words formed by concatenating finitely many characters from the alphabet. The edit distance between two words is the minimum number of letter insertions, letter deletions, and letter substitutions required to transform one word into the other. For example, the edit distance between FOOD and MONEY is at most four:

  FOOD → MOOD → MOND → MONED → MONEY

A better way to display the editing process is the gap representation that places the words one above the other, with a gap in the first word for every insertion and a gap in the second word for every deletion (gaps are written as underscores here):

  F O O _ D
  M O N E Y

Columns with two different characters correspond to substitutions. The number of editing steps is therefore the number of columns that do not contain the same character twice.

Prefix property. It is not difficult to see that you cannot get from FOOD to MONEY in less than four steps. However, for longer examples it seems considerably more difficult to find the minimum number of steps or to recognize an optimal edit sequence. Consider for example

  A L G O R I _ T H M
  A L T R U I S T I C

Is this optimal or, equivalently, is the edit distance between ALGORITHM and ALTRUISTIC six? Instead of answering this specific question, we develop a dynamic programming algorithm that computes the edit distance between an m-character string A[1..m] and an n-character string B[1..n]. Let E(i, j) be the edit distance between the prefixes of length i and j, that is, between A[1..i] and B[1..j]. The edit distance between the complete strings is therefore E(m, n). A crucial step towards the development of this algorithm is the following observation about the gap representation of an optimal edit sequence.

PREFIX PROPERTY. If we remove the last column of an optimal edit sequence then the remaining columns represent an optimal edit sequence for the remaining substrings.

We can easily prove this claim by contradiction: if the substrings had a shorter edit sequence, we could just glue the last column back on and get a shorter edit sequence for the original strings.

Recursive formulation. We use the Prefix Property to develop a recurrence relation for E. The dynamic programming algorithm will be a straightforward implementation of that relation. There are a couple of obvious base cases:

  • Erasing: we need i deletions to erase an i-character string, E(i, 0) = i.
  • Creating: we need j insertions to create a j-character string, E(0, j) = j.

In general, there are four possibilities for the last column in an optimal edit sequence:

  • Insertion: the last entry in the top row is empty, E(i, j) = E(i, j − 1) + 1.
  • Deletion: the last entry in the bottom row is empty, E(i, j) = E(i − 1, j) + 1.
  • Substitution: both rows have characters in the last column that are different, E(i, j) = E(i − 1, j − 1) + 1.
  • No action: both rows end in the same character, E(i, j) = E(i − 1, j − 1).

Let P be the logical proposition A[i] = B[j] and denote by |P| its indicator variable: |P| = 1 if P is true and |P| = 0 if P is false. We can now summarize and, for i, j > 0, get the edit distance as the smallest of the possibilities:

  E(i, j) = min{ E(i, j − 1) + 1, E(i − 1, j) + 1, E(i − 1, j − 1) + |P| }.


The algorithm. If we turned this recurrence relation directly into a divide-and-conquer algorithm, we would have the following recurrence for the running time:

  T(m, n) = T(m, n − 1) + T(m − 1, n) + T(m − 1, n − 1) + 1.

The solution to this recurrence is exponential in m and n, which is clearly not the way to go. Instead, let us build an (m + 1) × (n + 1) table of possible values of E(i, j). We can start by filling in the base cases, the entries in the 0-th row and column. To fill in any other entry, we need to know the values directly to the left, directly above, and both to the left and above. If we fill the table from top to bottom and from left to right then whenever we reach an entry, the entries it depends on are already available.

  int EDITDISTANCE(int m, n)
    for i = 0 to m do E[i, 0] = i endfor;
    for j = 1 to n do E[0, j] = j endfor;
    for i = 1 to m do
      for j = 1 to n do
        E[i, j] = min{E[i, j − 1] + 1, E[i − 1, j] + 1,
                      E[i − 1, j − 1] + |A[i] = B[j]|}
      endfor
    endfor;
    return E[m, n].

Since there are (m + 1)(n + 1) entries in the table and each takes a constant time to compute, the total running time is in O(mn).

An example. The table constructed for the conversion of ALGORITHM to ALTRUISTIC is shown in Figure 5. Boxed numbers indicate places where the two strings have equal characters. The arrows indicate the predecessors that define the entries. Each direction of arrow corresponds to a different edit operation: horizontal for insertion, vertical for deletion, and diagonal for substitution. Dotted diagonal arrows indicate free substitutions of a letter for itself.
        A  L  T  R  U  I  S  T  I  C
     0  1  2  3  4  5  6  7  8  9 10
  A  1  0  1  2  3  4  5  6  7  8  9
  L  2  1  0  1  2  3  4  5  6  7  8
  G  3  2  1  1  2  3  4  5  6  7  8
  O  4  3  2  2  2  3  4  5  6  7  8
  R  5  4  3  3  2  3  4  5  6  7  8
  I  6  5  4  4  3  3  3  4  5  6  7
  T  7  6  5  4  4  4  4  4  4  5  6
  H  8  7  6  5  5  5  5  5  5  5  6
  M  9  8  7  6  6  6  6  6  6  6  6

Figure 5: The table of edit distances between all prefixes of ALGORITHM and of ALTRUISTIC. The shaded area highlights the optimal edit sequences, which are paths from the upper left to the lower right corner.
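For concreteness, here is a compact C rendering of the tabulation; the function name and the C99 variable-length array are choices of this sketch.

  #include <stdio.h>
  #include <string.h>

  static int min3(int a, int b, int c) {
      int m = a < b ? a : b;
      return m < c ? m : c;
  }

  int edit_distance(const char *a, const char *b) {
      int m = strlen(a), n = strlen(b);
      int E[m + 1][n + 1];                        /* (m+1) x (n+1) table */
      for (int i = 0; i <= m; i++) E[i][0] = i;   /* erasing */
      for (int j = 0; j <= n; j++) E[0][j] = j;   /* creating */
      for (int i = 1; i <= m; i++)
          for (int j = 1; j <= n; j++)
              E[i][j] = min3(E[i][j - 1] + 1,                  /* insertion */
                             E[i - 1][j] + 1,                  /* deletion */
                             E[i - 1][j - 1] + (a[i - 1] != b[j - 1]));  /* |P| */
      return E[m][n];
  }

  int main(void) {
      printf("%d\n", edit_distance("ALGORITHM", "ALTRUISTIC"));  /* prints 6 */
      return 0;
  }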

Recovering the edit sequence. By construction, there is at least one path from the upper left to the lower right corner, but often there will be several. Each such path describes an optimal edit sequence. For the example at hand, we have three optimal edit sequences:

  A L G O R I _ T H M
  A L T R U I S T I C

  A L G O R _ I _ T H M
  A L _ T R U I S T I C

  A L G O R _ I _ T H M
  A L T _ R U I S T I C

They are easily recovered by tracing the paths backward, from the end to the beginning. The following algorithm recovers an optimal solution that also minimizes the number of insertions and deletions. We call it with the lengths of the strings as arguments, R(m, n).

  void R(int i, j)
    if i > 0 or j > 0 then
      switch incoming arrow:
        case ↘: R(i − 1, j − 1); print(A[i], B[j])
        case ↓: R(i − 1, j); print(A[i], _)
        case →: R(i, j − 1); print(_, B[j])
      endswitch
    endif.

Summary. The structure of dynamic programming is again similar to divide-and-conquer, except that the subproblems to be solved overlap. As a consequence, we get different recursive paths to the same subproblems. To develop a dynamic programming algorithm that avoids redundant solutions, we generally proceed in two steps:


1. We formulate the problem recursively. In other words, we write down the answer to the whole problem as a combination of the answers to smaller subproblems.

2. We build solutions from bottom up. Starting with the base cases, we work our way up to the final solution and (usually) store intermediate solutions in a table.

For dynamic programming to be effective, we need a structure that leads to at most some polynomial number of different subproblems. Most commonly, we deal with sequences, which have linearly many prefixes and suffixes and quadratically many contiguous substrings.


5 Greedy Algorithms
The philosophy of being greedy is shortsightedness. Always go for the seemingly best next thing, always optimize the present, without any regard for the future, and never change your mind about the past. The greedy paradigm is typically applied to optimization problems. In this section, we first consider a scheduling problem and second the construction of optimal codes.

A scheduling problem. Consider a set of activities 1, 2, ..., n. Activity i starts at time s_i and finishes at time f_i > s_i. Two activities i and j overlap if [s_i, f_i] ∩ [s_j, f_j] ≠ ∅. The objective is to select a maximum number of pairwise non-overlapping activities. An example is shown in Figure 6.

Figure 6: A best schedule is c, e, f, but there are also others of the same size.

The largest number of activities can be scheduled by choosing activities with early finish times first. We first sort and reindex such that i < j implies f_i ≤ f_j.

  S = {1}; last = 1;
  for i = 2 to n do
    if f_last < s_i then
      S = S ∪ {i}; last = i
    endif
  endfor.

The running time is O(n log n) for sorting plus O(n) for the greedy collection of activities. It is often difficult to determine how close to the optimum the solutions found by a greedy algorithm really are. However, for the above scheduling problem the greedy algorithm always finds an optimum. For the proof let 1 = i_1 < i_2 < ... < i_k be the greedy schedule constructed by the algorithm. Let j_1 < j_2 < ... < j_ℓ be any other feasible schedule. Since i_1 = 1 has the earliest finish time of any activity, we have f_{i_1} ≤ f_{j_1}. We can therefore add i_1 to the feasible schedule and remove at most one activity, namely j_1. Among the activities that do not overlap i_1, i_2 has the earliest finish time, hence f_{i_2} ≤ f_{j_2}. We can again add i_2 to the feasible schedule and remove at most one activity, namely j_2 (or possibly j_1 if it was not removed before). Eventually, we replace the entire feasible schedule by the greedy schedule without decreasing the number of activities. Since we could have started with a maximum feasible schedule, we conclude that the greedy schedule is also maximum.
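A C sketch of the greedy schedule; the Activity type and the use of the standard library's qsort are choices of this sketch.

  #include <stdlib.h>

  typedef struct { int s, f; } Activity;     /* start and finish time */

  static int by_finish(const void *a, const void *b) {
      return ((const Activity *)a)->f - ((const Activity *)b)->f;
  }

  /* Sort by finish time, then keep each activity that starts strictly
     after the last selected one finishes (f_last < s_i). The selected
     activities are compacted to the front; returns how many there are. */
  int schedule(Activity *act, int n) {
      qsort(act, n, sizeof(Activity), by_finish);
      int k = 0, last_finish = -1;           /* assumes start times >= 0 */
      for (int i = 0; i < n; i++)
          if (last_finish < act[i].s) {
              act[k++] = act[i];
              last_finish = act[i].f;
          }
      return k;
  }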

Binary codes. Next we consider the problem of encoding a text using a string of 0s and 1s. A binary code maps each letter in the alphabet of the text to a unique string of 0s and 1s. Suppose for example that the letter 't' is encoded as '001', 'h' is encoded as '101', and 'e' is encoded as '01'. Then the word 'the' would be encoded as the concatenation of codewords: '00110101'. This particular encoding is unambiguous because the code is prefix-free: no codeword is a prefix of another codeword.

Figure 7: Letters correspond to leaves and codewords correspond to maximal paths. A left edge is read as '0' and a right edge as '1'. The tree to the right is full and improves the code.

There is a one-to-one correspondence between prefix-free binary codes and binary trees where each leaf is a letter and the corresponding codeword is the path from the root to that leaf. Figure 7 illustrates the correspondence for the above 3-letter code. Being prefix-free corresponds to letters occurring only at leaves, which have no children. The tree to the left in Figure 7 is not full because three of its internal nodes have only one child. This is an indication of waste. The code can be improved by replacing each node with one child by its child. This changes the above code to '00' for 't', '1' for 'h', and '01' for 'e'.

Huffman trees. Let w_i be the frequency of the letter c_i in the given text. It will be convenient to refer to w_i as the weight of c_i or of its external node. To get an efficient code, we choose short codewords for common letters. Suppose δ_i is the length of the codeword for c_i. Then the number of bits for encoding the entire text is

  P = Σ_i w_i · δ_i.

Since δ_i is the depth of the leaf c_i, P is also known as the weighted external path length of the corresponding tree.

The Huffman tree for the c_i minimizes the weighted external path length. To construct this tree, we start with n nodes, one for each letter. At each stage of the algorithm, we greedily pick the two nodes with smallest weights and make them the children of a new node with weight equal to the sum of the two weights. We repeat until only one node remains. The resulting tree for a collection of nine letters with displayed weights is shown in Figure 8. Ties that arise during the algorithm are broken arbitrarily.

Figure 8: The numbers in the external nodes (squares) are the weights of the corresponding letters, and the ones in the internal nodes (circles) are the weights of these nodes. The Huffman tree is full by construction.

We redraw the tree and order the children of a node as left and right child arbitrarily, as shown in Figure 9.

Figure 9: The weighted external path length is 15 + 15 + 18 + 12 + 5 + 15 + 24 + 27 + 42 = 173.

The algorithm works with a collection N of nodes which are the roots of the trees constructed so far. Initially, each leaf is a tree by itself. We denote the weight of a node by w(µ) and use a function EXTRACTMIN that returns the node with the smallest weight and, at the same time, removes this node from the collection.

  Tree HUFFMAN
    loop µ = EXTRACTMIN(N);
      if N = ∅ then return µ endif;
      ν = EXTRACTMIN(N);
      create a new node κ with children µ and ν
        and weight w(κ) = w(µ) + w(ν);
      add κ to N
    forever.

Straightforward implementations use an array or a linked list and take time O(n) for each operation involving N. There are fewer than 2n extractions of the minimum and fewer than n additions, which implies that the total running time is O(n²). We will see later that there are better ways to implement N, leading to a running time of O(n log n).
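The following C sketch implements the greedy merging with a plain array playing the role of the collection N, so each minimum extraction is a linear scan and the total time is O(n²). It tracks only weights and returns the weighted external path length P; building the actual tree would additionally store child pointers. Run on the nine leaf weights recovered from Figure 8, it prints 173, the value stated in Figure 9.

  #include <stdio.h>

  long huffman_cost(int w[], int n) {
      long cost = 0;
      while (n > 1) {
          int a = 0, b = 1;                          /* two smallest weights */
          if (w[b] < w[a]) { a = 1; b = 0; }
          for (int i = 2; i < n; i++) {
              if (w[i] < w[a])      { b = a; a = i; }
              else if (w[i] < w[b]) { b = i; }
          }
          int merged = w[a] + w[b];
          cost += merged;         /* each merge deepens every leaf below it */
          w[a] = merged;          /* reuse slot a for the new internal node */
          w[b] = w[n - 1];        /* delete slot b by moving the last entry */
          n--;
      }
      return cost;
  }

  int main(void) {
      int w[] = {21, 9, 8, 5, 5, 6, 3, 1, 3};   /* leaf weights of Figure 8 */
      printf("%ld\n", huffman_cost(w, 9));      /* prints 173 */
      return 0;
  }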

An inequality. We prepare the proof that the Huffman tree indeed minimizes the weighted external path length. Let T be a full binary tree with weighted external path length P(T). Let Λ(T) be the set of leaves and let µ and ν be any two leaves with smallest weights. Then we can construct a new tree T′ with

  (1) set of leaves Λ(T′) = (Λ(T) − {µ, ν}) ∪ {κ},
  (2) w(κ) = w(µ) + w(ν),
  (3) P(T′) ≤ P(T) − w(µ) − w(ν), with equality if µ and ν are siblings.

We now argue that T′ really exists. If µ and ν are siblings then we construct T′ from T by removing µ and ν and declaring their parent, κ, as the new leaf. Then

  P(T′) = P(T) − w(µ)δ − w(ν)δ + w(κ)(δ − 1)
        = P(T) − w(µ) − w(ν),

where δ = δ(µ) = δ(ν) = δ(κ) + 1 is the common depth of µ and ν. Otherwise, assume δ(µ) ≥ δ(ν) and let σ be the sibling of µ, which may or may not be a leaf. Exchange ν and σ. Since the path from the root to σ is at least as long as the path to µ, the weighted external path length can only decrease; see Figure 10. Then proceed as in the first case.

Figure 10: The increase in the depth of ν is compensated by the decrease in depth of the leaves in the subtree of σ.

Proof of optimality. The optimality of the Huffman tree can now be proved by induction.

HUFFMAN TREE THEOREM. Let T be the Huffman tree and X another tree with the same set of leaves and weights. Then P(T) ≤ P(X).

PROOF. If there are only two leaves then the claim is obvious. Otherwise, let µ and ν be the two leaves selected by the algorithm. Construct trees T′ and X′ with

  P(T′) = P(T) − w(µ) − w(ν),
  P(X′) ≤ P(X) − w(µ) − w(ν).

T′ is the Huffman tree for n − 1 leaves, so we can use the inductive assumption and get P(T′) ≤ P(X′). It follows that

  P(T) = P(T′) + w(µ) + w(ν)
       ≤ P(X′) + w(µ) + w(ν)
       ≤ P(X).


Huffman codes are binary codes that correspond to Huffman trees as described. They are commonly used to compress text and other information. Although Huffman codes are optimal in the sense defined above, there are other codes that are also sensitive to the frequency of sequences of letters and in this way outperform Huffman codes for general text.

Summary. The greedy algorithm for constructing Huffman trees works bottom-up by stepwise merging, rather than top-down by stepwise partitioning. If we run the greedy algorithm backwards, it becomes very similar to dynamic programming, except that it pursues only one of many possible partitions. Often this implies that it leads to suboptimal solutions. Nevertheless, there are problems that exhibit enough structure that the greedy algorithm succeeds in finding an optimum, and the scheduling and coding problems described above are two such examples.


First Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is September 18.

Problem 1. (20 points). Consider two sums, X = x1 + x2 + ... + xn and Y = y1 + y2 + ... + ym. Give an algorithm that finds indices i and j such that swapping xi with yj makes the two sums equal, that is, X − xi + yj = Y − yj + xi, if they exist. Analyze your algorithm. (You can use sorting as a subroutine. The amount of credit depends on the correctness of the analysis and the running time of your algorithm.)

Problem 2. (20 = 10 + 10 points). Consider distinct items x1, x2, ..., xn with positive weights w1, w2, ..., wn such that Σ_{i=1}^{n} wi = 1.0. The weighted median is the item xk that satisfies

  Σ_{xi < xk} wi < 0.5  and  Σ_{xj > xk} wj ≤ 0.5.

(a) Show how to compute the weighted median of n items in worst-case time O(n log n) using sorting.
(b) Show how to compute the weighted median in worst-case time O(n) using a linear-time median algorithm.

Problem 3. (20 = 6 + 14 points). A game-board has n columns, each consisting of a top number, the cost of visiting the column, and a bottom number, the maximum number of columns you are allowed to jump to the right. The top number can be any positive integer, while the bottom number is either 1, 2, or 3. The objective is to travel from the first column off the board, to the right of the n-th column. The cost of a game is the sum of the costs of the visited columns. Assuming the board is represented in a two-dimensional array, B[2, n], the following recursive procedure computes the cost of the cheapest game:

  int CHEAPEST(int i)
    if i > n then return 0 endif;
    x = B[1, i] + CHEAPEST(i + 1);
    y = B[1, i] + CHEAPEST(i + 2);
    z = B[1, i] + CHEAPEST(i + 3);
    case B[2, i] = 1: return x;
         B[2, i] = 2: return min{x, y};
         B[2, i] = 3: return min{x, y, z}
    endcase.

(a) Analyze the asymptotic running time of the procedure.
(b) Describe and analyze a more efficient algorithm for finding the cheapest game.

Problem 4. (20 = 10 + 10 points). Consider a set of n intervals [ai, bi] that cover the unit interval, that is, [0, 1] is contained in the union of the intervals.

(a) Describe an algorithm that computes a minimum subset of the intervals that also covers [0, 1].
(b) Analyze the running time of your algorithm.

(For question (b) you get credit for the correctness of your analysis but also for the running time of your algorithm. In other words, a fast algorithm earns you more points than a slow algorithm.)

Problem 5. (20 = 7 + 7 + 6 points). Let A[1..m] and B[1..n] be two strings.

(a) Modify the dynamic programming algorithm for computing the edit distance between A and B for the case in which there are only two allowed operations, insertions and deletions of individual letters.
(b) A (not necessarily contiguous) subsequence of A is defined by the increasing sequence of its indices, 1 ≤ i1 < i2 < ... < ik ≤ m. Use dynamic programming to find the longest common subsequence of A and B and analyze its running time.
(c) What is the relationship between the edit distance defined in (a) and the longest common subsequence computed in (b)?


II SEARCHING

6 Binary Search Trees
7 Red-Black Trees
8 Amortized Analysis
9 Splay Trees
Second Homework Assignment

6 Binary Search Trees
One of the purposes of sorting is to facilitate fast searching. However, while a sorted sequence stored in a linear array is good for searching, it is expensive to add and delete items. Binary search trees give you the best of both worlds: fast search and fast update.

Definitions and terminology. We begin with a recursive definition of the most common type of tree used in algorithms. A (rooted) binary tree is either empty or a node (the root) with a binary tree as left subtree and a binary tree as right subtree. We store items in the nodes of the tree. It is often convenient to say the items are the nodes. A binary tree is sorted if each item is between the smaller or equal items in the left subtree and the larger or equal items in the right subtree. For example, the tree illustrated in Figure 11 is sorted assuming the usual ordering of English characters. Terms for relations between family members, such as child, parent, and sibling, are also used for nodes in a tree. Every node has one parent, except the root, which has no parent. A leaf or external node is one without children; all other nodes are internal. A node ν is a descendent of µ if ν = µ or ν is a descendent of a child of µ. Symmetrically, µ is an ancestor of ν if ν is a descendent of µ. The subtree of µ consists of all descendents of µ. An edge is a parent-child pair.

Figure 11: The parent, sibling and two children of the dark node are shaded. The internal nodes are drawn as circles while the leaves are drawn as squares.

The size of the tree is the number of nodes. A binary tree is full if every internal node has two children. Every full binary tree has one more leaf than it has internal nodes. To count its edges, we can either count 2 for each internal node or 1 for every node other than the root. Either way, the total number of edges is one less than the size of the tree. A path is a sequence of contiguous edges without repetitions. Usually we only consider paths that descend or paths that ascend. The length of a path is the number of edges. For every node µ, there is a unique path from the root to µ. The length of that path is the depth of µ. The height of the tree is the maximum depth of any node. The path length is the sum of depths over all nodes, and the external path length is the same sum restricted to the leaves in the tree.

Searching. A binary search tree is a sorted binary tree. We assume each node is a record storing an item and pointers to two children:

  struct Node{item info; Node ∗ℓ, ∗r};
  typedef Node ∗Tree.

Sometimes it is convenient to also store a pointer to the parent, but for now we will do without. We can search in a binary search tree by tracing a path starting at the root.

  Node ∗SEARCH(Tree ̺, item x)
    case ̺ = NULL: return NULL;
         x < ̺ → info: return SEARCH(̺ → ℓ, x);
         x = ̺ → info: return ̺;
         x > ̺ → info: return SEARCH(̺ → r, x)
    endcase.

The running time depends on the length of the path, which is at most the height of the tree. Let n be the size. In the worst case the tree is a linked list and searching takes time O(n). In the best case the tree is perfectly balanced and searching takes only time O(log n).

Insert. To add a new item is similarly straightforward: follow a path from the root to a leaf and replace that leaf by a new node storing the item. Figure 12 shows the tree obtained after adding w to the tree in Figure 11.

Figure 12: The shaded nodes indicate the path from the root we traverse when we insert w into the sorted tree.

The running time depends again on the length of the path. If the insertions come in a random order then the tree is usually close to being perfectly balanced. Indeed, the tree is the same as the one that arises in the analysis of quicksort. The expected number of comparisons for a (successful) search is one n-th of the expected running time of quicksort, which is roughly 2 ln n.
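A minimal C rendering of the node type together with search and insert; fixing item to int and returning the possibly new root from insert are choices of this sketch.

  #include <stdlib.h>

  typedef struct Node Node;
  struct Node { int info; Node *l, *r; };
  typedef Node *Tree;

  Node *search(Tree t, int x) {
      if (t == NULL)   return NULL;
      if (x < t->info) return search(t->l, x);
      if (x > t->info) return search(t->r, x);
      return t;                       /* x == t->info */
  }

  Tree insert(Tree t, int x) {        /* returns the (possibly new) root */
      if (t == NULL) {
          Node *nu = malloc(sizeof(Node));
          nu->info = x; nu->l = nu->r = NULL;
          return nu;
      }
      if (x < t->info) t->l = insert(t->l, x);
      else             t->r = insert(t->r, x);
      return t;
  }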

Delete. The main idea for deleting an item is the same as for inserting: follow the path from the root to the node ν that stores the item.

  Case 1. ν has no internal node as a child. Remove ν.
  Case 2. ν has one internal child. Make that child the child of the parent of ν.
  Case 3. ν has two internal children. Find the rightmost internal node in the left subtree, remove it, and substitute it for ν, as shown in Figure 13.

Figure 13: Store J in ν and delete the node that used to store J.

The analysis of the expected search time in a binary search tree constructed by a random sequence of insertions and deletions is considerably more challenging than if no deletions are present. Even the definition of a random sequence is ambiguous in this case.

Optimal binary search trees. Instead of hoping the incremental construction yields a shallow tree, we can construct the tree that minimizes the search time. We consider the common problem in which items have different probabilities to be the target of a search. For example, some words in the English dictionary are more commonly searched than others and are therefore assigned a higher probability. Let a1 < a2 < ... < an be the items and pi the corresponding probabilities. To simplify the discussion, we only consider successful searches and thus assume Σ_{i=1}^{n} pi = 1. The expected number of comparisons for a successful search in a binary search tree T storing the n items is

  1 + C(T) = Σ_{i=1}^{n} pi · (δi + 1)
           = 1 + Σ_{i=1}^{n} pi · δi,

where δi is the depth of the node that stores ai. C(T) is the weighted path length or the cost of T. We study the problem of constructing a tree that minimizes the cost.

To develop an example, let n = 3 and p1 = 1/2, p2 = 1/3, p3 = 1/6. Figure 14 shows the five binary trees with three nodes and states their costs.

Figure 14: There are five different binary trees of three nodes. From left to right their costs are 5/3, 11/6, 5/3, 13/6, 7/3. The first tree and the third tree are both optimal.

It can be shown that the number of different binary trees with n nodes is (1/(n + 1)) · (2n choose n), which is exponential in n. This is far too large to try all possibilities, so we need to look for a more efficient way to construct an optimum tree.

Dynamic programming. We write T[i, j] for the optimum weighted binary search tree of ai, a(i+1), ..., aj, C[i, j] for its cost, and p[i, j] = Σ_{k=i}^{j} pk for the total probability of the items in T[i, j]. Suppose we know that the optimum tree stores item ak in its root. Then the left subtree is T[i, k − 1] and the right subtree is T[k + 1, j]. The cost of the optimum tree is therefore C[i, j] = C[i, k − 1] + C[k + 1, j] + p[i, j] − pk. Since we do not know which item is in the root, we try all possibilities and find the minimum:

  C[i, j] = min_{i ≤ k ≤ j} {C[i, k − 1] + C[k + 1, j] + p[i, j] − pk}.

This formula can be translated directly into a dynamic programming algorithm. We use three two-dimensional arrays, one for the sums of probabilities, p[i, j], one for the costs of optimum trees, C[i, j], and one for the indices of the items stored in their roots, R[i, j]. We assume that the first array has already been computed. We initialize the other two arrays along the main diagonal and add one dummy diagonal for the cost.


  for k = 1 to n do
    C[k, k − 1] = C[k, k] = 0; R[k, k] = k
  endfor;
  C[n + 1, n] = 0.

We fill the rest of the two arrays one diagonal at a time.

  for ℓ = 2 to n do
    for i = 1 to n − ℓ + 1 do
      j = i + ℓ − 1; C[i, j] = ∞;
      for k = i to j do
        cost = C[i, k − 1] + C[k + 1, j] + p[i, j] − p[k, k];
        if cost < C[i, j] then
          C[i, j] = cost; R[i, j] = k
        endif
      endfor
    endfor
  endfor.

The main part of the algorithm consists of three nested loops, each iterating through at most n values. The running time is therefore in O(n³).
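The following C sketch runs this dynamic program on the three-item example, using the 6-times-scaled integer probabilities of Table 1 below to avoid floating point; the array names follow the text.

  #include <stdio.h>

  #define N 3
  int p6[N + 2][N + 1];   /* p6[i][j] = 6 * (p_i + ... + p_j) */
  int C[N + 2][N + 1];    /* 6 * cost of the optimum tree on a_i..a_j */
  int R[N + 2][N + 1];    /* root of the optimum tree on a_i..a_j */

  int main(void) {
      int w[N + 1] = {0, 3, 2, 1};           /* 6 * (1/2, 1/3, 1/6) */
      for (int i = 1; i <= N; i++) {         /* partial sums */
          p6[i][i - 1] = 0;
          for (int j = i; j <= N; j++) p6[i][j] = p6[i][j - 1] + w[j];
      }
      for (int k = 1; k <= N; k++) { C[k][k - 1] = C[k][k] = 0; R[k][k] = k; }
      C[N + 1][N] = 0;
      for (int l = 2; l <= N; l++)           /* one diagonal at a time */
          for (int i = 1; i <= N - l + 1; i++) {
              int j = i + l - 1;
              C[i][j] = 1 << 30;
              for (int k = i; k <= j; k++) {
                  int cost = C[i][k - 1] + C[k + 1][j] + p6[i][j] - p6[k][k];
                  if (cost < C[i][j]) { C[i][j] = cost; R[i][j] = k; }
              }
          }
      printf("6*C = %d, root = %d\n", C[1][N], R[1][N]);   /* 4 and 1 */
      return 0;
  }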

Example. Table 1 shows the partial sums of probabilities for the data in the earlier example.

  6p  1  2  3
  1   3  5  6
  2      2  3
  3         1

Table 1: Six times the partial sums of probabilities used by the dynamic programming algorithm.

Table 2 shows the costs and the indices of the roots of the optimum trees computed for all contiguous subsequences.

  6C  1  2  3      R  1  2  3
  1   0  2  4      1  1  1  1
  2      0  1      2     2  2
  3         0      3        3

Table 2: Six times the costs and the roots of the optimum trees.

The optimum tree can be constructed from R as follows. The root stores the item with index R[1, 3] = 1. The left subtree is therefore empty and the right subtree stores a2, a3. The root of the optimum right subtree stores the item with index R[2, 3] = 2. Again the left subtree is empty and the right subtree consists of a single node storing a3.

Improved running time. Notice that the array R in Table 2 is monotonic, both along rows and along columns. Indeed it is possible to prove R[i, j − 1] ≤ R[i, j] in every row and R[i, j] ≤ R[i + 1, j] in every column. We omit the proof and show how the two inequalities can be used to improve the dynamic programming algorithm. Instead of trying all roots from i through j we restrict the innermost for-loop to

  for k = R[i, j − 1] to R[i + 1, j] do

The monotonicity property implies that this change does not alter the result of the algorithm. The running time of a single iteration of the outer for-loop is now

  U_ℓ(n) = Σ_{i=1}^{n−ℓ+1} (R[i + 1, j] − R[i, j − 1] + 1).

Recall that j = i + ℓ − 1 and note that most terms cancel, giving

  U_ℓ(n) = R[n − ℓ + 2, n] − R[1, ℓ − 1] + (n − ℓ + 1) ≤ 2n.

In words, each iteration of the outer for-loop takes only time O(n), which implies that the entire algorithm takes only time O(n²).


7 Red-Black Trees

Binary search trees are an elegant implementation of the dictionary data type, which requires support for

  item SEARCH(item),
  void INSERT(item),
  void DELETE(item),

and possibly additional operations. Their main disadvantage is the worst-case time Ω(n) for a single operation. The reason is that insertions and deletions tend to get the tree unbalanced. It is possible to counteract this tendency with occasional local restructuring operations and to guarantee logarithmic time per operation.

2-3-4 trees. A special type of balanced tree is the 2-3-4 tree. Each internal node stores one, two, or three items and has two, three, or four children. Each leaf has the same depth. As shown in Figure 15, the items in the internal nodes separate the items stored in the subtrees and thus facilitate fast searching. In the smallest 2-3-4 tree of height h, every internal node has exactly two children, so we have 2^h leaves and 2^h − 1 internal nodes. In the largest 2-3-4 tree of height h, every internal node has four children, so we have 4^h leaves and (4^h − 1)/3 internal nodes.

Figure 15: A 2-3-4 tree of height two. All items are stored in internal nodes.

We can store a 2-3-4 tree in a binary tree by expanding a node with i > 1 items and i + 1 children into i nodes each with one item, as shown in Figure 16.

Figure 16: Transforming a 2-3-4 tree into a binary tree. Bold edges are called red and the others are called black.

Red-black trees. Suppose we color each edge of a binary search tree either red or black. The color is conveniently stored in the lower node of the edge. Such an edge-colored tree is a red-black tree if

  (1) there are no two consecutive red edges on any descending path and every maximal such path ends with a black edge;
  (2) all maximal descending paths have the same number of black edges.

The number of black edges on a maximal descending path is the black height, denoted as bh(̺). When we transform a 2-3-4 tree into a binary tree as in Figure 16, we get a red-black tree. The result of transforming the tree in Figure 15 is shown in Figure 17.

Figure 17: A red-black tree obtained from the 2-3-4 tree in Figure 15.

HEIGHT LEMMA. A red-black tree with n internal nodes has height at most 2 log2(n + 1).

PROOF. The number of leaves is n + 1. Contract each red edge to get a 2-3-4 tree with n + 1 leaves. Its height is h ≤ log2(n + 1). We have bh(̺) = h, and by Rule (1) the height of the red-black tree is at most 2bh(̺) ≤ 2 log2(n + 1).

Rotations. Restructuring a red-black tree can be done with only one operation (and its symmetric version): a rotation that moves a subtree from one side to another, as shown in Figure 18. The ordered sequence of nodes in the left tree of Figure 18 is ..., order(A), ν, order(B), µ, order(C), ..., and this is also the ordered sequence of nodes in the right tree. In other words, a rotation maintains the ordering.

Figure 18: From left to right a right rotation and from right to left a left rotation.

Function ZIG below implements the right rotation:

  Node ∗ZIG(Node ∗µ)
    assert µ ≠ NULL and ν = µ → ℓ ≠ NULL;
    µ → ℓ = ν → r; ν → r = µ;
    return ν.

Function ZAG is symmetric and performs a left rotation. Occasionally, it is necessary to perform two rotations in sequence, and it is convenient to combine them into a single operation referred to as a double rotation, as shown in Figure 19. We use a function ZIGZAG to implement a double right rotation and the symmetric function ZAGZIG to implement a double left rotation.

Figure 19: The double right rotation at µ is the concatenation of a single left rotation at ν and a single right rotation at µ.

  Node ∗ZIGZAG(Node ∗µ)
    µ → ℓ = ZAG(µ → ℓ);
    return ZIG(µ).

The double right rotation is the composition of two single rotations: ZIGZAG(µ) = ZIG(µ) ∘ ZAG(ν). Remember that the composition of functions is written from right to left, so the single left rotation of ν precedes the single right rotation of µ. Single rotations preserve the ordering of nodes and so do double rotations.
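In C, with the Node type of Section 6 (without parent pointers), the rotations are a few pointer assignments; the subtree names in the comments refer to Figure 18.

  typedef struct Node Node;
  struct Node { int info; Node *l, *r; };

  Node *zig(Node *mu) {     /* right rotation; mu and mu->l must be non-NULL */
      Node *nu = mu->l;
      mu->l = nu->r;        /* subtree B moves from nu over to mu */
      nu->r = mu;
      return nu;            /* nu is the new root of the subtree */
  }

  Node *zag(Node *mu) {     /* left rotation, the mirror image of zig */
      Node *nu = mu->r;
      mu->r = nu->l;
      nu->l = mu;
      return nu;
  }

  Node *zigzag(Node *mu) {  /* double right rotation */
      mu->l = zag(mu->l);
      return zig(mu);
  }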

Insertion. Before studying the details of the restructuring algorithms for red-black trees, we look at the trees that arise in a short insertion sequence, as shown in Figure 20. After adding 10, 7, 13, 4, we have two red edges in sequence and repair this by promoting 10 (A). After adding 2, we repair the two red edges in sequence by a single rotation of 7 (B). After adding 5, we promote 4 (C), and after adding 6, we do a double rotation of 7 (D).

Figure 20: Sequence of red-black trees generated by inserting the items 10, 7, 13, 4, 2, 5, 6 in this sequence.

An item x is added by substituting a new internal node for a leaf at the appropriate position. To satisfy Rule (2) of the red-black tree definition, we color the incoming edge of the new node red, as shown in Figure 21.

Figure 21: The incoming edge of a newly added node is always red.

We start the adjustment of color and structure at the parent ν of the new node. We state the properties maintained by the insertion algorithm as invariants that apply to a node ν traced by the algorithm.

INVARIANT I. The only possible violation of the red-black tree properties is that of Rule (1) at the node ν, and if ν has a red incoming edge then it has exactly one red outgoing edge.

Observe that Invariant I holds right after adding x. We continue with the analysis of all the cases that may arise. The local adjustment operations depend on the neighborhood of ν.

  Case 1. The incoming edge of ν is black. Done.

  Case 2. The incoming edge of ν is red. Let µ be the parent of ν and assume ν is the left child of µ.

  Case 2.1. Both outgoing edges of µ are red, as in Figure 22. Promote µ. Let ν be the parent of µ and recurse.

Figure 22: Promotion of µ. (The colors of the outgoing edges of ν may be the other way round.)

  Case 2.2. Only one outgoing edge of µ is red, namely the one from µ to ν.

  Case 2.2.1. The left outgoing edge of ν is red, as in Figure 23 to the left. Right rotate µ. Done.

  Case 2.2.2. The right outgoing edge of ν is red, as in Figure 23 to the right. Double right rotate µ. Done.

Figure 23: Right rotation of µ to the left and double right rotation of µ to the right.

Case 2 has a symmetric case where left and right are interchanged. An insertion may cause logarithmically many promotions but at most two rotations.

Case 2.1.2. The right outgoing edge of κ is red, as in Figure 26 to the left. Change the color of that edge to black and left rotate µ. Done.

κ κ ν µ ν

µ κ σ ν µ

σ κ

Deletion. First find the node π that is to be removed. If necessary, we substitute the inorder successor for π so we can assume that both children of π are leaves. If π is last in inorder we substitute symmetrically. Replace π by a leaf ν, as shown in Figure 24. If the incoming edge of π is red then change it to black. Otherwise, remember the incoming edge of ν as ‘double-black’, which counts as two black edges. Similar to insertions, it helps to understand the deletion algorithm in terms of a property it maintains. I NVARIANT D. The only possible violation of the redblack tree properties is a double-black incoming edge of ν.

ν

Figure 26: Left rotation of µ to the left and double left rotation of µ to the right.

Case 2.1.3. The right outgoing edge of κ is black, as in Figure 26 to the right. Change the color of the left outgoing edge to black and double left rotate µ. Done. Case 2.2. The edge from µ to κ is red, as in Figure 27. Left rotate µ. Recurse for ν.

24

µ ν κ ν µ

κ

Figure 27: Left rotation of µ.

Case 2 has a symmetric case in which ν is the right child of µ. Case 2.2 seems problematic because it recurses without moving ν any closer to the root. However, the configuration excludes the possibility of Case 2.2 occurring again. If we enter Cases 2.1.2 or 2.1.3 then the termination is immediate. If we enter Case 2.1.1 then the termination follows because the incoming edge of µ is red. The deletion may cause logarithmically many demotions but at most three rotations. Summary. The red-black tree is an implementation of the dictionary data type and supports the operations search, insert, delete in logarithmic time each. An insertion or deletion requires the equivalent of at most three single rotations. The red-black tree also supports finding the minimum, maximum and the inorder successor, predecessor of a given node in logarithmic time each.

25
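To make the edge-color convention concrete, here is a minimal C sketch of a node type in which each node records the color of its incoming edge (the field names and helper are our own choice, not prescribed by the notes):

#include <stdlib.h>

typedef enum { RED, BLACK } Color;
typedef struct RBNode RBNode;
struct RBNode {
    int     info;   /* the stored key */
    Color   color;  /* color of the incoming edge from the parent */
    RBNode *l, *r;  /* left and right children */
};

/* A newly added node always hangs off its parent by a red edge,
   as stated in the caption of Figure 21. */
RBNode *new_leaf_node(int key) {
    RBNode *nu = malloc(sizeof *nu);
    nu->info = key; nu->color = RED; nu->l = nu->r = NULL;
    return nu;
}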

8 Amortized Analysis
Amortization is an analysis technique that can influence the design of algorithms in a profound way. Later in this course, we will encounter data structures that owe their very existence to the insight gained in performance due to amortized analysis.

Binary counting. We illustrate the idea of amortization by analyzing the cost of counting in binary. Think of an integer as a linear array of bits, n = Σ_{i≥0} A[i] · 2^i. The following loop keeps incrementing the integer stored in A.

  loop i = 0;
    while A[i] = 1 do A[i] = 0; i++ endwhile;
    A[i] = 1
  forever.

We define the cost of counting as the total number of bit changes that are needed to increment the number one by one. What is the cost to count from 0 to n? Figure 28 shows that counting from 0 to 15 requires 26 bit changes.

Figure 28: The numbers are written vertically from top to bottom. The boxed bits change when the number is incremented.

Since n takes only 1 + ⌊log2 n⌋ bits or positions in A, a single increment does at most 2 + log2 n steps. This implies that the cost of counting from 0 to n is at most n log2 n + 2n. Even though the upper bound of 2 + log2 n is almost tight for the worst single step, we can show that the total cost is much less than n times that. We do this with two slightly different amortization methods referred to as aggregation and accounting.

Aggregation. The aggregation method takes a global view of the problem. The pattern in Figure 28 suggests we define bi equal to the number of 1s and ti equal to the number of trailing 1s in the binary notation of i. Every other number has no trailing 1, every other number of the remaining ones has one trailing 1, etc. Assuming n = 2^k − 1, we therefore have exactly j − 1 trailing 1s for 2^{k−j} = (n + 1)/2^j integers between 0 and n − 1. The total number of bit changes is therefore

  T(n) = Σ_{i=0}^{n−1} (ti + 1) = (n + 1) · Σ_{j=1}^{k} j/2^j.

We use index transformation to show that the sum on the right is less than 2:

  Σ_{j≥1} j/2^j = Σ_{j≥1} (j − 1)/2^{j−1}
               = Σ_{j≥1} j/2^{j−1} − Σ_{j≥1} 1/2^{j−1}
               = 2 · Σ_{j≥1} j/2^j − 2,

which implies that the infinite sum equals 2 and every partial sum is strictly less than 2. Hence the cost is T(n) < 2(n + 1). The amortized cost per operation is T(n)/n, which is about 2.

Accounting. The idea of the accounting method is to charge each operation what we think its amortized cost is. If the amortized cost exceeds the actual cost, then the surplus remains as a credit associated with the data structure. If the amortized cost is less than the actual cost, the accumulated credit is used to pay for the cost overflow. Define the amortized cost of a bit change 0 → 1 as $2 and that of 1 → 0 as $0. When we change 0 to 1 we pay $1 for the actual expense and $1 stays with the bit, which is now 1. This $1 pays for the (later) cost of changing the 1 to 0. Each increment has amortized cost $2, and together with the money in the system, this is enough to pay for all the bit changes. The cost is therefore at most 2n. We see how a little trick, like making the 0 → 1 changes pay for the 1 → 0 changes, leads to a very simple analysis that is even more accurate than the one obtained by aggregation.
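As a sanity check on the aggregate bound, here is a small, self-contained C program (our own illustration, not from the notes) that counts the bit changes made by the increment loop and compares the total with 2(n + 1):

#include <stdio.h>

#define BITS 64

int A[BITS];        /* the binary counter, least significant bit first */
long changes = 0;   /* total number of bit changes so far */

void increment(void) {
    int i = 0;
    while (A[i] == 1) { A[i] = 0; i++; changes++; }
    A[i] = 1; changes++;
}

int main(void) {
    long n = 1000000;
    for (long i = 0; i < n; i++) increment();
    /* the aggregate analysis predicts changes < 2(n + 1) */
    printf("bit changes = %ld, bound 2(n+1) = %ld\n", changes, 2 * (n + 1));
    return 0;
}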

Potential functions. We can further formalize the amortized analysis by using a potential function. The idea is similar to accounting, except there is no explicit credit saved anywhere. The accumulated credit is an expression of the well-being or potential of the data structure. Let ci be the actual cost of the i-th operation and Di the data structure after the i-th operation. Let Φi = Φ(Di) be the potential of Di, which is some numerical value depending on the concrete application. Then we define ai = ci + Φi − Φi−1 as the amortized cost of the i-th operation. The sum of amortized costs of n operations is

  Σ_{i=1}^{n} ai = Σ_{i=1}^{n} (ci + Φi − Φi−1) = Σ_{i=1}^{n} ci + Φn − Φ0.

We aim at choosing the potential such that Φ0 = 0 and Φn ≥ 0 because then we get Σ ai ≥ Σ ci. In words, the sum of amortized costs covers the sum of actual costs. To apply the method to binary counting we define the potential equal to the number of 1s in the binary notation, Φi = bi. It follows that

  Φi − Φi−1 = bi − bi−1 = (bi−1 − ti−1 + 1) − bi−1 = 1 − ti−1.

The actual cost of the i-th operation is ci = 1 + ti−1, and the amortized cost is ai = ci + Φi − Φi−1 = 2. We have Φ0 = 0 and Φn ≥ 0 as desired, and therefore Σ ci ≤ Σ ai = 2n, which is consistent with the analysis of binary counting with the aggregation and the accounting methods.

2-3-4 trees. As a more complicated application of amortization we consider 2-3-4 trees and the cost of restructuring them under insertions and deletions. We have seen 2-3-4 trees earlier when we talked about red-black trees. A set of keys is stored in sorted order in the internal nodes of a 2-3-4 tree, which is characterized by the following rules:

(1) each internal node has 2 ≤ d ≤ 4 children and stores d − 1 keys;
(2) all leaves have the same depth.

As for binary trees, being sorted means that the left-to-right order of the keys is sorted. The only meaningful definition of this ordering is the ordered sequence of the first subtree followed by the first key stored in the root followed by the ordered sequence of the second subtree followed by the second key, etc.

To insert a new key, we attach a new leaf and add the key to the parent ν of that leaf. All is fine unless ν overflows because it now has five children. If it does, we repair the violation of Rule (1) by climbing the tree one node at a time. We call an internal node non-saturated if it has fewer than four children.

Case 1. ν has five children and a non-saturated sibling to its left or right. Move one child from ν to that sibling, as in Figure 29.

Figure 29: The overflowing node gives one child to a non-saturated sibling.

Case 2. ν has five children and no non-saturated sibling. Split ν into two nodes and recurse for the parent of ν, as in Figure 30. If ν has no parent then create a new root whose only children are the two nodes obtained from ν.

Figure 30: The overflowing node is split into two and the parent is treated recursively.
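A schematic C fragment (ours, not the notes' code) that mirrors the two cases above; it assumes a node type with a child count and hypothetical helpers nonsaturated_sibling, move_child, and split_node for the pointer surgery:

typedef struct Node234 Node234;
struct Node234 {
    int      d;        /* number of children, 2..4 (5 means overflow) */
    Node234 *parent;
    /* keys and child pointers omitted in this sketch */
};

Node234 *nonsaturated_sibling(Node234 *nu);  /* NULL if none; hypothetical */
void move_child(Node234 *from, Node234 *to); /* Case 1 surgery; hypothetical */
void split_node(Node234 *nu);                /* Case 2 surgery; hypothetical */

/* Repair an overflowing node by adoption or split, climbing the tree. */
void repair_overflow(Node234 *nu) {
    while (nu != NULL && nu->d == 5) {
        Node234 *sib = nonsaturated_sibling(nu);
        if (sib != NULL) {
            move_child(nu, sib);   /* Case 1: adoption, done */
            return;
        }
        Node234 *parent = nu->parent;
        split_node(nu);            /* Case 2: split; parent gains a child */
        nu = parent;               /* continue one level up */
    }
    /* if the root itself was split, split_node creates a new root */
}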

Deleting a key is done in a similar fashion, although there we have to battle with nodes ν that have too few children rather than too many. Let ν have only one child. We repair Rule (1) by adopting a child from a sibling or by merging ν with a sibling. In the latter case the parent of ν loses a child and needs to be visited recursively. The two operations are illustrated in Figures 31 and 32.

Figure 31: The underflowing node receives one child from a sibling.

Amortized analysis. The worst case for inserting a new key occurs when all internal nodes are saturated. The insertion then triggers logarithmically many splits. Symmetrically, the worst case for a deletion occurs when all internal nodes have only two children. The deletion then triggers logarithmically many mergers. Nevertheless, we can show that in the amortized sense there are at most a constant number of split and merge operations per insertion and deletion.

Figure 32: The underflowing node is merged with a sibling and the parent is treated recursively.

We use the accounting method and store money in the internal nodes. The best internal nodes have three children because then they are flexible in both directions. They require no money, but all other nodes are given a positive amount to pay for future expenses caused by split and merge operations. Specifically, we store $4, $1, $0, $3, $6 in each internal node with 1, 2, 3, 4, 5 children. As illustrated in Figures 29 and 31, an adoption moves money only from ν to its sibling. The operation keeps the total amount the same or decreases it, which is even better. As shown in Figure 30, a split frees up $5 from ν and spends at most $3 on the parent. The extra $2 pay for the split operation. Similarly, a merger frees $5 from the two affected nodes and spends at most $3 on the parent. This is illustrated in Figure 32. An insertion makes an initial investment of at most $3 to pay for creating a new leaf. Similarly, a deletion makes an initial investment of at most $3 for destroying a leaf. If we charge $2 for each split and each merge operation, the money in the system suffices to cover the expenses. This implies that for n insertions and deletions we get a total of at most 3n/2 split and merge operations. In other words, the amortized number of split and merge operations is at most 3/2.

Recall that there is a one-to-one correspondence between 2-3-4 trees and red-black trees. We can thus translate the above update procedure and get an algorithm for red-black trees with an amortized constant restructuring cost per insertion and deletion. We already proved that for red-black trees the number of rotations per insertion and deletion is at most a constant. The above argument implies that also the number of promotions and demotions is at most a constant, although in the amortized and not in the worst-case sense as for the rotations.


9 Splay Trees
Splay trees are similar to red-black trees except that they guarantee good shape (small height) only on the average. They are simpler to code than red-black trees and have the additional advantage of giving faster access to items that are more frequently searched. The reason for both is that splay trees are self-adjusting.

Self-adjusting binary search trees. Instead of explicitly maintaining the balance using additional information (such as the color of edges in the red-black tree), splay trees maintain balance implicitly through a self-adjusting mechanism. Good shape is a side-effect of the operations that are applied. These operations are applied while splaying a node, which means moving it up to the root of the tree, as illustrated in Figure 33.

Figure 33: The node storing 1 is splayed using three single rotations.

A detailed analysis will reveal that single rotations do not imply good amortized performance but combinations of single rotations in pairs do. Aside from double rotations, we use roller-coaster rotations that compose two single left or two single right rotations, as shown in Figure 35. The sequence of the two single rotations is important, namely first the higher then the lower node. Recall that ZIG(κ) performs a single right rotation and returns the new root of the rotated subtree. The roller-coaster rotation to the right is then

  Node ∗ZIGZIG(Node ∗κ)
    return ZIG(ZIG(κ)).

Function ZAGZAG is symmetric, exchanging left and right, and functions ZIGZAG and ZAGZIG are the two double rotations already used for red-black trees.

Splay. A splay operation finds an item and uses rotations to move the corresponding node up to the root position. Whenever possible, a double rotation or a roller-coaster rotation is used. We dispense with special cases and show Function SPLAY for the case the search item x is less than the item in the root.

  if x < ̺ → info then
    µ = ̺ → ℓ;
    if x < µ → info then µ → ℓ = SPLAY(µ → ℓ, x); return ZIGZIG(̺)
    elseif x > µ → info then µ → r = SPLAY(µ → r, x); return ZIGZAG(̺)
    else return ZIG(̺)
    endif
  endif.

If x is stored in one of the children of ̺ then it is moved to the root by a single rotation. Otherwise, it is splayed recursively to the third level and moved to the root either by a double or a roller-coaster rotation. The number of rotations depends on the length of the path from ̺ to x. Specifically, if the path is i edges long then x is splayed in ⌊i/2⌋ double and roller-coaster rotations and zero or one single rotation. In the worst case, a single splay operation takes almost as many rotations as there are nodes in the tree. We will see shortly that the amortized number of rotations is at most logarithmic in the number of nodes.

Amortized cost. Recall that the amortized cost of an operation is the actual cost minus the cost for work put into improving the data structure. To analyze the cost, we use a potential function that measures the well-being of the data structure. We need definitions:

  the size s(ν) is the number of descendents of node ν, including ν;
  the balance β(ν) is twice the floor of the binary logarithm of the size, β(ν) = 2⌊log2 s(ν)⌋;
  the potential Φ of a tree or a collection of trees is the sum of balances over all nodes, Φ = Σ β(ν);
  the actual cost ci of the i-th splay operation is 1 plus the number of single rotations (counting a double or roller-coaster rotation as two single rotations);
  the amortized cost ai of the i-th splay operation is ai = ci + Φi − Φi−1.

We have Φ0 = 0 for the empty tree and Φi ≥ 0 in general. This implies that the total actual cost does not exceed the total amortized cost, Σ ci = Σ ai − Φn + Φ0 ≤ Σ ai.
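For concreteness, here is a C rendering of the roller-coaster rotation and the SPLAY case shown above (a sketch of ours, assuming the Node type and the zig, zag, zig_zag functions sketched earlier; the NULL guards are our addition):

/* Roller-coaster rotation to the right: two single right rotations,
   first the higher node, then the lower one. */
Node *zig_zig(Node *kappa) {
    return zig(zig(kappa));
}

/* Splay for the case that x is less than the item in the root;
   the symmetric case for x greater is analogous and omitted. */
Node *splay(Node *rho, int x) {
    if (rho == NULL || x == rho->info) return rho;
    if (x < rho->info) {
        Node *mu = rho->l;
        if (mu == NULL) return rho;
        if (x < mu->info && mu->l != NULL) {
            mu->l = splay(mu->l, x);   /* recurse to the third level */
            return zig_zig(rho);
        } else if (x > mu->info && mu->r != NULL) {
            mu->r = splay(mu->r, x);
            return zig_zag(rho);
        } else {
            return zig(rho);           /* x sits in a child of the root */
        }
    }
    /* symmetric code for x > rho->info omitted in this sketch */
    return rho;
}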


To get a feeling for the potential, we compute Φ for the two extreme cases. Note first that the integral of the natural logarithm is ∫ ln x dx = x ln x − x and therefore ∫ log2 x dx = x log2 x − x/ln 2. In the extreme unbalanced case, the balance of the i-th node from the bottom is 2⌊log2 i⌋ and the potential is

  Φ = 2 Σ_{i=1}^{n} ⌊log2 i⌋ = 2n log2 n − O(n).

In the balanced case, we bound Φ from above by 2U(n), where U(n) = 2U(n/2) + log2 n. We prove that U(n) < 2n for the case when n = 2^k. Consider the perfectly balanced tree with n leaves. The height of the tree is k = log2 n. We encode the term log2 n of the recurrence relation by drawing the hook-like path from the root to the right child and then following left edges until we reach the leaf level. Each internal node encodes one of the recursively surfacing log-terms by a hook-like path starting at that node. The paths are pairwise edge-disjoint, which implies that their total length is at most the number of edges in the tree, which is 2n − 2.

Investment. The main part of the amortized time analysis is a detailed study of the three types of rotations: single, roller-coaster, and double. We write β(ν) for the balance of a node ν before the rotation and β′(ν) for the balance after the rotation. Let ν be the lowest node involved in the rotation. The goal is to prove that the amortized cost of a roller-coaster and a double rotation is at most 3[β′(ν) − β(ν)] each, and that of a single rotation is at most 1 + 3[β′(ν) − β(ν)]. Summing these terms over the rotations of a splay operation gives a telescoping series in which all terms cancel except the first and the last. To this we add 1 for the at most one single rotation and another 1 for the constant cost in the definition of actual cost.

INVESTMENT LEMMA. The amortized cost of splaying a node ν in a tree ̺ is at most 2 + 3[β(̺) − β(ν)].

Before looking at the details of the three types of rotations, we prove that if two siblings have the same balance then their common parent has a larger balance. Because balances are even integers this means that the balance of the parent exceeds the balance of its children by at least 2.

BALANCE LEMMA. If µ has children ν, κ and β(ν) = β(κ) = β then β(µ) ≥ β + 2.

PROOF. By definition β(ν) = 2⌊log2 s(ν)⌋ and therefore s(ν) ≥ 2^{β/2}. We have s(µ) = 1 + s(ν) + s(κ) ≥ 2^{1+β/2}, and therefore β(µ) ≥ β + 2.

Single rotation. The amortized cost of a single rotation shown in Figure 34 is 1 for performing the rotation plus the change in the potential:

  a = 1 + β′(ν) + β′(µ) − β(ν) − β(µ)
    ≤ 1 + 3[β′(ν) − β(ν)]

because β′(µ) ≤ β(µ) and β(ν) ≤ β′(ν).

Figure 34: The size of µ decreases and that of ν increases from before to after the rotation.

Roller-coaster rotation. The amortized cost of a roller-coaster rotation shown in Figure 35 is

  a = 2 + β′(ν) + β′(µ) + β′(κ) − β(ν) − β(µ) − β(κ)
    ≤ 2 + 2[β′(ν) − β(ν)]

because β′(κ) ≤ β(κ), β′(µ) ≤ β′(ν), and β(ν) ≤ β(µ). We distinguish two cases to prove that a is bounded from above by 3[β′(ν) − β(ν)]. In both cases, the drop in the potential pays for the two single rotations.

Figure 35: If in the middle tree the balance of ν is the same as the balance of µ then by the Balance Lemma the balance of κ is less than that common balance.

Case β′(ν) > β(ν). The difference between the balance of ν before and after the roller-coaster rotation is at least 2. Hence a ≤ 3[β′(ν) − β(ν)].

Case β′(ν) = β(ν) = β. Then the balances of nodes ν and µ in the middle tree in Figure 35 are also equal to β. The Balance Lemma thus implies that the balance of κ in that middle tree is at most β − 2. But since the balance of κ after the roller-coaster rotation is the same as in the middle tree, we have β′(κ) < β. Hence a ≤ 0 = 3[β′(ν) − β(ν)].


Double rotation. The amortized cost of a double rotation shown in Figure 36 is

  a = 2 + β′(ν) + β′(µ) + β′(κ) − β(ν) − β(µ) − β(κ)
    ≤ 2 + [β′(ν) − β(ν)]

because β′(κ) ≤ β(κ) and β′(µ) ≤ β(µ). We again distinguish two cases to prove that a is bounded from above by 3[β′(ν) − β(ν)]. In both cases, the drop in the potential pays for the two single rotations.

Case β′(ν) > β(ν). The difference is at least 2, which implies a ≤ 3[β′(ν) − β(ν)], as before.

Case β′(ν) = β(ν) = β. Then β(µ) = β(κ) = β. We have β′(µ) < β′(ν) or β′(κ) < β′(ν) by the Balance Lemma. Hence a ≤ 0 = 3[β′(ν) − β(ν)].

Figure 36: In a double rotation, the sizes of µ and κ decrease from before to after the operation.

Dictionary operations. In summary, we showed that the amortized cost of splaying a node ν in a binary search tree with root ̺ is at most 2 + 3[β(̺) − β(ν)]. We now use this result to show that splay trees have good amortized performance for all standard dictionary operations and more.

To access an item we first splay it to the root and return the root even if it does not contain x. The amortized cost is O(β(̺)).

Given an item x, we can split a splay tree into two, one containing all items smaller than or equal to x and the other all items larger than x, as illustrated in Figure 37. The amortized cost is the amortized cost for splaying plus the increase in the potential, which we denote as Φ′ − Φ.

Figure 37: After splaying x to the root, we split the tree by unlinking the right subtree.

Recall that the potential of a collection of trees is the sum of the balances of all nodes. Splitting the tree decreases the number of descendents and therefore the balance of the root, which implies that Φ′ − Φ < 0. It follows that the amortized cost of a split operation is less than that of a splay operation and therefore in O(β(̺)).

Two splay trees can be joined into one if all items in one tree are smaller than all items in the other tree, as illustrated in Figure 38.

Figure 38: We first splay the maximum in the tree with the smaller items and then link the two trees.

The cost for splaying the maximum in the first tree is O(β(̺1)). The potential increase caused by linking the two trees is

  Φ′ − Φ ≤ 2⌊log2 (s(̺1) + s(̺2))⌋
         ≤ 2 log2 s(̺1) + 2 log2 s(̺2).

The amortized cost of joining is thus O(β(̺1) + β(̺2)).

To insert a new item, x, we split the tree. If x is already in the tree, we undo the split operation by linking the two trees. Otherwise, we make the two trees the left and right subtrees of a new node storing x. The amortized cost for splaying is O(β(̺)). The potential increase caused by linking is

  Φ′ − Φ ≤ 2⌊log2 (s(̺1) + s(̺2) + 1)⌋ = β(̺).

The amortized cost of an insertion is thus O(β(̺)).

To delete an item, we splay it to the root, remove the root, and join the two subtrees. Removing x decreases the potential, and the amortized cost of joining the two subtrees is at most O(β(̺)). This implies that the amortized cost of a deletion is at most O(β(̺)).
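The insertion just described translates into a few lines of C. The following is a hedged sketch of ours, not the notes' code; it assumes the splay function sketched earlier and a hypothetical make_node helper, and it relies on the splayed root being the inorder neighbor of x when x is absent:

Node *make_node(int x);  /* hypothetical allocator */

Node *insert(Node *root, int x) {
    root = splay(root, x);
    if (root != NULL && root->info == x) return root;  /* already present */
    Node *nu = make_node(x);
    if (root == NULL) return nu;
    if (root->info < x) {      /* split: root and its left part go left */
        nu->l = root;
        nu->r = root->r;
        root->r = NULL;
    } else {                   /* split the other way round */
        nu->r = root;
        nu->l = root->l;
        root->l = NULL;
    }
    return nu;
}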


Weighted search. A nice property of splay trees not shared by most other balanced trees is that they automatically adapt to biased search probabilities. It is plausible that this would be the case because items that are often accessed tend to live at or near the root of the tree. The analysis is somewhat involved and we only state the result. Each item or node has a positive weight, w(ν) > 0, and we define W = Σ_ν w(ν). We have the following generalization of the Investment Lemma, which we state without proof.

WEIGHTED INVESTMENT LEMMA. The amortized cost of splaying a node ν in a tree with total weight W is at most 2 + 3 log2 (W/w(ν)).

It can be shown that this result is asymptotically best possible. In other words, the amortized search time in a splay tree is at most a constant times the optimum, which is what we achieve with an optimum weighted binary search tree. In contrast to splay trees, optimum trees are expensive to construct and they require explicit knowledge of the weights.


Second Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is October 02.

Problem 1. (20 = 12 + 8 points). Consider an array A[1..n] for which we know that A[1] ≥ A[2] and A[n − 1] ≤ A[n]. We say that i is a local minimum if A[i − 1] ≥ A[i] ≤ A[i + 1]. Note that A has at least one local minimum.

(a) We can obviously find a local minimum in time O(n). Describe a more efficient algorithm that does the same.
(b) Analyze your algorithm.

Problem 2. (20 points). A vertex cover for a tree is a subset V of its vertices such that each edge has at least one endpoint in V. It is minimum if there is no other vertex cover with a smaller number of vertices. Given a tree with n vertices, describe an O(n)-time algorithm for finding a minimum vertex cover. (Hint: use dynamic programming or the greedy method.)

Problem 3. (20 points). Consider a red-black tree formed by the sequential insertion of n > 1 items. Argue that the resulting tree has at least one red edge. [Notice that we are talking about a red-black tree formed by insertions. Without this assumption, the tree could of course consist of black edges only.]

Problem 4. (20 points). Prove that 2n rotations suffice to transform any binary search tree into any other binary search tree storing the same n items.

Problem 5. (20 = 5 + 5 + 5 + 5 points). Consider a collection of items, each consisting of a key and a cost. The keys come from a totally ordered universe and the costs are real numbers. Show how to maintain a collection of items under the following operations:

(a) ADD(k, c): assuming no item in the collection has key k yet, add an item with key k and cost c to the collection;
(b) REMOVE(k): remove the item with key k from the collection;
(c) MAX(k1, k2): assuming k1 ≤ k2, report the maximum cost among all items with keys k ∈ [k1, k2];
(d) COUNT(c1, c2): assuming c1 ≤ c2, report the number of items with cost c ∈ [c1, c2].

Each operation should take at most O(log n) time in the worst case, where n is the number of items in the collection when the operation is performed.


III  PRIORITIZING

10  Heaps and Heapsort
11  Fibonacci Heaps
12  Solving Recurrence Relations
    Third Homework Assignment


10 Heaps and Heapsort
A heap is a data structure that stores a set and allows fast access to the item with highest priority. It is the basis of a fast implementation of selection sort. On the average, this algorithm is a little slower than quicksort but it is not sensitive to the input ordering or to random bits and runs about as fast in the worst case as on the average.

Priority queues. A data structure implements the priority queue abstract data type if it supports at least the following operations:

  void INSERT(item),
  item FINDMIN(void),
  void DELETEMIN(void).

The operations are applied to a set of items with priorities. The priorities are totally ordered so any two can be compared. To avoid any confusion, we will usually refer to the priorities as ranks. We will always use integers as priorities and follow the convention that smaller ranks represent higher priorities. In many applications, FINDMIN and DELETEMIN are combined:

  item EXTRACTMIN(void)
    r = FINDMIN; DELETEMIN; return r.

Function EXTRACTMIN removes and returns the item with smallest rank.

Heap. A heap is a particularly compact priority queue. We can think of it as a binary tree with items stored in the internal nodes, as in Figure 39. Each level is full, except possibly the last, which is filled from left to right until we run out of items. The items are stored in heap-order: every node µ has a rank larger than or equal to the rank of its parent. Symmetrically, µ has a rank less than or equal to the ranks of both its children. As a consequence, the root contains the item with smallest rank.

Figure 39: Ranks increase or, more precisely, do not decrease from top to bottom.

We store the nodes of the tree in a linear array, level by level from top to bottom and each level from left to right, as shown in Figure 40. The embedding saves explicit pointers otherwise needed to establish parent-child relations. Specifically, we can find the children and parent of a node by index computation: the left child of A[i] is A[2i], the right child is A[2i + 1], and the parent is A[⌊i/2⌋]. The item with minimum rank is stored in the first element:

  item FINDMIN(int n)
    assert n ≥ 1; return A[1].

Since the index along a path at least doubles each step, paths can have length at most log2 n.

Figure 40: The binary tree is laid out in a linear array. The root is placed in A[1], its children follow in A[2] and A[3], etc.

Deleting the minimum. We first study the problem of repairing the heap-order if it is violated at the root, as shown in Figure 41. Let n be the length of the array.

Figure 41: The root is exchanged with the smaller of its two children. The operation is repeated along a single path until the heap-order is repaired.

We repair the heap-order by a sequence of swaps along a single path. Each swap is between an item and the smaller of its children:


  void SIFT-DN(int i, n)
    if 2i ≤ n then
      k = arg min{A[2i], A[2i + 1]};
      if A[k] < A[i] then SWAP(i, k); SIFT-DN(k, n) endif
    endif.

Here we assume that A[n + 1] is defined and larger than A[n]. Since a path has at most log2 n edges, the time to repair the heap-order takes time at most O(log n). To delete the minimum we overwrite the root with the last element, shorten the heap, and repair the heap-order:

  void DELETEMIN(int ∗n)
    A[1] = A[∗n]; ∗n−−; SIFT-DN(1, ∗n).

Instead of the variable that stores n, we pass a pointer to that variable, ∗n, in order to use it as input and output parameter.

Inserting. Consider repairing the heap-order if it is violated at the last position of the heap. In this case, the item moves up the heap until it reaches a position where its rank is at least as large as that of its parent.

  void SIFT-UP(int i)
    if i ≥ 2 then
      k = ⌊i/2⌋;
      if A[i] < A[k] then SWAP(i, k); SIFT-UP(k) endif
    endif.

An item is added by first expanding the heap by one element, placing the new item in the position that just opened up, and repairing the heap-order.

  void INSERT(int ∗n, item x)
    ∗n++; A[∗n] = x; SIFT-UP(∗n).

A heap supports FINDMIN in constant time and INSERT and DELETEMIN in time O(log n) each.

Sorting. Priority queues can be used for sorting. The first step throws all items into the priority queue, and the second step takes them out in order. Assuming the items are already stored in the array, the first step can be done by repeated heap repair:

  for i = 1 to n do SIFT-UP(i) endfor.

In the worst case, the i-th item moves up all the way to the root. The number of exchanges is therefore at most Σ_{i=1}^{n} log2 i ≤ n log2 n. The upper bound is asymptotically tight because half the terms in the sum are at least log2 (n/2) = log2 n − 1. It is also possible to construct the initial heap in time O(n) by building it from bottom to top. We modify the first step accordingly, and we implement the second step to rearrange the items in sorted order:

  void HEAPSORT(int n)
    for i = n downto 1 do SIFT-DN(i, n) endfor;
    for i = n downto 1 do SWAP(i, 1); SIFT-DN(1, i − 1) endfor.

At each step of the first for-loop, we consider the subtree with root A[i]. At this moment, the items in the left and right subtrees rooted at A[2i] and A[2i + 1] are already heaps. We can therefore use one call to function SIFT-DN to make the subtree with root A[i] a heap. We will prove shortly that this bottom-up construction of the heap takes time only O(n). Figure 42 shows the array after each iteration of the second for-loop. Note how the heap gets smaller by one element each step.

Figure 42: Each step moves the last heap element to the root and thus shrinks the heap. The circles mark the items involved in the sift-down operation.

A single sift-down operation takes time O(log n), and in total HEAPSORT takes time O(n log n). In addition to the input array, HEAPSORT uses a constant number of variables


and memory for the recursion stack used by SIFT-DN. We can save the memory for the stack by writing function SIFT-DN as an iteration. The sort can be changed to non-decreasing order by reversing the order of items in the heap.

Analysis of heap construction. We return to proving that the bottom-up approach to constructing a heap takes only O(n) time. Assuming the worst case, in which every node sifts down all the way to the last level, we draw the swaps as edges in a tree; see Figure 43. To avoid

Figure 43: Each node generates a path that shares no edges with the paths of the other nodes.

drawing any edge twice, we always first swap to the right and then continue swapping to the left until we arrive at the last level. This introduces only a small inaccuracy in our estimate. The paths cover each edge once, except for the edges on the leftmost path, which are not covered at all. The number of edges in the tree is n − 1, which implies that the total number of swaps is less than n. Equivalently, the amortized number of swaps per item is less than 1. This is in striking contrast to sorting, which takes an amortized number of about log2 n comparisons per item. The difference between 1 and log2 n may be interpreted as a measure of how far from sorted a heap-ordered array still is.
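For reference, here is a compact, self-contained C version of the two-loop HEAPSORT above. It is our own translation to 0-based array indexing, so the child indices become 2i + 1 and 2i + 2:

#include <stdio.h>

void swap(int A[], int i, int k) { int t = A[i]; A[i] = A[k]; A[k] = t; }

/* Sift A[i] down in the min-heap A[0..n-1]. */
void sift_dn(int A[], int i, int n) {
    int k = 2 * i + 1;                        /* left child */
    if (k >= n) return;
    if (k + 1 < n && A[k + 1] < A[k]) k++;    /* smaller of the two children */
    if (A[k] < A[i]) { swap(A, i, k); sift_dn(A, k, n); }
}

void heapsort(int A[], int n) {
    for (int i = n - 1; i >= 0; i--) sift_dn(A, i, n);            /* build heap */
    for (int i = n - 1; i >= 1; i--) { swap(A, i, 0); sift_dn(A, 0, i); }
}

int main(void) {
    int A[] = {2, 5, 7, 6, 9, 8, 15, 8, 7, 10, 12, 13};
    int n = sizeof A / sizeof A[0];
    heapsort(A, n);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);  /* non-increasing order */
    printf("\n");
    return 0;
}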


11 Fibonacci Heaps
The Fibonacci heap is a data structure implementing the priority queue abstract data type, just like the ordinary heap but more complicated and asymptotically faster for some operations. We first introduce binomial trees, which are special heap-ordered trees, and then explain Fibonacci heaps as collections of heap-ordered trees.

Binomial trees. The binomial tree of height h is a tree obtained from two binomial trees of height h − 1, by linking the root of one to the other. The binomial tree of height 0 consists of a single node. Binomial trees of heights up to 4 are shown in Figure 44.

Figure 44: Binomial trees of heights 0, 1, 2, 3, 4. Each tree is obtained by linking two copies of the previous tree.

Each step in the construction increases the height by one, increases the degree (the number of children) of the root by one, and doubles the size of the tree. It follows that a binomial tree of height h has root degree h and size 2^h. The root has the largest degree of any node in the binomial tree, which implies that every node in a binomial tree with n nodes has degree at most log2 n.

To store any set of items with priorities, we use a small collection of binomial trees. For an integer n, let ni be the i-th bit in the binary notation, so we can write n = Σ_{i≥0} ni 2^i. To store n items, we use a binomial tree of size 2^i for each ni = 1. The total number of binomial trees is thus the number of 1's in the binary notation of n, which is at most log2 (n + 1). The collection is referred to as a binomial heap. The items in each binomial tree are stored in heap-order. There is no specific relationship between the items stored in different binomial trees. The item with minimum key is thus stored in one of the logarithmically many roots, but it is not prescribed ahead of time in which one. An example is shown in Figure 45 where 11 = 1011_2 items are stored in three binomial trees with sizes 8, 2, and 1. In order to add a new item to the set, we create a new binomial tree of size 1 and we successively link binomial trees as dictated by the rules of adding 1 to the binary notation of n. In the example, we get 1011_2 + 1_2 = 1100_2. The new collection thus consists of two binomial trees with sizes 8 and 4. The size 8 tree is the old one, and the size 4 tree is obtained by first linking the two size 1 trees and then linking the resulting size 2 tree to the old size 2 tree. All this is illustrated in Figure 45.

Figure 45: Adding the shaded node to a binomial heap consisting of three binomial trees.

Fibonacci heaps. A Fibonacci heap is a collection of heap-ordered trees. Ideally, we would like it to be a collection of binomial trees, but we need more flexibility. It will be important to understand how exactly the nodes of a Fibonacci heap are connected by pointers. Siblings are organized in doubly-linked cyclic lists, and each node has a pointer to its parent and a pointer to one of its children, as shown in Figure 46.

Figure 46: The Fibonacci heap representation of the first collection of heap-ordered trees in Figure 45.

Besides the pointers, each node stores

a key, its degree, and a bit that can be used to mark or unmark the node. The roots of the heap-ordered trees are doubly-linked in a cycle, and there is an explicit pointer to the root that stores the item with the minimum key. Figure 47 illustrates a few basic operations we perform on a Fibonacci heap. Given two heap-ordered trees, we link them by making the root with the bigger key the child of the other root. To unlink a heap-ordered tree or subtree, we remove its root from the doubly-linked cycle. Finally, to merge two cycles, we cut both open and connect them at their ends. Any one of these three operations takes only constant time.

Figure 47: Cartoons for linking two trees, unlinking a tree, and merging two cycles.

Potential function. A Fibonacci heap supports a variety of operations, including the standard ones for priority queues. We use a potential function to analyze their amortized cost applied to an initially empty Fibonacci heap. Letting ri be the number of roots in the root cycle and mi the number of marked nodes, the potential after the i-th operation is Φi = ri + 2mi. When we deal with a collection of Fibonacci heaps, we define its potential as the sum of individual potentials. The initial Fibonacci heap is empty, so Φ0 = 0. As usual, we let ci be the actual cost and ai = ci + Φi − Φi−1 the amortized cost of the i-th operation. Since Φ0 = 0 and Φi ≥ 0 for all i, the actual cost is less than the amortized cost:

  Σ_{i=1}^{n} ci ≤ Σ_{i=1}^{n} ai = rn + 2mn + Σ_{i=1}^{n} ci.

For some of the operations, it is fairly easy to compute the amortized cost. We get the minimum by returning the key in the marked root. This operation does not change the potential and its amortized and actual cost is ai = ci = 1. We meld two Fibonacci heaps, H1 and H2, by first merging the two root circles and second adjusting the pointer to the minimum key. We have

  ri(H) = ri−1(H1) + ri−1(H2),
  mi(H) = mi−1(H1) + mi−1(H2),

which implies that there is no change in potential. The amortized and actual cost is therefore ai = ci = 1. We insert a key into a Fibonacci heap by first creating a new Fibonacci heap that stores only the new key and second melding the two heaps. We have one more node in the root cycle so the change in potential is Φi − Φi−1 = 1. The amortized cost is therefore ai = ci + 1 = 2.

Deletemin. Next we consider the somewhat more involved operation of deleting the minimum key, which is done in four steps:

Step 1. Remove the node with minimum key from the root cycle.
Step 2. Merge the root cycle with the cycle of children of the removed node.
Step 3. As long as there are two roots with the same degree link them.
Step 4. Recompute the pointer to the minimum key.

For Step 3, we use a pointer array R. Initially, R[i] = NULL for each i. For each root ̺ in the root cycle, we execute the following iteration.

  i = ̺ → degree;
  while R[i] ≠ NULL do
    ̺′ = R[i]; R[i] = NULL; ̺ = LINK(̺, ̺′); i++
  endwhile;
  R[i] = ̺.

To analyze the amortized cost for deleting the minimum, let D(n) be the maximum possible degree of any node in a Fibonacci heap of n nodes. The number of linking operations in Step 3 is the number of roots we start with, which is less than ri−1 + D(n), minus the number of roots we end up with, which is ri. After Step 3, all roots have different degrees, which implies ri ≤ D(n) + 1. It follows that the actual cost for the four steps is

  ci ≤ 1 + 1 + (ri−1 + D(n) − ri) + (D(n) + 1) = 3 + 2D(n) + ri−1 − ri.

The potential change is Φi − Φi−1 = ri − ri−1. The amortized cost is therefore ai = ci + Φi − Φi−1 ≤ 2D(n) + 3. We will prove next time that the maximum possible degree is at most logarithmic in the size of the Fibonacci heap, D(n) < 2 log2 (n + 1). This implies that deleting the minimum has logarithmic amortized cost.
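Before turning to the remaining operations, here is a minimal C sketch of the node record just described (the field names and the linking helper are ours; the notes do not fix a concrete layout):

#include <stdbool.h>
#include <stddef.h>

typedef struct FibNode FibNode;
struct FibNode {
    int      key;
    int      degree;        /* number of children */
    bool     marked;        /* lost a child since it was last linked? */
    FibNode *parent;        /* NULL for roots */
    FibNode *child;         /* some child, or NULL */
    FibNode *left, *right;  /* doubly-linked cyclic list of siblings */
};

/* Make b a child of a, assuming key(a) <= key(b) and that b has
   already been removed from the root cycle; constant time. */
void link_under(FibNode *a, FibNode *b) {
    b->parent = a;
    if (a->child == NULL) {
        a->child = b; b->left = b->right = b;
    } else {                         /* splice b into a's child cycle */
        b->right = a->child->right;
        b->left = a->child;
        a->child->right->left = b;
        a->child->right = b;
    }
    a->degree++;
    b->marked = false;
}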


Decreasekey and delete. Besides deletemin, we also have operations that delete an arbitrary item and that decrease the key of an item. Both change the structure of the heap-ordered trees and are the reason why a Fibonacci heap is not a collection of binomial trees but of more general heap-ordered trees. The decreasekey operation replaces the item with key x stored in the node ν by x − ∆, where ∆ ≥ 0. We will see that this can be done more efficiently than to delete x and to insert x − ∆. We decrease the key in four steps.

Step 1. Unlink the tree rooted at ν.
Step 2. Decrease the key in ν by ∆.
Step 3. Add ν to the root cycle and possibly update the pointer to the minimum key.
Step 4. Do cascading cuts.

We will explain cascading cuts shortly, after explaining the four steps we take to delete a node ν. Before we delete a node ν, we check whether ν = min, and if it is then we delete the minimum as explained above. Assume therefore that ν ≠ min.

Step 1. Unlink the tree rooted at ν.
Step 2. Merge the root-cycle with the cycle of ν's children.
Step 3. Dispose of ν.
Step 4. Do cascading cuts.

Figure 48 illustrates the effect of decreasing a key and of deleting a node. Both operations create trees that are not binomial, and we use cascading cuts to make sure that the shapes of these trees are not very different from the shapes of binomial trees.

Figure 48: A Fibonacci heap initially consisting of three binomial trees modified by a decreasekey and a delete operation.

Cascading cuts. Let ν be a node that becomes the child of another node at time t. We mark ν when it loses its first child after time t. Then we unmark ν, unlink it, and add it to the root-cycle when it loses its second child thereafter. We call this operation a cut, and it may cascade because one cut can cause another, and so on. Figure 49 illustrates the effect of cascading in a heap-ordered tree with two marked nodes. The first step decreases key 10 to 7, and the second step cuts first node 5 and then node 4.

Figure 49: The effect of cascading after decreasing 10 to 7. Marked nodes are shaded.

Summary analysis. As mentioned earlier, we will prove D(n) < 2 log2 (n + 1) next time. Assuming this bound, we are able to compute the amortized cost of all operations. The actual cost of Step 4 in decreasekey or in delete is the number of cuts, ci. The potential changes because there are ci new roots and ci fewer marked nodes. Also, the last cut may introduce a new mark. Thus

  Φi − Φi−1 = ri − ri−1 + 2mi − 2mi−1 ≤ ci − 2ci + 2 = −ci + 2.

The amortized cost is therefore ai = ci + Φi − Φi−1 ≤ ci + (2 − ci) = 2. The first three steps of a decreasekey operation take only a constant amount of actual time and increase the potential by at most a constant amount. It follows that the amortized cost of decreasekey, including the cascading cuts in Step 4, is only a constant. Similarly, the actual cost of a delete operation is at most a constant, but Step 2 may increase the potential of the Fibonacci heap by as much as D(n). The rest is bounded from above by a constant, which implies that the amortized cost of the delete operation is O(log n). We summarize the amortized cost of the various operations supported by the Fibonacci heap:

  find the minimum             O(1)
  meld two heaps               O(1)
  insert a new item            O(1)
  delete the minimum           O(log n)
  decrease the key of a node   O(1)
  delete a node                O(log n)

We will later see graph problems for which the difference in the amortized cost of the decreasekey and delete operations implies a significant improvement in the running time.
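A hedged C sketch of the cut and cascading-cut logic, assuming the FibNode type above and hypothetical helpers remove_from_sibling_cycle and add_to_root_cycle that do the constant-time list surgery (all names are ours):

typedef struct FibHeap FibHeap;  /* hypothetical heap handle */
void remove_from_sibling_cycle(FibNode *parent, FibNode *nu);
void add_to_root_cycle(FibHeap *H, FibNode *nu);

/* Cut nu from its parent and add it to the root cycle, unmarked. */
void cut(FibHeap *H, FibNode *nu, FibNode *parent) {
    remove_from_sibling_cycle(parent, nu);  /* also fixes child pointer, degree */
    add_to_root_cycle(H, nu);
    nu->parent = NULL;
    nu->marked = false;
}

/* A node that loses its second child is cut as well, so the
   cut may cascade up the tree. */
void cascading_cut(FibHeap *H, FibNode *nu) {
    FibNode *parent = nu->parent;
    if (parent == NULL) return;   /* roots are never marked */
    if (!nu->marked) {
        nu->marked = true;        /* first lost child: mark only */
    } else {
        cut(H, nu, parent);       /* second lost child: cut and recurse */
        cascading_cut(H, parent);
    }
}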


12 Solving Recurrence Relations
Recurrence relations are perhaps the most important tool in the analysis of algorithms. We have encountered several methods that can sometimes be used to solve such relations, such as guessing the solution and proving it by induction, or developing the relation into a sum for which we find a closed form expression. We now describe a new method to solve recurrence relations and use it to settle the remaining open question in the analysis of Fibonacci heaps.

Annihilation of sequences. Suppose we are given an infinite sequence of numbers, A = ⟨a0, a1, a2, ...⟩. We can multiply with a constant, shift to the left and add another sequence:

  kA    = ⟨ka0, ka1, ka2, ...⟩,
  LA    = ⟨a1, a2, a3, ...⟩,
  A + B = ⟨a0 + b0, a1 + b1, a2 + b2, ...⟩.

As an example, consider the sequence of powers of two, ai = 2^i. Multiplying with 2 and shifting to the left give the same result. Therefore,

  LA − 2A = ⟨0, 0, 0, ...⟩.

We write LA − 2A = (L − 2)A and think of L − 2 as an operator that annihilates the sequence of powers of 2. In general, L − k annihilates any sequence of the form ⟨ck^i⟩. What does L − k do to other sequences A = ⟨cℓ^i⟩, when ℓ ≠ k?

  (L − k)A = ⟨cℓ, cℓ^2, cℓ^3, ...⟩ − ⟨ck, ckℓ, ckℓ^2, ...⟩
           = (ℓ − k)⟨c, cℓ, cℓ^2, ...⟩ = (ℓ − k)A.

We see that the operator L − k annihilates only one type of sequence and multiplies other similar sequences by a constant.

Multiple operators. Instead of just one, we can apply several operators to a sequence. We may multiply with two constants, k(ℓA) = (kℓ)A, multiply and shift, L(kA) = k(LA), and shift twice, L(LA) = L^2 A. For example, (L − k)(L − ℓ) annihilates all sequences of the form ⟨ck^i + dℓ^i⟩, where we assume k ≠ ℓ. Indeed, L − k annihilates ⟨ck^i⟩ and leaves behind ⟨(ℓ − k)dℓ^i⟩, which is annihilated by L − ℓ. Furthermore, (L − k)(L − ℓ) annihilates no other sequences. More generally, we have

FACT. (L − k1)(L − k2)...(L − kn) annihilates all sequences of the form ⟨c1 k1^i + c2 k2^i + ... + cn kn^i⟩.

What if k = ℓ? To answer this question, we consider

  (L − k)^2 ⟨ik^i⟩ = (L − k)⟨(i + 1)k^{i+1} − ik^{i+1}⟩
                  = (L − k)⟨k^{i+1}⟩
                  = ⟨0⟩.

More generally, we have

FACT. (L − k)^n annihilates all sequences of the form ⟨p(i)k^i⟩, with p(i) a polynomial of degree n − 1.

Since operators annihilate only certain types of sequences, we can determine the sequence if we know the annihilating operator. The general method works in five steps:

1. Write down the annihilator for the recurrence.
2. Factor the annihilator.
3. Determine what sequence each factor annihilates.
4. Put the sequences together.
5. Solve for the constants of the solution by using initial conditions.

Fibonacci numbers. We put the method to a test by considering the Fibonacci numbers defined recursively as follows:

  F0 = 0,
  F1 = 1,
  Fj = Fj−1 + Fj−2, for j ≥ 2.

Writing a few of the initial numbers, we get the sequence 0, 1, 1, 2, 3, 5, 8, .... We notice that L^2 − L − 1 annihilates the sequence because

  (L^2 − L − 1)⟨Fj⟩ = ⟨L^2 Fj⟩ − ⟨L Fj⟩ − ⟨Fj⟩
                    = ⟨Fj+2 − Fj+1 − Fj⟩
                    = ⟨0⟩.

If we factor the operator into its roots, we get L^2 − L − 1 = (L − ϕ)(L − ϕ̄), where

  ϕ = (1 + √5)/2 = 1.618...,
  ϕ̄ = 1 − ϕ = (1 − √5)/2 = −0.618....


The first root is known as the golden ratio because it represents the aspect ratio of a rectangular piece of paper from which we may remove a square to leave a smaller rectangular piece of the same ratio: ϕ : 1 = 1 : ϕ − 1. Thus we know that (L − ϕ)(L − ϕ̄) annihilates ⟨Fj⟩ and this means that the j-th Fibonacci number is of the form Fj = cϕ^j + c̄ϕ̄^j. We get the constant factors from the initial conditions:

  F0 = 0 = c + c̄,
  F1 = 1 = cϕ + c̄ϕ̄.

Solving the two linear equations in two unknowns, we get c = 1/√5 and c̄ = −1/√5. This implies that

  Fj = (1/√5) ((1 + √5)/2)^j − (1/√5) ((1 − √5)/2)^j.

From this viewpoint, it seems surprising that Fj turns out to be an integer for all j. Note that |ϕ| > 1 and |ϕ̄| < 1. It follows that for growing exponent j, ϕ^j goes to infinity and ϕ̄^j goes to zero. This implies that Fj is approximately ϕ^j/√5, and that this approximation becomes more and more accurate as j grows.

Maximum degree. Recall that D(n) is the maximum possible degree of any one node in a Fibonacci heap of size n. We need two easy facts about the kind of trees that arise in Fibonacci heaps in order to show that D(n) is at most logarithmic in n. Let ν be a node of degree j, and let µ1, µ2, ..., µj be its children ordered by the time they were linked to ν.

DEGREE LEMMA. The degree of µi is at least i − 2.

PROOF. Recall that nodes are linked only during the deletemin operation. Right before the linking happens, the two nodes are roots and have the same degree. It follows that the degree of µi was at least i − 1 at the time it was linked to ν. The degree of µi might have been even higher because it is possible that ν lost some of the older children after µi had been linked. After being linked, µi may have lost at most one of its children, for else it would have been cut. Its degree is therefore at least i − 2, as claimed.

SIZE LEMMA. The number of descendents of ν (including ν) is at least Fj+2.

PROOF. Let sj be the minimum number of descendents a node of degree j can have. We have s0 = 1 and s1 = 2. For larger j, we get sj from sj−1 by adding the size of a minimum tree with root degree j − 2, which is sj−2. Hence sj = sj−1 + sj−2, which is the same recurrence relation that defines the Fibonacci numbers. The initial values are shifted two positions so we get sj = Fj+2, as claimed.

Consider a Fibonacci heap with n nodes and let ν be a node with maximum degree D = D(n). The Size Lemma implies n ≥ FD+2. The Fibonacci number with index D + 2 is roughly ϕ^{D+2}/√5. Because |ϕ̄^{D+2}| < √5, we have

  n ≥ (1/√5) ϕ^{D+2} − 1.

After rearranging the terms and taking the logarithm to the base ϕ, we get D ≤ logϕ (√5 (n + 1)) − 2. Recall that logϕ x = log2 x / log2 ϕ and use the calculator to verify that log2 ϕ = 0.694... > 0.5 and logϕ √5 = 1.672... < 2. Hence

  D ≤ log2 (n + 1) / log2 ϕ + logϕ √5 − 2 < 2 log2 (n + 1).
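Before moving on, here is a short C check of ours (not from the notes) that compares the recurrence with the closed form Fj = (ϕ^j − ϕ̄^j)/√5 derived above; it also illustrates how quickly the ϕ̄^j term becomes negligible. Compile with -lm.

#include <stdio.h>
#include <math.h>

int main(void) {
    double sqrt5 = sqrt(5.0);
    double phi = (1.0 + sqrt5) / 2.0;     /* golden ratio */
    double phibar = (1.0 - sqrt5) / 2.0;
    long F0 = 0, F1 = 1;
    for (int j = 2; j <= 40; j++) {
        long Fj = F0 + F1;                 /* the recurrence */
        double binet = (pow(phi, j) - pow(phibar, j)) / sqrt5;
        printf("F_%d = %ld, closed form = %.1f\n", j, Fj, binet);
        F0 = F1; F1 = Fj;
    }
    return 0;
}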

Non-homogeneous terms. We now return to the annihilation method for solving recurrence relations and consider

  aj = aj−1 + aj−2 + 1.

This is similar to the recurrence that defines Fibonacci numbers and describes the minimum number of nodes in an AVL tree, also known as a height-balanced tree. It is defined by the requirement that the heights of the two subtrees of a node differ by at most 1. The smallest tree of height j thus consists of the root, a subtree of height j − 1 and another subtree of height j − 2. We refer to the terms involving ai as the homogeneous terms of the relation and the others as the non-homogeneous terms. We know that L^2 − L − 1 annihilates the homogeneous part, aj = aj−1 + aj−2. If we apply it to the entire relation we get

  (L^2 − L − 1)⟨aj⟩ = ⟨aj+2 − aj+1 − aj⟩ = ⟨1, 1, ...⟩.

The remaining sequence of 1s is annihilated by L − 1. In other words, (L − ϕ)(L − ϕ̄)(L − 1) annihilates ⟨aj⟩ implying that aj = cϕ^j + c̄ϕ̄^j + c′1^j. It remains to find the constants, which we get from the boundary conditions a0 = 1, a1 = 2 and a2 = 4:

  c     + c̄     + c′ = 1,
  ϕc    + ϕ̄c̄    + c′ = 2,
  ϕ^2 c + ϕ̄^2 c̄ + c′ = 4.

Noting that ϕ^2 = ϕ + 1, ϕ̄^2 = ϕ̄ + 1 and ϕ − ϕ̄ = √5, we get c = (5 + 2√5)/5, c̄ = (5 − 2√5)/5, and c′ = −1. The minimum number of nodes of a height-j AVL tree is therefore roughly the constant c times ϕ^j. Conversely, the maximum height of an AVL tree with n = cϕ^j nodes is roughly j = logϕ (n/c) = 1.440... · log2 n + O(1). In words, the height-balancing condition implies logarithmic height.

Transformations. We extend the set of recurrences we can solve by employing transformations that produce relations amenable to the annihilation method. We demonstrate this by considering mergesort, which is another divide-and-conquer algorithm that can be used to sort a list of n items:

Step 1. Recursively sort the left half of the list.
Step 2. Recursively sort the right half of the list.
Step 3. Merge the two sorted lists by simultaneously scanning both from beginning to end.

The running time is described by the solution to the recurrence

  T(1) = 1,
  T(n) = 2T(n/2) + n.

We have no way to work with terms like T(n/2) yet. However, we can transform the recurrence into a more manageable form. Defining n = 2^i and ti = T(2^i) we get

  t0 = 1,
  ti = 2ti−1 + 2^i.

The homogeneous part is annihilated by L − 2. Similarly, the non-homogeneous part is annihilated by L − 2. Hence, (L − 2)^2 annihilates the entire relation and we get ti = (ci + c̄)2^i. Expressed in the original notation we thus have T(n) = (c log2 n + c̄)n = O(n log n). This result is of course no surprise and reconfirms what we learned earlier about sorting.

The Master Theorem. It is sometimes more convenient to look up the solution to a recurrence relation than playing with different techniques to see whether any one can make it yield. Such a cookbook method for recurrence relations of the form

  T(n) = aT(n/b) + f(n)

is provided by the following theorem. Here we assume that a ≥ 1 and b > 1 are constants and that f is a well-behaved positive function.

MASTER THEOREM. Define c = logb a and let ε be an arbitrarily small positive constant. Then

  T(n) = O(n^c)        if f(n) = O(n^{c−ε}),
  T(n) = O(n^c log n)  if f(n) = O(n^c),
  T(n) = O(f(n))       if f(n) = Ω(n^{c+ε}).

The last of the three cases also requires a usually satisfied technical condition, namely that af(n/b) < δf(n) for some constant δ strictly less than 1. For example, this condition is satisfied in T(n) = 2T(n/2) + n^2, which implies T(n) = O(n^2). As another example consider the relation T(n) = 2T(n/2) + n that describes the running time of mergesort. We have c = log2 2 = 1 and f(n) = n = O(n^c). The middle case of the Master Theorem applies and we get T(n) = O(n log n), as before.
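As one more worked instance of the cookbook (our own example, in the same pattern): consider T(n) = 4T(n/2) + n. Here c = log2 4 = 2 and f(n) = n = O(n^{2−ε}) for ε = 1, so the first case applies and T(n) = O(n^2). The technical condition plays no role here because only the last case requires it.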


Third Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is October 14.

Problem 1. (20 = 10 + 10 points). Consider a lazy version of heapsort in which each item in the heap is either smaller than or equal to every other item in its subtree, or the item is identified as uncertified. To certify an item, we certify its children and then exchange it with the smaller child provided it is smaller than the item itself. Suppose A[1..n] is a lazy heap with all items uncertified.

(a) How much time does it take to certify A[1]?
(b) Does certifying A[1] turn A into a proper heap in which every item satisfies the heap property? (Justify your answer.)

Problem 2. (20 points). Recall that Fibonacci numbers are defined recursively as F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2. Prove that the square of the n-th Fibonacci number differs from the product of the two adjacent numbers by one: Fn^2 = Fn−1 · Fn+1 + (−1)^{n+1}.

Problem 3. (20 points). Professor Pinocchio claims that the height of an n-node Fibonacci heap is at most some constant times log2 n. Show that the Professor is mistaken by exhibiting, for any integer n, a sequence of operations that create a Fibonacci heap consisting of just one tree that is a linear chain of n nodes.

Problem 4. (20 = 10 + 10 points). To search in a sorted array takes time logarithmic in the size of the array, but to insert a new item takes linear time. We can improve the running time for insertions by storing the items in several sorted arrays instead of just one. Let n be the number of items, let k = ⌈log2 (n + 1)⌉, and write n = nk−1 nk−2 . . . n0 in binary notation. We use k sorted arrays Ai (some possibly empty), where Ai stores ni 2^i items. Each item is stored exactly once, and the total size of the arrays is indeed Σ_{i=0}^{k−1} ni 2^i = n. Although each individual array is sorted, there is no particular relationship between the items in different arrays.

(a) Explain how to search in this data structure and analyze your algorithm.
(b) Explain how to insert a new item into the data structure and analyze your algorithm, both in worst-case and in amortized time.

Problem 5. (20 = 10 + 10 points). Consider a full binary tree with n leaves. The size of a node, s(ν), is the number of leaves in its subtree and the rank is the floor of the binary logarithm of the size, r(ν) = ⌊log2 s(ν)⌋.

(a) Is it true that every internal node ν has a child whose rank is strictly less than the rank of ν?
(b) Prove that there exists a leaf whose depth (length of path to the root) is at most log2 n.


IV  GRAPH ALGORITHMS

13  Graph Search
14  Shortest Paths
15  Minimum Spanning Trees
16  Union-Find
    Fourth Homework Assignment


13 Graph Search
We can think of graphs as generalizations of trees: they consist of nodes and edges connecting nodes. The main difference is that graphs do not in general represent hierarchical organizations.

Types of graphs. Different applications require different types of graphs. The most basic type is the simple undirected graph that consists of a set V of vertices and a set E of edges. Each edge is an unordered pair (a set) of two vertices. We always assume V is finite, and we write (V choose 2) for the collection of all unordered pairs. Hence E is a subset of (V choose 2). Note that because E is a set, each edge can occur only once. Similarly, because each edge is a set (of two vertices), it cannot connect to the same vertex twice. Vertices u and v are adjacent if {u, v} ∈ E. In this case u and v are called neighbors. Other types of graphs are

  directed:    E ⊆ V × V;
  weighted:    has a weighting function w : E → R;
  labeled:     has a labeling function ℓ : V → Z;
  non-simple:  there are loops and multi-edges.

A loop is like an edge, except that it connects to the same vertex twice. A multi-edge consists of two or more edges connecting the same two vertices.

Representation. The two most popular data structures for graphs are direct representations of adjacency. Let V = {0, 1, ..., n − 1} be the set of vertices. The adjacency matrix is the n-by-n matrix A = (aij) with aij = 1 if {i, j} ∈ E and aij = 0 if {i, j} ∉ E. For undirected graphs, we have aij = aji, so A is symmetric. For weighted graphs, we encode more information than just the existence of an edge and define aij as the weight of the edge connecting i and j.

Figure 50: A simple undirected graph with vertices 0, 1, 2, 3, 4 and edges {0, 1}, {1, 2}, {2, 3}, {3, 0}, {3, 4}.

The adjacency matrix of the graph in Figure 50 is

        0 1 0 1 0
        1 0 1 0 0
  A =   0 1 0 1 0
        1 0 1 0 1
        0 0 0 1 0 ,

which is symmetric. Irrespective of the number of edges, the adjacency matrix has n^2 elements and thus requires a quadratic amount of space. Often, the number of edges is quite small, maybe not much larger than the number of vertices. In these cases, the adjacency matrix wastes memory, and a better choice is a sparse matrix representation referred to as adjacency lists, which is illustrated in Figure 51. It consists of a linear array V for the vertices and a list of neighbors for each vertex. For most algorithms, we assume that vertices and edges are stored in structures containing a small number of fields:

  struct Vertex {int d, f, π; Edge ∗adj};
  struct Edge {int v; Edge ∗next}.

The d, f, π fields will be used to store auxiliary information used or created by the algorithms.

Figure 51: The adjacency list representation of the graph in Figure 50. Each edge is represented twice, once for each endpoint.

Depth-first search. Since graphs are generally not ordered, there are many sequences in which the vertices can be visited. In fact, it is not entirely straightforward to make sure that each vertex is visited once and only once. A useful method is depth-first search. It uses a global variable, time, which is incremented and used to leave time-stamps behind to avoid repeated visits.


void VISIT(int i)
 1  time++; V[i].d = time;
    forall outgoing edges ij do
 2    if V[j].d = 0 then
 3      V[j].π = i; VISIT(j)
      endif
    endfor;
 4  time++; V[i].f = time.

The test in line 2 checks whether the neighbor j of i has already been visited. The assignment in line 3 records that the vertex is visited from vertex i. A vertex is first stamped in line 1 with the time at which it is encountered. A vertex is second stamped in line 4 with the time at which its visit has been completed. To prepare the search, we initialize the global time variable to 0, label all vertices as not yet visited, and call VISIT for all yet unvisited vertices.

time = 0;
forall vertices i do V[i].d = 0 endfor;
forall vertices i do
  if V[i].d = 0 then V[i].π = 0; VISIT(i) endif
endfor.

Let n be the number of vertices and m the number of edges in the graph. Depth-first search visits every vertex once and examines every edge twice, once for each endpoint. The running time is therefore O(n + m), which is proportional to the size of the graph and therefore optimal.
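The following short Python sketch mirrors the two-stamp traversal above; the dictionary-based graph representation and the function names are illustrative assumptions, not part of the original pseudocode.

def dfs(graph):
    # graph: dict mapping each vertex to the list of its neighbors
    d, f, pi = {}, {}, {}
    time = 0
    def visit(i):
        nonlocal time
        time += 1; d[i] = time          # first stamp (line 1)
        for j in graph[i]:
            if j not in d:              # the test of line 2
                pi[j] = i               # line 3
                visit(j)
        time += 1; f[i] = time          # second stamp (line 4)
    for i in graph:
        if i not in d:
            pi[i] = None
            visit(i)
    return d, f, pi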
DFS forest. Figure 52 illustrates depth-first search by showing the time-stamps d and f and the pointers π indicating the predecessors in the traversal. We call an edge {i, j} ∈ E a tree edge if i = V[j].π or j = V[i].π and a back edge, otherwise. The tree edges form the DFS forest of the graph. The forest is a tree if the graph is connected and a collection of two or more trees if it is not connected.

Figure 52: The traversal starts at the vertex with time-stamp 1. Each node is stamped twice, once when it is first encountered and another time when its visit is complete.

Figure 53 shows the DFS forest of the graph in Figure 52 which, in this case, consists of a single tree. The time-stamps d are consistent with the preorder traversal of the DFS forest. The time-stamps f are consistent with the postorder traversal. The two stamps can be used to decide, in constant time, whether two nodes in the forest live in different subtrees or one is a descendent of the other.

Figure 53: Tree edges are solid and back edges are dotted.

NESTING LEMMA. Vertex j is a proper descendent of vertex i in the DFS forest iff V[i].d < V[j].d as well as V[j].f < V[i].f.

Similarly, if you have a tree and the preorder and postorder numbers of the nodes, you can determine the relation between any two nodes in constant time.

Directed graphs and relations. As mentioned earlier, we have a directed graph if all edges are directed. A directed graph is a way to think and talk about a mathematical relation. A typical problem where relations arise is scheduling. Some tasks are in a definite order while others are unrelated. An example is the scheduling of undergraduate computer science courses, as illustrated in Figure 54.

Figure 54: A subgraph of the CPS course offering. The courses CPS104 and CPS108 are incomparable, CPS104 is a predecessor of CPS110, and so on.

Abstractly, a relation is a pair (V, E), where V = {0, 1, ..., n − 1} is a finite set of elements and E ⊆ V × V is a finite set of ordered pairs. Instead of (i, j) ∈ E we write i ≺ j and instead of (V, E) we write (V, ≺). If i ≺ j then i is a predecessor of j and j is a successor of i. The terms relation, directed graph, digraph, and network are all synonymous.

Directed acyclic graphs. A cycle in a relation is a sequence i0 ≺ i1 ≺ ... ≺ ik ≺ i0. Even i0 ≺ i0 is a cycle. A linear extension of (V, ≺) is an ordering j0, j1, ..., jn−1 of the elements that is consistent with the relation. Formally this means that jk ≺ jℓ implies k < ℓ. A directed graph without cycles is a directed acyclic graph.

EXTENSION LEMMA. (V, ≺) has a linear extension iff it contains no cycle.

PROOF. "=⇒" is obvious. We prove "⇐=" by induction. A vertex s ∈ V is called a source if it has no predecessor. Assuming (V, ≺) has no cycle, we can prove that V has a source by following edges against their direction. If we return to a vertex that has already been visited, we have a cycle and thus a contradiction. Otherwise we get stuck at a vertex s, which can only happen because s has no predecessor, which means s is a source. Let U = V − {s} and note that (U, ≺) is a relation that is smaller than (V, ≺). Hence (U, ≺) has a linear extension by induction hypothesis. Call this extension X and note that s, X is a linear extension of (V, ≺).

Topological sorting with queue. The problem of constructing a linear extension is called topological sorting. A natural and fast algorithm follows the idea of the proof: find a source s, print s, remove s, and repeat. To expedite the first step of finding a source, each vertex maintains its number of predecessors and a queue stores all sources. First, we initialize this information.

forall vertices j do V[j].d = 0 endfor;
forall vertices i do
  forall successors j of i do V[j].d++ endfor
endfor;
forall vertices j do
  if V[j].d = 0 then ENQUEUE(j) endif
endfor.

Next, we compute the linear extension by repeated deletion of a source.

while queue is non-empty do
  s = DEQUEUE;
  forall successors j of s do
    V[j].d--;
    if V[j].d = 0 then ENQUEUE(j) endif
  endfor
endwhile.

The running time is linear in the number of vertices and edges, namely O(n + m). What happens if there is a cycle in the digraph? We illustrate the above algorithm for the directed acyclic graph in Figure 55. The sequence of vertices added to the queue is also the linear extension computed by the algorithm. If the process starts at vertex a and if the successors of a vertex are ordered by name then we get a, f, d, g, c, h, b, e, which we can check is indeed a linear extension of the relation.

Figure 55: The numbers next to each vertex count the predecessors, which decrease during the algorithm.

Topological sorting with DFS. Another algorithm that can be used for topological sorting is depth-first search. We output a vertex when its visit has been completed, that is, when all its successors and their successors and so on have already been printed. The linear extension is therefore generated from back to front. Figure 56 shows the same digraph as Figure 55 and labels vertices with time stamps. Consider the sequence of vertices in the order of decreasing second time stamp: a(16), f(14), g(13), h(12), d(9), c(8), e(7), b(5). Although this sequence is different from the one computed by the earlier algorithm, it is also a linear extension of the relation.

Figure 56: The numbers next to each vertex are the two time stamps applied by the depth-first search algorithm. The first number gives the time the vertex is encountered, and the second when the visit has been completed.
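Both topological sorting algorithms are easy to implement. Here is a hedged Python sketch of the queue-based version, assuming the digraph is a dictionary from each vertex to the list of its successors; if the digraph has a cycle, the vertices on and behind the cycle never reach predecessor count zero, so the returned sequence has fewer than n vertices.

from collections import deque

def topological_sort(digraph):
    indeg = {v: 0 for v in digraph}     # number of predecessors
    for v in digraph:
        for w in digraph[v]:
            indeg[w] += 1
    queue = deque(v for v in digraph if indeg[v] == 0)   # all sources
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)                 # print the source
        for w in digraph[s]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return order                        # a linear extension if acyclic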


14 Shortest Paths
One of the most common operations in graphs is finding shortest paths between vertices. This section discusses three algorithms for this problem: breadth-first search for unweighted graphs, Dijkstra's algorithm for weighted graphs, and the Floyd-Warshall algorithm for computing distances between all pairs of vertices.

Breadth-first search. We call a graph connected if there is a path between every pair of vertices. A (connected) component is a maximal connected subgraph. Breadth-first search, or BFS, is a way to search a graph. It is similar to depth-first search, but while DFS goes as deep as quickly as possible, BFS is more cautious and explores a broad neighborhood before venturing deeper. The starting point is a vertex s. An example is shown in Figure 57. As before, we call an edge a tree edge if it is traversed by the algorithm. The tree edges define the BFS tree, which we can use to redraw the graph in a hierarchical manner, as in Figure 58. In the case of an undirected graph, no non-tree edge can connect a vertex to an ancestor in the BFS tree. Why?

Figure 57: A sample graph with eight vertices and ten edges labeled by breadth-first search. The label increases from a vertex to its successors in the search.

Figure 58: The tree edges in the redrawing of the graph in Figure 57 are solid, and the non-tree edges are dotted.

We use a queue to turn the idea into an algorithm. First, the graph and the queue are initialized.

forall vertices i do V[i].d = −1 endfor;
V[s].d = 0;
MAKEQUEUE; ENQUEUE(s); SEARCH.

A vertex is processed by adding its unvisited neighbors to the queue. They will be processed in turn.

void SEARCH
  while queue is non-empty do
    i = DEQUEUE;
    forall neighbors j of i do
      if V[j].d = −1 then
        V[j].d = V[i].d + 1; V[j].π = i; ENQUEUE(j)
      endif
    endfor
  endwhile.

The label V[i].d assigned to vertex i during the traversal is the minimum number of edges of any path from s to i. In other words, V[i].d is the length of the shortest path from s to i. The running time of BFS for a graph with n vertices and m edges is O(n + m).
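A compact Python sketch of the same search, with the graph again given as a dictionary of neighbor lists; absent keys in d play the role of the labels V[i].d = −1. Names are illustrative.

from collections import deque

def bfs(graph, s):
    d, pi = {s: 0}, {s: None}
    queue = deque([s])
    while queue:
        i = queue.popleft()
        for j in graph[i]:
            if j not in d:            # j not yet visited
                d[j] = d[i] + 1       # shortest-path length from s
                pi[j] = i
                queue.append(j)
    return d, pi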

Single-source shortest path. BFS can be used to find shortest paths in unweighted graphs. We now extend the algorithm to weighted graphs. Assume V and E are the sets of vertices and edges of a simple, undirected graph with a positive weighting function w : E → R+. The length or weight of a path is the sum of the weights of its edges. The distance between two vertices is the length of the shortest path connecting them. For a given source s ∈ V, we study the problem of finding the distances and shortest paths to all other vertices. Figure 59 illustrates the problem by showing the shortest paths to the source s. In the non-degenerate case, in which no two paths have the same length, the union of all shortest paths to s is a tree, referred to as the shortest path tree. In the degenerate case, we can break ties such that the union of paths is a tree.

Figure 59: The bold edges form shortest paths and together the shortest path tree with root s. It differs by one edge from the breadth-first tree shown in Figure 57.

As before, we grow a tree starting from s. Instead of a queue, we use a priority queue to determine the next vertex to be added to the tree. It stores all vertices not yet in the tree and uses V[i].d for the priority of vertex i. First, we initialize the graph and the priority queue.

V[s].d = 0; V[s].π = −1; INSERT(s);
forall vertices i ≠ s do
  V[i].d = ∞; INSERT(i)
endfor.

After initialization the priority queue stores s with priority 0 and all other vertices with priority ∞.

Dijkstra's algorithm. We mark vertices in the tree to distinguish them from vertices that are not yet in the tree. The priority queue stores all unmarked vertices i with priority equal to the length of the shortest path that goes from i in one edge to a marked vertex and then to s using only marked vertices.

while priority queue is non-empty do
  i = EXTRACTMIN; mark i;
  forall neighbors j of i do
    if j is unmarked then
      V[j].d = min{w(ij) + V[i].d, V[j].d}
    endif
  endfor
endwhile.

Table 3 illustrates the algorithm by showing the information in the priority queue after each iteration of the while-loop operating on the graph in Figure 59. The marking mechanism is not necessary but clarifies the process.

Table 3: Each column shows the contents of the priority queue. Time progresses from left to right.

The algorithm performs n EXTRACTMIN operations and at most m DECREASEKEY operations. We compare the running time under three different data structures used to represent the priority queue. The first is a linear array, as originally proposed by Dijkstra, the second is a heap, and the third is a Fibonacci heap. The results are shown in Table 4. We get the best result with Fibonacci heaps for which the total running time is O(n log n + m).

                array    heap       F-heap
EXTRACTMINs     n²       n log n    n log n
DECREASEKEYs    m        m log m    m

Table 4: Running time of Dijkstra's algorithm for three different implementations of the priority queue holding the yet unmarked vertices.

Correctness. It is not entirely obvious that Dijkstra's algorithm indeed finds the shortest paths to s. To show that it does, we inductively prove that it maintains the following two invariants.

(A) For every unmarked vertex j, V[j].d is the length of the shortest path from j to s that uses only marked vertices other than j.

(B) For every marked vertex i, V[i].d is the length of the shortest path from i to s.

PROOF. Invariant (A) is true at the beginning of Dijkstra's algorithm. To show that it is maintained throughout the process, we need to make sure that shortest paths are computed correctly. Specifically, if we assume Invariant (B) for vertex i then the algorithm correctly updates the priorities V[j].d of all neighbors j of i, and no other priorities change.

At the moment vertex i is marked, it minimizes V[j].d over all unmarked vertices j. Suppose that, at this moment, V[i].d is not the length of the shortest path from i to s. Because of Invariant (A), there is at least one other unmarked vertex on the shortest path. Let the last such vertex be y, as shown in Figure 60. But then V[y].d < V[i].d, which is a contradiction to the choice of i.

Figure 60: The vertex y is the last unmarked vertex on the hypothetically shortest, dashed path that connects i to s.

We used (B) to prove (A) and (A) to prove (B). To make sure we did not create a circular argument, we parametrize the two invariants with the number k of vertices that are marked and thus belong to the currently constructed portion of the shortest path tree. To prove (Ak) we need (Bk) and to prove (Bk) we need (Ak−1). Think of the two invariants as two recursive functions, and for each pair of calls, the parameter decreases by one and thus eventually becomes zero, which is when the argument arrives at the base case.
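For concreteness, here is a Python sketch of Dijkstra's algorithm built on the standard library's binary heap. heapq offers no DECREASEKEY, so the sketch pushes a fresh entry instead and skips stale ones when they surface; this matches, up to constant factors, the heap column of Table 4. The graph representation is an assumption of the sketch.

import heapq

def dijkstra(graph, s):
    # graph: dict mapping each vertex to a list of (neighbor, weight) pairs
    dist = {s: 0}
    marked = set()
    queue = [(0, s)]
    while queue:
        di, i = heapq.heappop(queue)
        if i in marked:
            continue                          # stale entry, skip
        marked.add(i)
        for j, w in graph[i]:
            if j not in marked and di + w < dist.get(j, float('inf')):
                dist[j] = di + w              # the DECREASEKEY step
                heapq.heappush(queue, (dist[j], j))
    return dist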


All-pairs shortest paths. We can run Dijkstra's algorithm n times, once for each vertex as the source, and thus get the distance between every pair of vertices. The running time is O(n² log n + nm) which, for dense graphs, is the same as O(n³). Cubic running time can be achieved with a much simpler algorithm using the adjacency matrix to store distances. The idea is to iterate n times, and after the k-th iteration, the computed distance between vertices i and j is the length of the shortest path from i to j that, other than i and j, contains only vertices of index k or less.

for k = 1 to n do
  for i = 1 to n do
    for j = 1 to n do
      A[i, j] = min{A[i, j], A[i, k] + A[k, j]}
    endfor
  endfor
endfor.

The only information needed to update A[i, j] during the k-th iteration of the outer for-loop are its old value and values in the k-th row and the k-th column of the prior adjacency matrix. This row remains unchanged in this iteration and so does this column. We therefore do not have to use two arrays, writing the new values right into the old matrix. We illustrate the algorithm by showing the adjacency, or distance matrix before the algorithm in Figure 61 and after one iteration in Figure 62.

Figure 61: Adjacency, or distance matrix of the graph in Figure 57. All blank entries store ∞.

Figure 62: Matrix after each iteration. The k-th row and column are shaded and the new, improved distances are highlighted.

The algorithm works for weighted undirected as well as for weighted directed graphs. Its correctness is easily verified inductively. The running time is O(n³).
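A direct Python transcription of the triple loop, writing the improved distances into the matrix in place as just discussed; the input convention (0 on the diagonal, float('inf') for blank entries) is an assumption of this sketch.

def all_pairs_shortest_paths(A):
    # A: n-by-n list of lists; A[i][i] = 0, missing edges are float('inf')
    n = len(A)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]   # improve via vertex k
    return A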


15 Minimum Spanning Trees

When a graph is connected, we may ask how many edges we can delete before it stops being connected. Depending on the edges we remove, this may happen sooner or later. The slowest strategy is to remove edges until the graph becomes a tree. Here we study the somewhat more difficult problem of removing edges with a maximum total weight. The remaining graph is then a tree with minimum total weight. Applications that motivate this question can be found in life support systems modeled as graphs or networks, such as telephone, power supply, and sewer systems.

Free trees. An undirected graph (U, T) is a free tree if it is connected and contains no cycle. We could impose a hierarchy by declaring any one vertex as the root and thus obtain a rooted tree. Here, we have no use for a hierarchical organization and exclusively deal with free trees. The number of edges of a free tree is always one less than the number of vertices. Whenever we add a new edge (connecting two old vertices) we create exactly one cycle. This cycle can be destroyed by deleting any one of its edges, and we get a new free tree, as in Figure 63. Let (V, E) be a connected and undirected graph. A subgraph is another graph (U, T) with U ⊆ V and T ⊆ E. It is a spanning tree if it is a free tree with U = V.

Figure 63: Adding the edge dg to the tree creates a single cycle with vertices d, g, h, f, e, a.

Minimum spanning trees. For the remainder of this section, we assume that we also have a weighting function, w : E → R. The weight of a subgraph is then the total weight of its edges, w(T) = Σ_{e∈T} w(e). A minimum spanning tree, or MST of G is a spanning tree that minimizes the weight. The definitions are illustrated in Figure 64 which shows a graph of solid edges with a minimum spanning tree of bold edges.

Figure 64: The bold edges form a spanning tree of weight 0.9 + 1.2 + 1.3 + 1.4 + 1.1 + 1.2 + 1.6 + 1.9 = 10.6.

A generic algorithm for constructing an MST grows a tree by adding more and more edges. Let A ⊆ E be a subset of some MST of a connected graph (V, E). An edge uv ∈ E − A is safe for A if A ∪ {uv} is also subset of some MST. The generic algorithm adds safe edges until it arrives at an MST.

A = ∅;
while (V, A) is not a spanning tree do
  find a safe edge uv; A = A ∪ {uv}
endwhile.

As long as A is a proper subset of an MST there are safe edges. Specifically, if (V, T) is an MST and A ⊆ T then all edges in T − A are safe for A. The algorithm will therefore succeed in constructing an MST. The only thing that is not yet clear is how to find safe edges quickly.

Cuts. To develop a mechanism for identifying safe edges, we define a cut, which is a partition of the vertex set into two complementary sets, V = W ∪ (V − W). It is crossed by an edge uv ∈ E if u ∈ W and v ∈ V − W, and it respects an edge set A if A contains no crossing edge. The definitions are illustrated in Figure 65.

Figure 65: The vertices inside and outside the shaded regions form a cut that respects the collection of solid edges. The dotted edges cross the cut.

CUT LEMMA. Let A be a subset of an MST and consider a cut W ∪ (V − W) that respects A. If uv is a crossing edge with minimum weight then uv is safe for A.

PROOF. Consider a minimum spanning tree (V, T) with A ⊆ T. If uv ∈ T then we are done. Otherwise, let T′ = T ∪ {uv}. Because T is a tree, there is a unique path from u to v in T. We have u ∈ W and v ∈ V − W, so the path switches at least once between the two sets. Suppose it switches along xy, as in Figure 66. Edge xy crosses the cut, and since A contains no crossing edges we have xy ∉ A. Because uv has minimum weight among crossing edges we have w(uv) ≤ w(xy). Define T′′ = T′ − {xy}. Then (V, T′′) is a spanning tree and because w(T′′) = w(T) − w(xy) + w(uv) ≤ w(T) it is a minimum spanning tree. The claim follows because A ∪ {uv} ⊆ T′′.

Figure 66: Adding uv creates a cycle and deleting xy destroys the cycle.

A typical application of the Cut Lemma takes a component of (V, A) and defines W as the set of vertices of that component. The complementary set V − W contains all other vertices, and crossing edges connect the component with its complement.

Prim's algorithm. Prim's algorithm chooses safe edges to grow the tree as a single component from an arbitrary first vertex s. Similar to Dijkstra's algorithm, the vertices that do not yet belong to the tree are stored in a priority queue. For each vertex i outside the tree, we define its priority V[i].d equal to the minimum weight of any edge that connects i to a vertex in the tree. If there is no such edge then V[i].d = ∞. In addition to the priority, we store the index of the other endpoint of the minimum weight edge. We first initialize this information.

V[s].d = 0; V[s].π = −1; INSERT(s);
forall vertices i ≠ s do
  V[i].d = ∞; INSERT(i)
endfor.

The main algorithm expands the tree by one edge at a time. It uses marks to distinguish vertices in the tree from vertices outside the tree.

while priority queue is non-empty do
  i = EXTRACTMIN; mark i;
  forall neighbors j of i do
    if j is unmarked and w(ij) < V[j].d then
      V[j].d = w(ij); V[j].π = i
    endif
  endfor
endwhile.

After running the algorithm, the MST can be recovered from the π-fields of the vertices. The algorithm together with its initialization phase performs n = |V| insertions into the priority queue, n extractmin operations, and at most m = |E| decreasekey operations. Using the Fibonacci heap implementation, we get a running time of O(n log n + m), which is the same as for constructing the shortest-path tree with Dijkstra's algorithm.

Kruskal's algorithm. Kruskal's algorithm is another implementation of the generic algorithm. It adds edges in a sequence of non-decreasing weight. At any moment, the chosen edges form a collection of trees. These trees merge to form larger and fewer trees, until they eventually combine into a single tree. The algorithm uses a priority queue for the edges and a set system for the vertices. In this context, the term 'system' is just another word for 'set', but we will use it exclusively for sets whose elements are themselves sets. Implementations of the set system will be discussed in the next lecture. Initially, A = ∅, the priority queue contains all edges, and the system contains a singleton set for each vertex, C = {{u} | u ∈ V}. The algorithm finds an edge with minimum weight that connects two components defined by A. We set W equal to the vertex set of one component and use the Cut Lemma to show that this edge is safe for A. The edge is added to A and the process is repeated. The algorithm halts when only one tree is left, which is the case when A contains n − 1 = |V| − 1 edges.

A = ∅;
while |A| < n − 1 do
  uv = EXTRACTMIN;
  find P, Q ∈ C with u ∈ P and v ∈ Q;
  if P ≠ Q then
    A = A ∪ {uv}; merge P and Q
  endif
endwhile.

The running time is O(m log m) for the priority queue operations plus some time for maintaining C. There are two operations for the set system, namely finding the set that contains a given element, and merging two sets into one.

An example. We illustrate Kruskal's algorithm by applying it to the weighted graph in Figure 64. The sequence of edges sorted by weight is cd, fi, fh, ad, ae, hi, de, ef, ac, gh, dg, bf, eg, bi, ab. The evolution of the set system is illustrated in Figure 67, and the MST computed with Kruskal's algorithm and indicated with dotted edges is the same as in Figure 64. The edges cd, fi, fh, ad, ae are all added to the tree. The next two edges, hi and de, are not added because they each have both endpoints in the same component, and adding either edge would create a cycle. Edge ef is added to the tree giving rise to a set in the system that contains all vertices other than g and b. Edge ac is not added, gh is added, dg is not, and finally bf is added to the tree. At this moment the system consists of a single set that contains all vertices of the graph. As suggested by Figure 67, the evolution of the construction can be interpreted as a hierarchical clustering of the vertices. The specific method that corresponds to the evolution created by Kruskal's algorithm is referred to as single-linkage clustering.

Figure 67: Eight union operations merge the nine singleton sets into one set.
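A Python sketch of Kruskal's algorithm, keeping the set system as a name table with weighted union, anticipating the up-tree implementations discussed in the next lecture. The edge-list representation and the names are assumptions of the sketch.

def kruskal(vertices, edges):
    # edges: list of (weight, u, v) triples
    parent = {u: u for u in vertices}
    size = {u: 1 for u in vertices}
    def find(u):                        # climb to the root of the up-tree
        while parent[u] != u:
            u = parent[u]
        return u
    A = []
    for w, u, v in sorted(edges):       # non-decreasing weight
        p, q = find(u), find(v)
        if p != q:                      # uv connects two components: safe
            A.append((u, v))
            if size[p] < size[q]:
                p, q = q, p
            parent[q] = p               # merge smaller into larger
            size[p] += size[q]
    return A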


16 Union-Find
In this lecture, we present two data structures for the disjoint set system problem we encountered in the implementation of Kruskal's algorithm for minimum spanning trees. An interesting feature of the problem is that m operations can be executed in a time that is only ever so slightly more than linear in m.

Abstract data type. A disjoint set system is an abstract data type that represents a partition C of a set [n] = {1, 2, ..., n}. In other words, C is a set of pairwise disjoint subsets of [n] such that the union of all sets in C is [n]. The data type supports

set FIND(i): return P ∈ C with i ∈ P;
void UNION(P, Q): C = C − {P, Q} ∪ {P ∪ Q}.

In most applications, the sets themselves are irrelevant, and it is only important to know when two elements belong to the same set and when they belong to different sets in the system. For example, Kruskal's algorithm executes the operations only in the following sequence:

P = FIND(i); Q = FIND(j);
if P ≠ Q then UNION(P, Q) endif.

This is similar to many everyday situations where it is usually not important to know what it is as long as we recognize when two are the same and when they are different.

Linked lists. We construct a fairly simple and reasonably efficient first solution using linked lists for the sets. We use a table of length n, and for each i ∈ [n], we store the name of the set that contains i. Furthermore, we link the elements of the same set and use the name of the first element as the name of the set. Figure 68 shows a sample set system and its representation. It is convenient to also store the size of the set with the first element. To perform a UNION operation, we need to change the name for all elements in one of the two sets. To save time, we do this only for the smaller set. To merge the two lists without traversing the longer one, we insert the shorter list between the first two elements of the longer list.

Figure 68: The system consists of three sets, each named by the bold element. Each element stores the name of its set, possibly the size of its set, and possibly a pointer to the next element in the same set.

void UNION(int P, Q)
  if C[P].size < C[Q].size then P ↔ Q endif;
  C[P].size = C[P].size + C[Q].size;
  second = C[P].next; C[P].next = Q; t = Q;
  while t ≠ 0 do
    C[t].set = P; u = t; t = C[t].next
  endwhile;
  C[u].next = second.

In the worst case, a single UNION operation takes time Θ(n). The amortized performance is much better because we spend time only on the elements of the smaller set.

WEIGHTED UNION LEMMA. n − 1 UNION operations applied to a system of n singleton sets take time O(n log n).

PROOF. For an element, i, we consider the cardinality of the set that contains it, σ(i) = C[FIND(i)].size. Each time the name of the set that contains i changes, σ(i) at least doubles. After changing the name k times, we have σ(i) ≥ 2^k and therefore k ≤ log2 n. In other words, i can be in the smaller set of a UNION operation at most log2 n times. The claim follows because a UNION operation takes time proportional to the cardinality of the smaller set.
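The following Python sketch captures the same idea; storing the members of each set in explicit lists instead of threading next pointers through a table is an implementation liberty of the sketch. As in the proof above, only the smaller set is relabeled.

class ListSets:
    def __init__(self, n):
        self.name = list(range(n))            # name of the set containing i
        self.members = [[i] for i in range(n)]
    def find(self, i):
        return self.name[i]
    def union(self, P, Q):
        if len(self.members[P]) < len(self.members[Q]):
            P, Q = Q, P                       # relabel the smaller set
        for i in self.members[Q]:
            self.name[i] = P
        self.members[P] += self.members[Q]
        self.members[Q] = []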

Up-trees. Thinking of names as pointers, the above data structure stores each set in a tree of height one. We can use more general trees and get more efficient UNION operations at the expense of slower FIND operations. We consider a class of algorithms with the following commonalities:

• each set is a tree and the name of the set is the index of the root;
• FIND traverses a path from a node to the root;
• UNION links two trees.

It suffices to store only one pointer per node, namely the pointer to the parent. This is why these trees are called up-trees. It is convenient to let the root point to itself. Figure 69 shows the up-tree generated by executing the following eleven UNION operations on a system of twelve singleton sets: 2 ∪ 3, 4 ∪ 7, 2 ∪ 4, 1 ∪ 2, 4 ∪ 10, 9 ∪ 12, 12 ∪ 2, 8 ∪ 11, 8 ∪ 2, 5 ∪ 6, 6 ∪ 1. Figure 70 shows the embedding of the tree in a table. UNION takes constant time and FIND takes time proportional to the length of the path, which can be as large as n − 1.

Figure 69: The UNION operations create a tree by linking the root of the first set to the root of the second set.

Figure 70: The table stores indices which function as pointers as well as names of elements and of sets. The white dot represents a pointer to itself.

Weighted union. The running time of FIND can be improved by linking smaller to larger trees. This is the idea of weighted union again. Assume a field C[i].p for the index of the parent (C[i].p = i if i is a root), and a field C[i].size for the number of elements in the tree rooted at i. We need the size field only for the roots and we need the index to the parent field everywhere except for the roots. The FIND and UNION operations can now be implemented as follows:

int FIND(int i)
  if C[i].p ≠ i then return FIND(C[i].p) endif;
  return i.

void UNION(int i, j)
  if C[i].size < C[j].size then i ↔ j endif;
  C[i].size = C[i].size + C[j].size;
  C[j].p = i.

The size of a subtree increases by at least a factor of 2 from a node to its parent. The depth of a node can therefore not exceed log2 n. It follows that FIND takes at most time O(log n). We formulate the result on the height for later reference.

HEIGHT LEMMA. An up-tree created from n singleton nodes by n − 1 weighted union operations has height at most log2 n.

Path compression. We can further improve the time for FIND operations by linking traversed nodes directly to the root. This is the idea of path compression. The UNION operation is implemented as before and there is only one modification in the implementation of the FIND operation:

int FIND(int i)
  if C[i].p ≠ i then C[i].p = FIND(C[i].p) endif;
  return C[i].p.

If i is not root then the recursion makes it the child of a root, which is then returned. If i is a root, it returns itself because in this case C[i].p = i, by convention. Figure 71 illustrates the algorithm by executing a sequence of eight operations i ∪ j, which is short for finding the sets that contain i and j, and performing a UNION operation if the sets are different. At the beginning, every element forms its own one-node tree. With path compression, it is difficult to imagine that long paths can develop at all.

Figure 71: The operations and up-trees develop from top to bottom and within each row from left to right.

Iterated logarithm. We will prove shortly that the iterated logarithm is an upper bound on the amortized time for a FIND operation. We begin by defining the function from its inverse. Let F(0) = 1 and F(i + 1) = 2^F(i). We have F(1) = 2, F(2) = 2², and F(3) = 2^(2²). In general, F(i) is the tower of i 2s. Table 5 shows the values of F for the first six arguments.

i   0   1   2   3    4        5
F   1   2   4   16   65,536   2^65,536

Table 5: Values of F.

For i ≤ 3, F is very small, but

for i = 5 it already exceeds the number of atoms in our universe. Note that the binary logarithm of a tower of i 2s is a tower of i − 1 2s. The iterated logarithm is the number of times we can take the binary logarithm before we drop down to one or less. In other words, the iterated logarithm is the inverse of F,

log∗ n = min{i | F(i) ≥ n}
       = min{i | log2 log2 ... log2 n ≤ 1},

where the binary logarithm is taken i times. As n goes to infinity, log∗ n goes to infinity, but very slowly.

Levels and groups. The analysis of the path compression algorithm uses two Census Lemmas discussed shortly. Let A1, A2, ..., Am be a sequence of UNION and FIND operations, and let T be the collection of up-trees we get by executing the sequence, but without path compression. In other words, the FIND operations have no influence on the trees. The level λ(µ) of a node µ is the height of its subtree in T plus one. Note that if µ is a proper descendent of another node ν at some moment during the execution of the operation sequence then µ is a proper descendent of ν in T. In this case λ(µ) < λ(ν).

LEVEL CENSUS LEMMA. There are at most n/2^(ℓ−1) nodes at level ℓ.

PROOF. We use induction to show that a node at level ℓ has a subtree of at least 2^(ℓ−1) nodes. The claim follows because subtrees of nodes on the same level are disjoint.

Figure 72: A schematic drawing of the tree T between the column of level numbers on the left and the column of group numbers on the right. The tree is decomposed into five groups, each a sequence of contiguous levels.

Define the group number of a node µ as the iterated logarithm of the level, g(µ) = log∗ λ(µ). Because the level does not exceed n, we have g(µ) ≤ log∗ n, for every node µ in T. The definition of g decomposes an up-tree into at most 1 + log∗ n groups, as illustrated in Figure 72. The number of levels in group g is F(g) − F(g − 1), which gets large very fast. On the other hand, because levels get smaller at an exponential rate, the number of nodes in a group is not much larger than the number of nodes in the lowest level of that group.

GROUP CENSUS LEMMA. There are at most 2n/F(g) nodes with group number g.

PROOF. Each node with group number g has level between F(g − 1) + 1 and F(g). We use the Level Census Lemma to bound their number:

Σ_{ℓ=F(g−1)+1}^{F(g)} n/2^(ℓ−1) ≤ (n/2^(F(g−1))) · (1 + 1/2 + 1/4 + ...)
                               = 2n/2^(F(g−1)) = 2n/F(g),

where the last step uses F(g) = 2^(F(g−1)), as claimed.

Analysis. The analysis is based on the interplay between the up-trees obtained with and without path compression.


The latter are constructed by the weighted union operations and eventually form a single tree, which we denote as T. The former can be obtained from the latter by the application of path compression. Note that in T, the level strictly increases from a node to its parent. Path compression preserves this property, so levels also increase when we climb a path in the actual up-trees. We now show that any sequence of m ≥ n UNION and FIND operations on a ground set [n] takes time at most O(m log∗ n) if weighted union and path compression is used. We can focus on FIND because each UNION operation takes only constant time. For a FIND operation Ai, let Xi be the set of nodes along the traversed path. The total time for executing all FIND operations is proportional to

x = Σ_i |Xi|.

For µ ∈ Xi, let pi(µ) be the parent during the execution of Ai. We partition Xi into the topmost two nodes, the nodes just below boundaries between groups, and the rest:

Yi = {µ ∈ Xi | µ is root or child of root},
Zi = {µ ∈ Xi − Yi | g(µ) < g(pi(µ))},
Wi = {µ ∈ Xi − Yi | g(µ) = g(pi(µ))}.

Clearly, |Yi| ≤ 2 and |Zi| ≤ log∗ n. It remains to bound the total size of the Wi, w = Σ_i |Wi|. Instead of counting, for each Ai, the nodes in Wi, we count, for each node µ, the FIND operations Aj for which µ ∈ Wj. In other words, we count how often µ can change parent until its parent has a higher group number than µ. Each time µ changes parent, the new parent has higher level than the old parent. It follows that the number of changes is at most F(g(µ)) − F(g(µ) − 1). The number of nodes with group number g is at most 2n/F(g) by the Group Census Lemma. Hence

w ≤ Σ_{g=0}^{log∗ n} (2n/F(g)) · (F(g) − F(g − 1))
  ≤ 2n · (1 + log∗ n).

This implies that

x ≤ 2m + m log∗ n + 2n(1 + log∗ n) = O(m log∗ n),

assuming m ≥ n. This is an upper bound on the total time it takes to execute m FIND operations. The amortized cost per FIND operation is therefore at most O(log∗ n), which for all practical purposes is a constant.

Summary. We proved an upper bound on the time needed for m ≥ n UNION and FIND operations. The bound is more than constant per operation, although for all practical purposes it is constant. The log∗ n bound can be improved to an even smaller function, usually referred to as α(n) or the inverse of the Ackermann function, that goes to infinity even slower than the iterated logarithm. It can also be proved that (under some mild assumptions) there is no algorithm that can execute general sequences of UNION and FIND operations in amortized time that is asymptotically less than α(n).
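Summarizing the data structure in Python: weighted union plus path compression, following the pseudocode of this section. For all practical input sizes the two operations behave like constant-time operations.

class UnionFind:
    def __init__(self, n):
        self.p = list(range(n))         # parent; a root points to itself
        self.size = [1] * n
    def find(self, i):
        if self.p[i] != i:
            self.p[i] = self.find(self.p[i])    # path compression
        return self.p[i]
    def union(self, i, j):
        i, j = self.find(i), self.find(j)
        if i == j:
            return
        if self.size[i] < self.size[j]:
            i, j = j, i                 # weighted union
        self.size[i] += self.size[j]
        self.p[j] = i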


Fourth Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is October 30.

Problem 1. (20 = 10 + 10 points). Consider a free tree and let d(u, v) be the number of edges in the path connecting u to v. The diameter of the tree is the maximum d(u, v) over all pairs of vertices in the tree.

(a) Give an efficient algorithm to compute the diameter of a tree.
(b) Analyze the running time of your algorithm.

Problem 2. (20 points). Design an efficient algorithm to find a spanning tree for a connected, weighted, undirected graph such that the weight of the maximum weight edge in the spanning tree is minimized. Prove the correctness of your algorithm.

Problem 3. (7 + 6 + 7 points). A weighted graph G = (V, E) is a near-tree if it is connected and has at most n + 8 edges, where n is the number of vertices. Give an O(n)-time algorithm to find a minimum weight spanning tree for G.

Problem 4. (10 + 10 points). Given an undirected weighted graph and vertices s, t, design an algorithm that computes the number of shortest paths from s to t in the case:

(a) All weights are positive numbers.
(b) All weights are real numbers.

Analyze your algorithm for both (a) and (b).

Problem 5. (20 = 10 + 10 points). The off-line minimum problem is about maintaining a subset of [n] = {1, 2, ..., n} under the operations INSERT and EXTRACTMIN. Given an interleaved sequence of n insertions and m min-extractions, the goal is to determine which key is returned by which min-extraction. We assume that each element i ∈ [n] is inserted exactly once. Specifically, we wish to fill in an array E[1..m] such that E[i] is the key returned by the i-th min-extraction. Note that the problem is off-line, in the sense that we are allowed to process the entire sequence of operations before determining any of the returned keys.

(a) Describe how to use a union-find data structure to solve the problem efficiently.

(b) Give a tight bound on the worst-case running time of your algorithm.


V

TOPOLOGICAL ALGORITHMS

17 Geometric Graphs
18 Surfaces
19 Homology
   Fifth Homework Assignment


17 Geometric Graphs
In the abstract notion of a graph, an edge is merely a pair of vertices. The geometric (or topological) notion of a graph is closer to our intuition in which we think of an edge as a curve that connects two vertices.

Embeddings. Let G = (V, E) be a simple, undirected graph and write R² for the two-dimensional real plane. A drawing maps every vertex v ∈ V to a point ε(v) in R², and it maps every edge {u, v} ∈ E to a curve with endpoints ε(u) and ε(v). The drawing is an embedding if

1. different vertices map to different points;
2. the curves have no self-intersections;
3. the only points of a curve that are images of vertices are its endpoints;
4. two curves intersect at most in their endpoints.

We can always map the vertices to points and the edges to curves in R³ so they form an embedding. On the other hand, not every graph has an embedding in R². The graph G is planar if it has an embedding in R². As illustrated in Figure 73, a planar graph has many drawings, not all of which are embeddings. A straight-line drawing or embedding is one in which each edge is mapped to a straight line segment. It is uniquely determined by the mapping of the vertices, ε : V → R². We will see later that every planar graph has a straight-line embedding.

Figure 73: Three drawings of K4, the complete graph with four vertices. From left to right: a drawing that is not an embedding, an embedding with one curved edge, a straight-line embedding.

Euler's formula. A face of an embedding ε of G is a component of the thus defined decomposition of R². We write n = |V|, m = |E|, and ℓ for the number of faces. Euler's formula says these numbers satisfy a linear relation.

EULER'S FORMULA. If G is connected and ε is an embedding of G in R² then n − m + ℓ = 2.

PROOF. Choose a spanning tree (V, T) of G = (V, E). It has n vertices, |T| = n − 1 edges, and one (unbounded) face. We have n − (n − 1) + 1 = 2, which proves the formula if G is a tree. Otherwise, draw the remaining edges, one at a time. Each edge decomposes one face into two. The number of vertices does not change, m increases by one, and ℓ increases by one. Since the graph satisfies the linear relation before drawing the edge, it satisfies the relation also after drawing the edge.

A planar graph is maximally connected if adding any one new edge violates planarity. Not surprisingly, a planar graph of three or more vertices is maximally connected iff every face in an embedding is bounded by three edges. Indeed, suppose there is a face bounded by four or more edges. Then we can find two vertices in its boundary that are not yet connected and we can connect them by drawing a curve that passes through the face; see Figure 74. For obvious reasons, we call an embedding of a maximally connected planar graph with n ≥ 3 vertices a triangulation. For such graphs, we have an additional linear relation, namely 3ℓ = 2m. We can thus rewrite Euler's formula and get n − m + 2m/3 = 2 as well as n − 3ℓ/2 + ℓ = 2 and therefore

m = 3n − 6;
ℓ = 2n − 4.

Figure 74: Drawing the edge from a to c decomposes the quadrangle into two triangles. Note that we cannot draw the edge from b to d since it already exists outside the quadrangle.

Every planar graph can be completed to a maximally connected planar graph. For n ≥ 3 this implies that the planar graph has at most 3n − 6 edges and at most 2n − 4 faces.

Forbidden subgraphs. We can use Euler's relation to prove that the complete graph of five vertices is not planar. It has n = 5 vertices and m = 10 edges, contradicting the upper bound of at most 3n − 6 = 9 edges. Indeed, every drawing of K5 has at least two edges crossing; see Figure 75. Similarly, we can prove that the complete bipartite graph with three plus three vertices is not planar. It has n = 6 vertices and m = 9 edges. Every cycle in a bipartite graph has an even number of edges. Hence, 4ℓ ≤ 2m. Plugging this into Euler's formula, we get n − m + m/2 ≥ 2 and therefore m ≤ 2n − 4 = 8, again a contradiction.

Figure 75: A drawing of K5 on the left and of K3,3 on the right.

In a sense, K5 and K3,3 are the quintessential non-planar graphs. To make this concrete, we still need an operation that creates or removes degree-2 vertices. Two graphs are homeomorphic if one can be obtained from the other by a sequence of operations, each deleting a degree-2 vertex and replacing its two edges by the one that connects its two neighbors, or the other way round.

KURATOWSKI'S THEOREM. A graph G is planar iff no subgraph of G is homeomorphic to K5 or to K3,3.

The proof of this result is a bit lengthy and omitted.

Euler's formula can also be used to show that every planar graph has a straight-line embedding. Note that the sum of vertex degrees counts each edge twice, that is, Σ_{v∈V} deg(v) = 2m. For planar graphs, twice the number of edges is less than 6n which implies that the average degree is less than six. It follows that every planar graph has at least one vertex of degree 5 or less. This can be strengthened by saying that every planar graph with n ≥ 4 vertices has at least four vertices of degree at most 5 each. To see this, assume the planar graph is maximally connected and note that every vertex has degree at least 3. The deficiency from degree 6 is thus at most 3. The total deficiency is 6n − Σ_{v∈V} deg(v) = 12 which implies that we have at least four vertices with positive deficiency.

We need a little bit of geometry to prepare the construction of a straight-line embedding. A region R ⊆ R² is convex if x, y ∈ R implies that the entire line segment connecting x and y is contained in R. Figure 76 shows regions of either kind. We call R star-convex if there is a point z ∈ R such that for every point x ∈ R the line segment connecting x with z is contained in R. The set of such points z is the kernel of R. Clearly, every convex region is star-convex but not every star-convex region is convex.

Figure 76: A convex region on the left and a non-convex star-convex region on the right.

Similarly, there are regions that are not star-convex, even rather simple ones such as the hexagon in Figure 77. However, every pentagon is star-convex. Indeed, the pentagon can be decomposed into three triangles by drawing two diagonals that share an endpoint. Extending the incident sides into the pentagon gives locally the boundary of the kernel. It follows that the kernel is non-empty and has interior points.

Figure 77: A non-star-convex hexagon on the left and a star-convex pentagon on the right. The dark region inside the pentagon is its kernel.

Fáry's construction. We construct a straight-line embedding of a planar graph G = (V, E) assuming G is maximally connected. Choose three vertices, a, b, c, connected by three edges to form the outer triangle. If G has only n = 3 vertices we are done. Else it has at least one vertex u ∈ V − {a, b, c} with deg(u) ≤ 5.

Step 1. Remove u together with the k = deg(u) edges incident to u. Add k − 3 edges to make the graph maximally connected again.

Step 2. Recursively construct a straight-line embedding of the smaller graph.

Step 3. Remove the added k − 3 edges and map u to a point ε(u) in the interior of the kernel of the resulting k-gon. Connect ε(u) with line segments to the vertices of the k-gon.

Figure 78 illustrates the recursive construction. It is straightforward to implement but there are numerical issues in the choice of ε(u) that limit the usefulness of this construction.

Figure 78: We fix the outer triangle, remove the degree-5 vertex, recursively construct a straight-line embedding of the rest, and finally add the vertex back.

Tutte's construction. A more useful construction of a straight-line embedding goes back to the work of Tutte. We begin with a definition. Given a finite set of points, x1, x2, ..., xj, the average is

x̄ = (1/j) Σ_{i=1}^{j} xi.

For j = 2, it is the midpoint of the edge and for j = 3, it is the centroid of the triangle. In general, the average is a point somewhere between the xi. Let G = (V, E) be a maximally connected planar graph and a, b, c three vertices connected by three edges. We now follow Tutte's construction to get a mapping ε : V → R² so that the straight-line drawing of G is a straight-line embedding.

Step 1. Map a, b, c to points ε(a), ε(b), ε(c) spanning a triangle in R².

Step 2. For each vertex u ∈ V − {a, b, c}, let Nu be the set of neighbors of u. Map u to the average of the images of its neighbors, that is, ε(u) = (1/|Nu|) Σ_{v∈Nu} ε(v).

The fact that the resulting mapping ε : V → R² gives a straight-line embedding of G is known as Tutte's Theorem. It holds even if G is not quite maximally connected and if the points are not quite the averages of their neighbors. The proof is a bit involved and omitted. The points ε(u) can be computed by solving a system of linear equations. We illustrate this for the graph in Figure 78. We set ε(a) = (−1, −1), ε(b) = (1, −1), ε(c) = (0, 1). The other five points are computed by solving the system of linear equations Av = 0, where

      ( 1  1  1  1  0  0  1 −5 )
      ( 0  0  1  1 −3  1  0  0 )
A =   ( 1  1  1  1  1 −6  1  0 )
      ( 0  1  1  1  0  1 −5  1 )
      ( 0  0  1  1  0  0  1 −3 )

and v is the column vector of points ε(a) to ε(y). There are really two linear systems, one for the horizontal and the other for the vertical coordinates. In each system, we have n − 3 equations and a total of n − 3 unknowns. This gives a unique solution provided the equations are linearly independent. Proving that they are is part of the proof of Tutte's Theorem. Solving the linear equations is a numerical problem that is studied in detail in courses on numerical analysis.
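As a sketch of how the computation might look in practice, the following Python function sets up and solves the two linear systems with numpy. The graph representation, the function name, and the choice of coordinates for the outer triangle are assumptions of this sketch, not prescribed by Tutte's construction.

import numpy as np

def tutte_embedding(graph, outer):
    # graph: dict vertex -> list of neighbors; outer: the three fixed vertices
    fixed = {outer[0]: (-1.0, -1.0), outer[1]: (1.0, -1.0),
             outer[2]: (0.0, 1.0)}
    inner = [u for u in graph if u not in fixed]
    index = {u: k for k, u in enumerate(inner)}
    A = np.zeros((len(inner), len(inner)))
    b = np.zeros((len(inner), 2))
    for u in inner:
        k = index[u]
        A[k, k] = len(graph[u])        # |Nu| eps(u) - sum of neighbors = 0
        for v in graph[u]:
            if v in fixed:
                b[k] += fixed[v]       # known coordinates move to the right
            else:
                A[k, index[v]] -= 1.0
    coords = np.linalg.solve(A, b)     # both coordinate systems at once
    eps = dict(fixed)
    for u in inner:
        eps[u] = tuple(coords[index[u]])
    return eps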

18 Surfaces
Graphs may be drawn in two, three, or higher dimensions, but they are still intrinsically only 1-dimensional. One step up in dimensions, we find surfaces, which are 2-dimensional.

Topological 2-manifolds. The simplest kind of surfaces are the ones that on a small scale look like the real plane. A space M is a 2-manifold if every point x ∈ M is locally homeomorphic to R². Specifically, there is an open neighborhood N of x and a continuous bijection h : N → R² whose inverse is also continuous. Such a bicontinuous map is called a homeomorphism. Examples of 2-manifolds are the open disk and the sphere. The former is not compact because it has covers that do not have finite subcovers. Figure 79 shows examples of compact 2-manifolds. If we add the boundary circle to the open disk we get a closed disk which is compact but not every point is locally homeomorphic to R². Specifically, a point on the circle has an open neighborhood homeomorphic to the closed half-plane, H² = {(x1, x2) ∈ R² | x1 ≥ 0}. A space whose points have open neighborhoods homeomorphic to R² or H² is called a 2-manifold with boundary; see Figure 80 for examples. The boundary is the subset of points with neighborhoods homeomorphic to H². It is a 1-manifold (without boundary), that is, every point is locally homeomorphic to R. There is only one type of compact, connected 1-manifold, namely the closed curve. In topology, we do not distinguish spaces that are homeomorphic to each other. Hence, every closed curve is like every other one and they are all homeomorphic to the unit circle, S¹ = {x ∈ R² | ||x|| = 1}.

Figure 79: Three compact 2-manifolds, the sphere, the torus, and the double torus.

Figure 80: Three 2-manifolds with boundary, the closed disk, the cylinder, and the Möbius strip.

Triangulations. A standard representation of a compact 2-manifold uses triangles that are glued to each other along shared edges and vertices. A collection K of triangles, edges, and vertices is a triangulation of a compact 2-manifold if

I.  for every triangle in K, its three edges belong to K, and for every edge in K, its two endpoints are vertices in K;
II. every edge belongs to exactly two triangles and every vertex belongs to a single ring of triangles.

An example is shown in Figure 81. To simplify language, we call each element of K a simplex. If we need to be specific, we add the dimension, calling a vertex a 0-simplex, an edge a 1-simplex, and a triangle a 2-simplex. A face of a simplex τ is a simplex σ ⊆ τ. For example, a triangle has seven faces, its three vertices, its three edges, and itself. We can now state Condition I more succinctly: if σ is a face of τ and τ ∈ K then σ ∈ K. To talk about the inverse of the face relation, we define the star of a simplex σ as the set of simplices that contain σ as a face, St σ = {τ ∈ K | σ ⊆ τ}. Sometimes we think of the star as a set of simplices and sometimes as a set of points, namely the union of interiors of the simplices in the star. With the latter interpretation, we can now express Condition II more succinctly: the star of every simplex in K is homeomorphic to R².

Figure 81: A triangulation of the sphere. The eight triangles are glued to form the boundary of an octahedron which is homeomorphic to the sphere.

Data structure. When we store a 2-manifold, it is useful to keep track of which side we are facing and where we are going so that we can move around efficiently. The core piece of our data structure is a representation of the symmetry group of a triangle. This group is isomorphic to the group of permutations of three elements, the vertices of the triangle. We call each permutation an ordered triangle and use cyclic shifts and transpositions to move between them; see Figure 82.

Figure 82: The symmetry group of the triangle consists of six ordered versions. Each ordered triangle has a lead vertex and a lead directed edge.

We store the entire symmetry group in a single node of an abstract graph, with arcs between neighboring triangles. Furthermore, we store the vertices in a linear array, V[1..n]. For each ordered triangle, we store the index of the lead vertex and a pointer to the neighboring triangle that shares the same directed lead edge. A pointer in this context is the address of a node together with a three-bit integer, ι, that identifies the ordered version of the triangle we refer to. Suppose for example that we identify the ordered versions abc, bca, cab, bac, cba, acb of a triangle with ι = 0, 1, 2, 4, 5, 6, in this sequence. Then we can move between different ordered versions of the same triangle using the following functions.

ordTri ENEXT(µ, ι)
  if ι ≤ 2 then return (µ, (ι + 1) mod 3)
           else return (µ, (ι + 1) mod 3 + 4)
  endif.

ordTri SYM(µ, ι)
  return (µ, (ι + 4) mod 8).

To get the index of the lead vertex, we use the integer function ORG(µ, ι) and to get the pointer to the neighboring triangle, we use FNEXT(µ, ι).

Orientability. A 2-manifold is orientable if it has two distinct sides, that is, if we move around on one we stay there and never cross over to the other side. The one example of a non-orientable manifold we have seen so far is the Möbius strip in Figure 80. There are also non-orientable, compact 2-manifolds (without boundary), as we can see in Figure 83. We use the data structure to decide whether or not a 2-manifold is orientable. Note that the cyclic shift partitions the set of six ordered triangles into two orientations, each consisting of three triangles. We say two neighboring triangles are consistently oriented if they disagree on the direction of the shared edge, as in Figure 81. Using depth-first search, we visit all triangles and orient them consistently, if possible. At the first visit, we orient the triangle consistent with the preceding, neighboring triangle. At subsequent visits, we check for consistent orientation.

Figure 83: Two non-orientable, compact 2-manifolds, the projective plane on the left and the Klein bottle on the right.

boolean ISORNTBL(µ, ι)
  if µ is unmarked then
    mark µ; choose the orientation that contains ι;
    bx = ISORNTBL(FNEXT(SYM(µ, ι)));
    by = ISORNTBL(FNEXT(ENEXT(SYM(µ, ι))));
    bz = ISORNTBL(FNEXT(ENEXT²(SYM(µ, ι))));
    return bx and by and bz
  else
    return [orientation of µ contains ι]
  endif.

There are two places where we return a boolean value. At the second place, it indicates whether or not we have consistent orientation in spite of the visited triangle being oriented prior to the visit. At the first place, the boolean value indicates whether or not we have found a contradiction to orientability so far. A value of FALSE anywhere during the computation is propagated to the root of the search tree telling us that the 2-manifold is non-orientable. The running time is proportional to the number of triangles in the triangulation of the 2-manifold.
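The same test can be phrased without the ordered-triangle machinery: store each triangle as a vertex triple and propagate orientations by search, flagging a contradiction when a triangle would need both orientations. This Python sketch assumes a connected triangulation in which every edge is shared by exactly two triangles; it is an illustrative variant, not the notes' data structure.

def is_orientable(triangles):
    # triangles: list of (a, b, c) vertex triples of a connected 2-manifold
    from collections import defaultdict
    edge2tri = defaultdict(list)
    for t, (a, b, c) in enumerate(triangles):
        for u, v in ((a, b), (b, c), (c, a)):
            edge2tri[frozenset((u, v))].append(t)
    def directed_edges(t, s):          # edges of triangle t in orientation s
        a, b, c = triangles[t]
        x, y, z = (a, b, c) if s == 1 else (c, b, a)
        return [(x, y), (y, z), (z, x)]
    orient = {}
    stack = [(0, 1)]                   # choose an orientation for triangle 0
    while stack:
        t, s = stack.pop()
        if t in orient:
            if orient[t] != s:
                return False           # contradiction to orientability
            continue
        orient[t] = s
        for u, v in directed_edges(t, s):
            for t2 in edge2tri[frozenset((u, v))]:
                if t2 != t:            # consistent: t2 traverses edge as (v, u)
                    s2 = 1 if (v, u) in directed_edges(t2, 1) else -1
                    stack.append((t2, s2))
    return True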

Classification. For the sphere and the torus, it is easy to see how to make them out of a sheet of paper. Twisting the paper gives a non-orientable 2-manifold. Perhaps
most difficult to understand is the projective plane. It is obtained by gluing each point of the sphere to its antipodal point. This way, the entire northern hemisphere is glued to the southern hemisphere. This gives the disk except that we still need to glue points of the bounding circle (the equator) in pairs, as shown in the third paper construction in Figure 84. The Klein bottle is easier to imagine as it is obtained by twisting the paper just once, same as in the construction of the Möbius strip.

Figure 84: From left to right: the sphere, the torus, the projective plane, and the Klein bottle.

There is a general method here that can be used to classify the compact 2-manifolds. Given two of them, we construct a new one by removing an open disk from each and gluing the 2-manifolds along the two circles. The result is called the connected sum of the two 2-manifolds, denoted as M#N. For example, the double torus is the connected sum of two tori, T2#T2. We can cut up the g-fold torus into a flat sheet of paper, and the canonical way of doing this gives a 4g-gon with edges identified in pairs as shown in Figure 85 on the left. The number g is called the genus of the manifold. Similarly, we can get new non-orientable manifolds from the projective plane, P2, by forming connected sums. Cutting up the g-fold projective plane gives a 2g-gon with edges identified in pairs as shown in Figure 85 on the right. We note that the constructions of the projective plane and the Klein bottle in Figure 84 are both not in standard form. A remarkable result, which is now more than a century old, is that every compact 2-manifold can be cut up to give a standard polygonal schema. This implies a classification of the possibilities.

Figure 85: The polygonal schema in standard form for the double torus and the double Klein bottle.

CLASSIFICATION THEOREM. The members of the families S2, T2, T2#T2, ... and P2, P2#P2, ... are non-homeomorphic and they exhaust the family of compact 2-manifolds.

Euler characteristic. Suppose we are given a triangulation, K, of a compact 2-manifold, M. We already know how to decide whether or not M is orientable. To determine its type, we just need to find its genus, which we do by counting simplices. The Euler characteristic is

    χ = #vertices − #edges + #triangles.

Let us look at the orientable case first. We have a 4g-gon which we triangulate. This is a planar graph with n − m + ℓ = 2. However, 2g edges are counted double, the 4g vertices of the 4g-gon are all the same, and the outer face is not a triangle in K. Hence,

    χ = (n − 4g + 1) − (m − 2g) + (ℓ − 1)
      = (n − m + ℓ) − 2g,

which is equal to 2 − 2g. The same analysis can be used in the non-orientable case, in which we get

    χ = (n − 2g + 1) − (m − g) + (ℓ − 1)
      = 2 − g.

To decide whether two compact 2-manifolds are homeomorphic, it suffices to determine whether they are both orientable or both non-orientable and, if they are, whether they have the same Euler characteristic. This can be done in time linear in the number of simplices in their triangulations. This result is in sharp contrast to the higher-dimensional case. The classification of compact 3-manifolds has been a long-standing open problem in Mathematics. Perhaps the recent proof of the Poincaré conjecture by Perelman brings us close to a resolution. Beyond three dimensions, the situation is hopeless, that is, deciding whether or not two triangulated compact manifolds of dimension four or higher are homeomorphic is undecidable.
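The type of a compact 2-manifold is thus determined by a short computation. A Python sketch, with function name and output of our choosing:

  # classify a triangulated compact 2-manifold from its simplex counts
  def classify(n_vertices, n_edges, n_triangles, orientable):
      chi = n_vertices - n_edges + n_triangles     # Euler characteristic
      if orientable:
          genus = (2 - chi) // 2                   # chi = 2 - 2g
      else:
          genus = 2 - chi                          # chi = 2 - g
      return chi, genus

  # the boundary of the tetrahedron triangulates the sphere: chi = 2, g = 0
  print(classify(4, 6, 4, orientable=True))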


19 Homology
In topology, the main focus is not on geometric size but rather on how a space is connected. The most elementary notion distinguishes whether we can go from one place to another. If not then there is a gap we cannot bridge. Next we would ask whether there is a loop going around an obstacle, or whether there is a void missing in the space. Homology is a formalization of these ideas. It gives a way to define and count holes using algebra. The cyclomatic number of a graph. To motivate the more general concepts, consider a connected graph, G, with n vertices and m edges. A spanning tree has n − 1 edges and every additional edge forms a unique cycle together with edges in this tree; see Figure 86. Every other

cycle in G can be written as a sum of these m − (n − 1) cycles. To make this concrete, we define a cycle as a subset of the edges such that every vertex belongs to an even number of these edges. A cycle does not need to be connected. The sum of two cycles is the symmetric difference of the two sets such that multiple edges erase each other in pairs. Clearly, the sum of two cycles is again a cycle. Every cycle, γ, in G contains some positive number of edges that do not belong to the spanning tree. Calling these edges e1, e2, ..., ek and the cycles they define γ1, γ2, ..., γk, we claim that γ = γ1 + γ2 + ... + γk. To see this, assume that δ = γ1 + γ2 + ... + γk is different from γ. Then γ + δ is again a cycle but it contains no edges that do not belong to the spanning tree. Hence γ + δ = ∅ and therefore γ = δ, as claimed. This implies that the m − n + 1 cycles form a basis of the group of cycles, which motivates us to call m − n + 1 the cyclomatic number of the graph. Note that the basis depends on the choice of spanning tree while the cyclomatic number is independent of that choice.

Figure 86: A tree with three additional edges defining the same number of cycles.

Simplicial complexes. We begin with a combinatorial representation of a topological space. Using a finite ground set of vertices, V, we call a subset σ ⊆ V an abstract simplex. Its dimension is one less than the cardinality, dim σ = |σ| − 1. A face is a subset τ ⊆ σ.

DEFINITION. An abstract simplicial complex over V is a system K ⊆ 2^V such that σ ∈ K and τ ⊆ σ implies τ ∈ K.

The dimension of K is the largest dimension of any simplex in K. A graph is thus a 1-dimensional abstract simplicial complex. Just like for graphs, we sometimes think of K as an abstract structure and at other times as a geometric object consisting of geometric simplices. In the latter interpretation, we glue the simplices along shared faces to form a geometric realization of K, denoted as |K|. We say K triangulates a space X if there is a homeomorphism h : X → |K|. We have seen 1- and 2-dimensional examples in the preceding sections. The boundary of a simplex σ is the collection of co-dimension one faces,

    ∂σ = {τ ⊆ σ | dim τ = dim σ − 1}.

If dim σ = p then the boundary consists of p + 1 (p − 1)-simplices. Every (p − 1)-simplex has p (p − 2)-simplices in its own boundary. This way we get (p + 1)p (p − 2)-simplices, counting each of the (p+1 choose p−1) = (p+1 choose 2) (p − 2)-dimensional faces of σ twice.

Chain complexes. We now generalize the cycles in graphs to cycles of different dimensions in simplicial complexes. A p-chain is a set of p-simplices in K. The sum of two p-chains is their symmetric difference. We usually write the sets as formal sums,

    c = a1 σ1 + a2 σ2 + ... + an σn;
    d = b1 σ1 + b2 σ2 + ... + bn σn,

where the ai and bi are either 0 or 1. Addition can then be done using modulo 2 arithmetic, c +2 d = (a1 +2 b1) σ1 + ... + (an +2 bn) σn, where ai +2 bi is the exclusive or operation. We simplify notation by dropping the subscript but note that the two plus signs are different, one modulo two and the other a formal notation separating elements in a set.
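In code, mod-2 chain arithmetic is just symmetric difference of sets. Here is a small Python sketch with representation choices of our own: a simplex is a frozenset of vertices and a chain is a set of simplices.

  def add(c, d):
      # sum of two chains: simplices appearing in both cancel in pairs
      return c ^ d

  def boundary(c):
      # add up the codimension-one faces of all simplices in the chain
      b = set()
      for sigma in c:
          for v in sigma:
              b ^= {sigma - {v}}       # drop one vertex to get a face
      return b

  # the boundary of a triangle is the 1-cycle of its three edges,
  # and taking the boundary twice gives the empty chain (zero)
  tri = {frozenset({1, 2, 3})}
  print(boundary(tri))
  print(boundary(boundary(tri)))       # set()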


The p-chains form a group, which we denote as (Cp, +) or simply Cp. Note that the boundary of a p-simplex is a (p − 1)-chain, an element of Cp−1. Extending this concept linearly, we define the boundary of a p-chain as the sum of the boundaries of its simplices, ∂c = a1 ∂σ1 + ... + an ∂σn. The boundary is thus a map between chain groups and we sometimes write the dimension as index for clarity, ∂p : Cp → Cp−1. It is a homomorphism since ∂p(c + d) = ∂p c + ∂p d. The infinite sequence of chain groups connected by boundary homomorphisms is called the chain complex of K. All groups of dimension smaller than 0 and larger than the dimension of K are trivial. It is convenient to keep them around to avoid special cases at the ends. A p-cycle is a p-chain whose boundary is zero. The sum of two p-cycles is again a p-cycle so we get a subgroup, Zp ⊆ Cp. A p-boundary is a p-chain that is the boundary of a (p + 1)-chain. The sum of two p-boundaries is again a p-boundary so we get another subgroup, Bp ⊆ Cp. Taking the boundary twice in a row gives zero for every simplex and thus for every chain, that is, ∂p(∂p+1 d) = 0. It follows that Bp is a subgroup of Zp. We can therefore draw the chain complex as in Figure 87.
Figure 87: The chain complex consisting of a linear sequence of chain, cycle, and boundary groups connected by homomorphisms.

Homology groups. We would like to talk about cycles but ignore the boundaries since they do not go around a hole. At the same time, we would like to consider two cycles the same if they differ by a boundary. See Figure 88 for a few 1-cycles, some of which are 1-boundaries and some of which are not. This is achieved by taking the quotient of the cycle group and the boundary group. The result is the p-th homology group,

    Hp = Zp / Bp.

Figure 88: The 1-cycles γ and δ are not 1-boundaries. Adding the 1-boundary ε to δ gives a 1-cycle homologous to δ.

Its elements are of the form [c] = c + Bp, where c is a p-cycle. [c] is called a homology class, c is a representative of [c], and any two cycles in [c] are homologous, denoted as c ∼ c′. Note that [c] = [c′] whenever c ∼ c′. Also note that [c + d] = [c′ + d′] whenever c ∼ c′ and d ∼ d′. We use this as a definition of addition for homology classes, so we again have a group. For example, the first homology group of the torus consists of four elements, [0] = B1, [γ] = γ + B1, [δ] = δ + B1, and [γ + δ] = γ + δ + B1. We often draw the elements as the corners of a cube of some dimension; see Figure 89. If the dimension is β then it has 2^β corners. The dimension is also the number of classes needed to generate the group, the size of the basis. For the p-th homology group, this number is βp = rank Hp = log2 |Hp|, the p-th Betti number.

Figure 89: The four homology classes of H1 are generated by two classes, [γ] and [δ].

For the torus we have

    β0 = 1;
    β1 = 2;
    β2 = 1,

and βp = 0 for all p ≠ 0, 1, 2. Every 0-chain is a 0-cycle. Two 0-cycles are homologous if they are both the sum of an even number or both of an odd number of vertices. Hence β0 = log2 2 = 1. We have seen the reason for β1 = 2 before. Finally, there are only two 2-cycles, namely 0 and the set of all triangles. The latter is not a boundary, hence β2 = log2 2 = 1.

Boundary matrices. To compute homology groups and Betti numbers, we use a matrix representation of the simplicial complex. Specifically, we store the boundary homomorphism for each dimension, setting ∂p[i, j] = 1 if

the i-th (p − 1)-simplex is in the boundary of the j-th p-simplex, and ∂p[i, j] = 0, otherwise. For example, if the complex consists of all faces of the tetrahedron, with vertices 1, 2, 3, 4, edges in the order 12, 13, 14, 23, 24, 34, and triangles in the order 123, 124, 134, 234, then the boundary matrices are

    ∂0  =  [ 0 0 0 0 ];

           [ 1 1 1 0 0 0 ]
    ∂1  =  [ 1 0 0 1 1 0 ]
           [ 0 1 0 1 0 1 ]
           [ 0 0 1 0 1 1 ];

           [ 1 1 0 0 ]
           [ 1 0 1 0 ]
    ∂2  =  [ 0 1 1 0 ]
           [ 1 0 0 1 ]
           [ 0 1 0 1 ]
           [ 0 0 1 1 ];

           [ 1 ]
    ∂3  =  [ 1 ]
           [ 1 ]
           [ 1 ].

Given a p-chain as a column vector, v, its boundary is computed by matrix multiplication, ∂p v. The result is a combination of columns in the p-th boundary matrix, as specified by v. Thus, v is a p-cycle iff ∂p v = 0 and v is a p-boundary iff there is u such that ∂p+1 u = v.

Matrix reduction. Letting np be the number of p-simplices in K, we note that it is also the rank of the p-th chain group, np = rank Cp. The p-th boundary matrix thus has np−1 rows and np columns. To figure out the sizes of the cycle and boundary groups, and thus of the homology groups, we reduce the matrix to normal form, as shown in Figure 90.

Figure 90: The p-th boundary matrix in normal form. The entries in the shaded portion of the diagonal are 1 and all other entries are 0.

The algorithm of choice uses column and row operations similar to Gaussian elimination for solving a linear system. We write it recursively, calling it with m = 1.

  void REDUCE(m)
    if ∃ k, l ≥ m with ∂p[k, l] = 1 then
      exchange rows m and k and columns m and l;
      for i = m + 1 to np−1 do
        if ∂p[i, m] = 1 then add row m to row i endif
      endfor;
      for j = m + 1 to np do
        if ∂p[m, j] = 1 then add column m to column j endif
      endfor;
      REDUCE(m + 1)
    endif.

For each recursive call, we have at most a linear number of row and column operations. The total running time is therefore at most cubic in the number of simplices. Figure 90 shows how we interpret the result. Specifically, the number of zero columns is the rank of the cycle group, Zp, and the number of 1s in the diagonal is the rank of the boundary group, Bp−1. The Betti number is the difference,

    βp = rank Zp − rank Bp,

taking the rank of the boundary group from the reduced matrix one dimension up. Working on our example, we get the following reduced matrices:

    ∂0  =  [ 0 0 0 0 ];

           [ 1 0 0 0 0 0 ]
    ∂1  =  [ 0 1 0 0 0 0 ]
           [ 0 0 1 0 0 0 ]
           [ 0 0 0 0 0 0 ];

           [ 1 0 0 0 ]
           [ 0 1 0 0 ]
    ∂2  =  [ 0 0 1 0 ]
           [ 0 0 0 0 ]
           [ 0 0 0 0 ]
           [ 0 0 0 0 ];

           [ 1 ]
    ∂3  =  [ 0 ]
           [ 0 ]
           [ 0 ].

Writing zp = rank Zp and bp = rank Bp, we get z0 = 4 from the zeroth and b0 = 3 from the first reduced boundary matrix. Hence β0 = z0 − b0 = 1. Furthermore,

z1 = 3 and b1 = 3 giving β1 = 0, z2 = 1 and b2 = 1 giving β2 = 0, and z3 = 0 giving β3 = 0. These are the Betti numbers of the closed ball.
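For illustration, here is a Python sketch that computes the Betti numbers from the boundary matrices. It uses ordinary Gaussian elimination over Z2 rather than the recursive REDUCE, which produces the same ranks; the matrix encoding, the function names, and the example (which assumes the vertex, edge, and triangle ordering used above) are ours.

  def rank_mod2(matrix):
      m = [row[:] for row in matrix]            # work on a copy
      if not m or not m[0]:
          return 0
      rows, cols, r = len(m), len(m[0]), 0
      for j in range(cols):
          pivot = next((i for i in range(r, rows) if m[i][j] == 1), None)
          if pivot is None:
              continue
          m[r], m[pivot] = m[pivot], m[r]
          for i in range(rows):
              if i != r and m[i][j] == 1:       # add the pivot row mod 2
                  m[i] = [(a + b) % 2 for a, b in zip(m[i], m[r])]
          r += 1
      return r

  def betti(boundaries, simplex_counts):
      # boundaries[p] is the matrix of d_p; simplex_counts[p] is n_p
      result = []
      for p, n_p in enumerate(simplex_counts):
          z_p = n_p - rank_mod2(boundaries[p])              # rank Z_p
          b_p = rank_mod2(boundaries[p + 1]) if p + 1 < len(boundaries) else 0
          result.append(z_p - b_p)
      return result

  d0 = [[0, 0, 0, 0]]
  d1 = [[1,1,1,0,0,0], [1,0,0,1,1,0], [0,1,0,1,0,1], [0,0,1,0,1,1]]
  d2 = [[1,1,0,0], [1,0,1,0], [0,1,1,0], [1,0,0,1], [0,1,0,1], [0,0,1,1]]
  d3 = [[1], [1], [1], [1]]
  print(betti([d0, d1, d2, d3], [4, 6, 4, 1]))   # [1, 0, 0, 0]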

Euler-Poincaré Theorem. The Euler characteristic of a simplicial complex is the alternating sum of simplex numbers,

    χ = Σp≥0 (−1)^p np.

Recalling that np is the rank of the p-th chain group and that it equals the rank of the p-th cycle group plus the rank of the (p − 1)-st boundary group, we get

    χ = Σp≥0 (−1)^p (zp + bp−1)
      = Σp≥0 (−1)^p (zp − bp),

which is the same as the alternating sum of Betti numbers. To appreciate the beauty of this result, we need to know that the Betti numbers do not depend on the triangulation chosen for the space. The proof of this property is technical and omitted. This now implies that the Euler characteristic is an invariant of the space, same as the Betti numbers.

EULER-POINCARÉ THEOREM. χ = Σp≥0 (−1)^p βp.


Fifth Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is November 13.

Problem 1. (20 points). Let G = (V, E) be a maximally connected planar graph and recall that [k] = {1, 2, ..., k}. A vertex k-coloring is a mapping γ : V → [k] such that γ(u) ≠ γ(v) whenever u ≠ v are adjacent, and an edge k-coloring is a mapping η : E → [k] such that η(e) ≠ η(f) whenever e ≠ f bound a common triangle. Prove that if G has a vertex 4-coloring then it also has an edge 3-coloring.

Problem 2. (20 = 10 + 10 points). Let K be a set of triangles together with their edges and vertices. The vertices are represented by a linear array, as usual, but there is no particular ordering information in the way the edges and triangles are given. In other words, the edges are just a list of index pairs and the triangles are a list of index triplets into the vertex array.
(a) Give an algorithm that decides whether or not K is a triangulation of a 2-manifold.
(b) Analyze your algorithm and collect credit points if the running time of your algorithm is linear in the number of triangles.

Problem 3. (20 = 5 + 7 + 8 points). Determine the type of 2-manifold with boundary obtained by the following constructions.
(a) Remove a cylinder from a torus in such a way that the rest of the torus remains connected.
(b) Remove a disk from the projective plane.
(c) Remove a Möbius strip from a Klein bottle.
Whenever we remove a piece, we do this like cutting with scissors so that the remainder is still closed, in each case a 2-manifold with boundary.

Problem 4. (20 = 5 + 5 + 5 + 5 points). Recall that the sphere is the space of points at unit distance from the origin in three-dimensional Euclidean space, S2 = {x ∈ R3 | ‖x‖ = 1}.
(a) Give a triangulation of S2.
(b) Give the corresponding boundary matrices.
(c) Reduce the boundary matrices.
(d) Give the Betti numbers of S2.

Problem 5. (20 = 10 + 10 points). The dunce cap is obtained by gluing the three edges of a triangular sheet of paper to each other. [After gluing the first two edges you get a cone, with the glued edges forming a seam connecting the cone point with the rim. In the final step, wrap the seam around the rim, gluing all three edges to each other. To imagine how this works, it might help to think of the final result as similar to the shell of a snail.]
(a) Is the dunce cap a 2-manifold? Justify your answer.
(b) Give a triangulation of the dunce cap, making sure that no two edges connect the same two vertices and no two triangles connect the same three vertices.


VI

GEOMETRIC ALGORITHMS

20 Plane-Sweep
21 Delaunay Triangulations
22 Alpha Shapes
Sixth Homework Assignment

20 Plane-Sweep
Plane-sweep is an algorithmic paradigm that emerges in the study of two-dimensional geometric problems. The idea is to sweep the plane with a line and perform the computations in the sequence the data is encountered. In this section, we solve three problems with this paradigm: we construct the convex hull of a set of points, we triangulate the convex hull using the points as vertices, and we test a set of line segments for crossings.

Convex hull. Let S be a finite set of points in the plane, each given by its two coordinates. The convex hull of S, denoted by conv S, is the smallest convex set that contains S. Figure 91 illustrates the definition for a set of nine points. Imagine the points as solid nails in a planar board. An intuitive construction stretches a rubber band around the nails. After letting go, the nails prevent the complete relaxation of the rubber band, which will then trace the boundary of the convex hull.

Figure 91: The convex hull of nine points, which we represent by the counterclockwise sequence of boundary vertices: 1, 3, 6, 8, 9, 2.

To construct the counterclockwise cyclic sequence of boundary vertices representing the convex hull, we sweep a vertical line from left to right over the data. At any moment in time, the points in front (to the right) of the line are untouched and the points behind (to the left) of the line have already been processed.

Step 1. Sort the points from left to right and relabel them in this sequence as x1, x2, ..., xn.
Step 2. Construct a counterclockwise triangle from the first three points: x1 x2 x3 or x1 x3 x2.
Step 3. For i from 4 to n, add the next point xi to the convex hull of the preceding points by finding the two lines that pass through xi and support the convex hull.

The algorithm is illustrated in Figure 92, which shows the addition of the sixth point in the data set.

Figure 92: The vertical sweep-line passes through point 6. To add 6, we substitute 6 for the sequence of vertices on the boundary between 3 and 5.

Orientation test. A critical test needed to construct the convex hull is to determine the orientation of a sequence of three points. In other words, we need to be able to distinguish whether we make a left-turn or a right-turn as we go from the first to the middle and then the last point in the sequence. A convenient way to determine the orientation evaluates the determinant of a three-by-three matrix. More precisely, the points a = (a1, a2), b = (b1, b2), c = (c1, c2) form a left-turn iff

        | 1  a1  a2 |
    det | 1  b1  b2 |  >  0.
        | 1  c1  c2 |

The three points form a right-turn iff the determinant is negative and they lie on a common line iff the determinant is zero.

  boolean LEFT(Points a, b, c)
    return [a1(b2 − c2) + b1(c2 − a2) + c1(a2 − b2) > 0].

To see that this formula is correct, we may convince ourselves that it is correct for three non-collinear points, e.g. a = (0, 0), b = (1, 0), and c = (0, 1). Remember also that the determinant measures twice the signed area of the triangle and is therefore a continuous function that passes through zero only when the three points are collinear. Since we can continuously move every left-turn to every other left-turn without leaving the class of left-turns, it follows that the sign of the determinant is the same for all of them.
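The predicate translates directly into code. A minimal Python version; with integer coordinates the sign is exact, while with floating-point coordinates it is subject to round-off, an issue the notes do not address.

  def left(a, b, c):
      # sign of the determinant: positive iff a, b, c make a left-turn
      return a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]) > 0

  print(left((0, 0), (1, 0), (0, 1)))   # True: a left-turn
  print(left((0, 0), (1, 0), (2, 0)))   # False: collinear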


Finding support lines. We use a doubly-linked cyclic list of vertices to represent the convex hull boundary. Each node in the list contains pointers to the next and the previous nodes. In addition, we have a pointer last to the last vertex added to the list. This vertex is also the rightmost in the list. We add the i-th point by connecting it to the vertices µ → pt and λ → pt identified in a counterclockwise and a clockwise traversal of the cycle starting at last, as illustrated in Figure 93. We simplify notation by using nodes in the parameter list of the orientation test instead of the points they store.

Figure 93: The upper support line passes through the first point µ → pt that forms a left-turn from ν → pt to µ → next → pt.

  µ = λ = last; create new node with ν → pt = i;
  while RIGHT(ν, µ, µ → next) do µ = µ → next endwhile;
  while LEFT(ν, λ, λ → prev) do λ = λ → prev endwhile;
  ν → next = µ; ν → prev = λ;
  µ → prev = λ → next = ν; last = ν.

The effort to add the i-th point can be large, but if it is then we remove many previously added vertices from the list. Indeed, each iteration of the for-loop adds only one vertex to the cyclic list. We charge $2 for the addition, one dollar for the cost of adding and the other to pay for the future deletion, if any. The extra dollars pay for all iterations of the while-loops, except for the first and the last. This implies that we spend only constant amortized time per point. After sorting the points from left to right, we can therefore construct the convex hull of n points in time O(n).
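For illustration, here is a compact Python sketch of the same left-to-right sweep. To keep it short, it builds the lower and the upper boundary chains in plain lists instead of maintaining the doubly-linked cyclic list, so it is a variant of the pseudocode above, not a transcription; the amortized argument is the same.

  def left(a, b, c):
      return a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]) > 0

  def convex_hull(points):
      pts = sorted(set(points))                 # Step 1: sort left to right
      if len(pts) <= 2:
          return pts
      lower = []
      for p in pts:                             # lower chain, amortized O(n)
          while len(lower) >= 2 and not left(lower[-2], lower[-1], p):
              lower.pop()
          lower.append(p)
      upper = []
      for p in reversed(pts):                   # upper chain, right to left
          while len(upper) >= 2 and not left(upper[-2], upper[-1], p):
              upper.pop()
          upper.append(p)
      # counterclockwise cycle of boundary vertices, as in Figure 91
      return lower[:-1] + upper[:-1]

  print(convex_hull([(0, 0), (2, 1), (1, 1), (3, 0), (1, 3)]))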

Triangulation. The same plane-sweep algorithm can be used to decompose the convex hull into triangles. All we need to change is that points and edges are never removed and a new point is connected to every point examined during the two while-loops. We define a (geometric) triangulation of a finite set of points S in the plane as a maximally connected straight-line embedding of a planar graph whose vertices are mapped to points in S. Figure 94 shows the triangulation of the nine points in Figure 91 constructed by the plane-sweep algorithm.

Figure 94: Triangulation constructed with the plane-sweep algorithm.

A triangulation is not necessarily a maximally connected planar graph since the prescribed placement of the points fixes the boundary of the outer face to be the boundary of the convex hull. Letting k be the number of edges of that boundary, we would have to add k − 3 more edges to get a maximally connected planar graph. It follows that the triangulation has m = 3n − (k + 3) edges and ℓ = 2n − (k + 2) triangles.

Line segment intersection. As a third application of the plane-sweep paradigm, we consider the problem of deciding whether or not n given line segments have pairwise disjoint interiors. We allow line segments to share endpoints but we do not allow them to cross or to overlap. We may interpret this problem as deciding whether or not a straight-line drawing of a graph is an embedding. To simplify the description of the algorithm, we assume no three endpoints are collinear, so we only have to worry about crossings and not about other overlaps. How can we decide whether or not a line segment with endpoints u = (u1, u2) and v = (v1, v2) crosses another line segment with endpoints p = (p1, p2) and q = (q1, q2)? Figure 95 illustrates the question by showing the four different cases of how two line segments and the lines they span can intersect. The line segments cross iff uv intersects the line of pq and pq intersects the line of uv. This condition can be checked using the orientation test.

  boolean CROSS(Points u, v, p, q)
    return [(LEFT(u, v, p) xor LEFT(u, v, q)) and
            (LEFT(p, q, u) xor LEFT(p, q, v))].

We can use the above function to test all (n choose 2) pairs of line segments, which takes time O(n²).
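The crossing test is again a direct transcription, with xor written as != on booleans:

  def left(a, b, c):
      return a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]) > 0

  def cross(u, v, p, q):
      return (left(u, v, p) != left(u, v, q)) and (left(p, q, u) != left(p, q, v))

  print(cross((0, 0), (2, 2), (0, 2), (2, 0)))   # True: the segments cross
  print(cross((0, 0), (1, 1), (2, 2), (3, 3)))   # False: no crossing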

Figure 95: Three pairs of non-crossing and one pair of crossing line segments.

Plane-sweep algorithm. We obtain a faster algorithm by sweeping the plane with a vertical line from left to right, as before. To avoid special cases, we assume that no two endpoints are the same or lie on a common vertical line. During the sweep, we maintain the subset of line segments that intersect the sweep-line in the order they meet the line, as shown in Figure 96. We store this subset in a dictionary, which is updated at every endpoint. Only line segments that are adjacent in the ordering along the sweep-line are tested for crossings. Indeed, two line segments that cross are adjacent right before the sweep-line passes through the crossing, if not earlier.

Figure 96: Five of the line segments intersect the sweep-line at its current position and two of them cross.

Step 1. Sort the 2n endpoints from left to right and relabel them in this sequence as x1, x2, ..., x2n. Each point still remembers the index of the other endpoint of its line segment.
Step 2. For i from 1 to 2n, process the i-th endpoint as follows:

  Case 2.1 xi is the left endpoint of the line segment xi xj. Therefore, i < j. Insert xi xj into the dictionary and let uv and pq be its predecessor and successor. If CROSS(u, v, xi, xj) or CROSS(p, q, xi, xj) then report the crossing and stop.

  Case 2.2 xi is the right endpoint of the line segment xi xj. Therefore, i > j. Let uv and pq be the predecessor and the successor of xi xj. If CROSS(u, v, p, q) then report the crossing and stop. Delete xi xj from the dictionary.

We do an insertion into the dictionary for each left endpoint and a deletion from the dictionary for each right endpoint, both in time O(log n). In addition, we do at most two crossing tests per endpoint, which takes constant time. In total, the algorithm takes time O(n log n) to test whether a set of n line segments contains two that cross.

21 Delaunay Triangulations
The triangulations constructed by plane-sweep are typically of inferior quality, that is, there are many long and skinny triangles and therefore many small and many large angles. We study Delaunay triangulations, which distinguish themselves from all other triangulations by a number of nice properties, including that they can be computed by fast algorithms and that they avoid small angles to the extent possible.

Plane-sweep versus Delaunay triangulation. Figures 97 and 98 show two triangulations of the same set of points, one constructed by plane-sweep and the other the Delaunay triangulation.

Figure 97: Triangulation constructed by plane-sweep. Points on the same vertical line are processed from bottom to top.

Figure 98: Delaunay triangulation of the same twenty-one points triangulated in Figure 97.

The angles in the Delaunay triangulation seem consistently larger than those in the plane-sweep triangulation. This is not a coincidence and it can be proved that the Delaunay triangulation maximizes the minimum angle for every input set. Both triangulations contain the edges that bound the convex hull of the input set.

Voronoi diagram. We introduce the Delaunay triangulation indirectly, by first defining a particular decomposition of the plane into regions, one per point in the finite data set S. The region of the point u in S contains all points x in the plane that are at least as close to u as to any other point in S, that is,

    Vu = {x ∈ R2 | ‖x − u‖ ≤ ‖x − v‖ for all v ∈ S},

where ‖x − u‖ = [(x1 − u1)² + (x2 − u2)²]^{1/2} is the Euclidean distance between the points x and u. We refer to Vu as the Voronoi region of u. It is closed and its boundary consists of Voronoi edges which Vu shares with neighboring Voronoi regions. A Voronoi edge ends in Voronoi vertices which it shares with other Voronoi edges. The Voronoi diagram of S is the collection of Voronoi regions, edges and vertices. Figure 99 illustrates the definitions. Let n be the number of points in S. We list some of the properties that will be important later.

• Each Voronoi region is a convex polygon constructed as the intersection of n − 1 closed half-planes.
• The Voronoi region Vu is bounded (finite) iff u lies in the interior of the convex hull of S.
• The Voronoi regions have pairwise disjoint interiors and together cover the entire plane.

Figure 99: The (solid) Voronoi diagram drawn above the (dotted) Delaunay triangulation of the same twenty-one points triangulated in Figures 97 and 98. Some of the Voronoi edges are too far out to fit into the picture.

Delaunay triangulation. We define the Delaunay triangulation as the straight-line dual of the Voronoi diagram. Specifically, for every pair of Voronoi regions Vu and Vv that share an edge, we draw the line segment from u to v. By construction, every Voronoi vertex, x, has j ≥ 3 closest input points. Usually there are exactly three closest
points, u, v, w, in which case the triangle they span belongs to the Delaunay triangulation. Note that x is equally far from u, v, and w and further from all other points in S. This implies the empty circle property of Delaunay triangles: all points of S − {u, v, w} lie outside the circumscribed circle of uvw. Similarly, for each Delaunay edge uv, there is a circle that passes through u and v such that all points of S − {u, v} lie outside the circle. For example, the circle centered at the midpoint of the Voronoi edge shared by Vu and Vv is empty in this sense. This property can be used to prove that the edge skeleton of the Delaunay triangulation is a straight-line embedding of a planar graph.

Figure 100: A Voronoi vertex of degree 5 and the corresponding pentagon in the Delaunay triangulation. The dotted edges complete the triangulation by decomposing the pentagon into three triangles.

Now suppose there is a vertex with degree j > 3. It corresponds to a polygon with j > 3 edges in the Delaunay triangulation, as illustrated in Figure 100. Strictly speaking, the Delaunay triangulation is no longer a triangulation but we can complete it to a triangulation by decomposing each j-gon into j − 2 triangles. This corresponds to perturbing the data points ever so slightly such that the degree-j Voronoi vertices are resolved into trees in which j − 2 degree-3 vertices are connected by j − 3 tiny edges.

Local Delaunayhood. Given a triangulation of a finite point set S, we can test whether or not it is the Delaunay triangulation by testing each edge against the two triangles that share the edge. Suppose the edge uv in the triangulation T is shared by the triangles uvp and uvq. We call uv locally Delaunay, or lD for short, if q lies on or outside the circle that passes through u, v, p. The condition is symmetric in p and q because the circle that passes through u, v, q intersects the first circle in points u and v. It follows that p lies on or outside the circle of u, v, q iff q lies on or outside the circle of u, v, p. We also call uv locally Delaunay if it bounds the convex hull of S and thus belongs to only one triangle. The local condition on the edges implies a global property.

DELAUNAY LEMMA. If every edge in a triangulation K of S is locally Delaunay then K is the Delaunay triangulation of S.

Although every edge of the Delaunay triangulation is locally Delaunay, the Delaunay Lemma is not trivial. Indeed, K may contain edges that are locally Delaunay but do not belong to the Delaunay triangulation, as shown in Figure 101. We omit the proof of the lemma.

Figure 101: The edge uv is locally Delaunay but does not belong to the Delaunay triangulation.

Edge-flipping. The Delaunay Lemma suggests we construct the Delaunay triangulation by first constructing an arbitrary triangulation of the point set S and then modifying it locally to make all edges lD. The idea is to look for non-lD edges and to flip them, as illustrated in Figure 102.

Figure 102: The edge uv is non-lD and can be flipped to the edge pq, which is lD.

Indeed, if uv is a non-lD edge shared by triangles uvp and uvq then upvq is a convex quadrilateral and flipping uv means substituting one diagonal for the other, namely pq

for uv. Note that if uv is non-lD then pq is lD. It is important that the algorithm finds non-lD edges quickly. For this purpose, we use a stack of edges. Initially, we push all edges on the stack and mark them.

  while stack is non-empty do
    pop edge uv from stack and unmark it;
    if uv is non-lD then
      substitute pq for uv;
      for ab ∈ {up, pv, vq, qu} do
        if ab is unmarked then push ab on the stack and mark it endif
      endfor
    endif
  endwhile.

The marks avoid multiple copies of the same edge on the stack. This implies that at any one moment the size of the stack is less than 3n. Note also that initially the stack contains all non-lD edges and that this property is maintained as an invariant of the algorithm. The Delaunay Lemma implies that when the algorithm halts, which is when the stack is empty, then the triangulation is the Delaunay triangulation. However, it is not yet clear that the algorithm terminates. Indeed, the stack can grow and shrink during the course of the algorithm, which makes it difficult to prove that it ever runs empty.

In-circle test. Before studying the termination of the algorithm, we look into the question of distinguishing lD from non-lD edges. As before we assume that the edge uv is shared by the triangles uvp and uvq in the current triangulation. Recall that uv is lD iff q lies outside the circle that passes through u, v, p. Let f : R2 → R be defined by f(x) = x1² + x2². As illustrated in Figure 103, the graph of this function is a paraboloid in three-dimensional space and we write x+ = (x1, x2, f(x)) for the vertical projection of the point x onto the paraboloid. Assuming the three points u, v, p do not lie on a common line, the points u+, v+, p+ lie on a non-vertical plane that is the graph of a function h(x) = αx1 + βx2 + γ. The projection of the intersection of the paraboloid and the plane back into R2 is given by

    0 = f(x) − h(x)
      = x1² + x2² − αx1 − βx2 − γ,

which is the equation of a circle. This circle passes through u, v, p so it is the circle we have to compare q against.

Figure 103: The plane passing through u+, v+, p+ intersects the paraboloid in an ellipse whose projection into R2 passes through the points u, v, p. The point q+ lies below the plane iff q lies inside the circle.

We note that q lies inside the circle iff q+ lies below the plane. The latter test can be based on the sign of the determinant of the 4-by-4 matrix

        [ 1  u1  u2  u1² + u2² ]
    ∆ = [ 1  v1  v2  v1² + v2² ]
        [ 1  p1  p2  p1² + p2² ]
        [ 1  q1  q2  q1² + q2² ].

Exchanging two rows in the matrix changes the sign. While the in-circle test should be insensitive to the order of the first three points, the sign of the determinant is not. We correct the change using the sign of the determinant of the 3-by-3 matrix that keeps track of the ordering of u, v, p along the circle,

        [ 1  u1  u2 ]
    Γ = [ 1  v1  v2 ]
        [ 1  p1  p2 ].

Now we claim that q is inside the circle of u, v, p iff the two determinants have opposite signs:

  boolean INCIRCLE(Points u, v, p, q)
    return det Γ · det ∆ < 0.
We first show that the boolean function is correct for u = (0, 0), v = (1, 0), p = (0, 1), and q = (0, 0.5). The sign of the product of determinants remains unchanged if we continuously move the points and avoid the configurations that make either determinant zero, which are when u, v, p are collinear and when u, v, p, q are cocircular. We can change any configuration where q is inside the circle of u, v, p continuously into the special configuration without going through zero, which implies the correctness of the function for general input points.
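A Python sketch of the test, expanding det ∆ along its column of ones; it inherits the assumptions stated in the text, namely that u, v, p are not collinear and that no four of the points are cocircular.

  def det3(m):
      return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

  def in_circle(u, v, p, q):
      lift = lambda x: [x[0], x[1], x[0] ** 2 + x[1] ** 2]
      gamma = det3([[1, u[0], u[1]], [1, v[0], v[1]], [1, p[0], p[1]]])
      rows = [lift(u), lift(v), lift(p), lift(q)]
      # cofactor expansion of the 4-by-4 matrix along its first column of 1s
      delta = sum((-1) ** i * det3(rows[:i] + rows[i + 1:]) for i in range(4))
      return gamma * delta < 0

  print(in_circle((0, 0), (1, 0), (0, 1), (0.5, 0.5)))   # True: q inside
  print(in_circle((0, 0), (1, 0), (0, 1), (2, 2)))       # False: q outside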


Termination and running time. To prove that the edge-flip algorithm terminates, we imagine the triangulation lifted to R3. We do this by projecting the vertices vertically onto the paraboloid, as before, and connecting them with straight edges and triangles in space. Let uv be an edge shared by triangles uvp and uvq that is flipped to pq by the algorithm. It follows that the line segments uv and pq cross and their endpoints form a convex quadrilateral, as shown in Figure 104. After lifting the two line segments, we get

u+v+ passing above p+q+.

Figure 104: A flip in the plane lifts to a tetrahedron in space in which the lD edge passes below the non-lD edge.

We may thus think of the flip as gluing the tetrahedron u+v+p+q+ underneath the surface obtained by lifting the triangulation. The surface is pushed down by each flip and never pushed back up. The removed edge is now above the new surface and can therefore not be reintroduced by a later flip. It follows that the algorithm performs at most (n choose 2) flips and thus takes at most time O(n²) to construct the Delaunay triangulation of S. There are faster algorithms that work in time O(n log n) but we prefer the suboptimal method because it is simpler and it reveals more about Delaunay triangulations than the other algorithms.

The lifting of the input points to R3 leads to an interesting interpretation of the edge-flip algorithm. Starting with a monotone triangulated surface passing through the lifted points, we glue tetrahedra below the surface until we reach the unique convex surface that passes through the points. The projection of this convex surface is the Delaunay triangulation of the points in the plane. This also gives a reinterpretation of the Delaunay Lemma in terms of convex and concave edges of the surface.


22 Alpha Shapes
Many practical applications of geometry have to do with the intuitive but vague concept of the shape of a finite point set. To make this idea concrete, we use the distances between the points to identify subcomplexes of the Delaunay triangulation that represent that shape at different levels of resolution.

Union of disks. Let S be a set of n points in R2. For each r ≥ 0, we write Bu(r) = {x ∈ R2 | ‖x − u‖ ≤ r} for the closed disk with center u and radius r. Let

    U(r) = ∪u∈S Bu(r)

be the union of the n disks. We decompose this union into convex sets of the form Ru(r) = Bu(r) ∩ Vu. Then

(i) Ru(r) is closed and convex for every point u ∈ S and every radius r ≥ 0;
(ii) Ru(r) and Rv(r) have disjoint interiors whenever the two points, u and v, are different;
(iii) U(r) = ∪u∈S Ru(r).

We illustrate this decomposition in Figure 105. Each region Ru(r) is the intersection of n − 1 closed half-planes and a closed disk. All these sets are closed and convex, which implies (i). The Voronoi regions have disjoint interiors, which implies (ii). Finally, take a point x ∈ U(r) and let u be a point in S with x ∈ Vu. Then x ∈ Bu(r) and therefore x ∈ Ru(r). This implies (iii).

Figure 105: The Voronoi decomposition of a union of eight disks in the plane and superimposed dual alpha complex.

Nerve. Similar to defining the Delaunay triangulation as the dual of the Voronoi diagram, we define the alpha complex as the dual of the Voronoi decomposition of the union of disks. This time around, we do this more formally. Letting C be a finite collection of sets, the nerve of C is the system of subcollections that have a non-empty common intersection,

    Nrv C = {X ⊆ C | ∩X ≠ ∅}.

This is an abstract simplicial complex since ∩X ≠ ∅ and Y ⊆ X implies ∩Y ≠ ∅. For example, if C is the collection of Voronoi regions then Nrv C is an abstract version of the Delaunay triangulation. More specifically, this is true provided the points are in general position and in particular no four points lie on a common circle. We will assume this for the remainder of this section. We say the Delaunay triangulation is a geometric realization of Nrv C, namely the one obtained by mapping each Voronoi region (a vertex in the abstract simplicial complex) to the generating point. All edges and triangles are just convex hulls of their incident vertices. To go from the Delaunay triangulation to the alpha complex, we substitute the regions Ru(r) for the Vu. Specifically,

    Alpha(r) = Nrv {Ru(r) | u ∈ S}.

Clearly, this is isomorphic to a subcomplex of the nerve of Voronoi regions. We can therefore draw Alpha(r) as a subcomplex of the Delaunay triangulation; see Figure 105. We call this geometric realization of Alpha(r) the alpha complex for radius r, denoted as A(r). The alpha shape for the same radius is the underlying space of the alpha complex, |A(r)|. The nerve preserves the way the union is connected. In particular, their Betti numbers are the same, that is, βp(U(r)) = βp(A(r)) for all dimensions p and all radii r. This implies that the union and the alpha shape have the same number of components and the same number of holes. For example, in Figure 105 both have one component and two holes. We omit the proof of this property.

Filtration. We are interested in the sequence of alpha shapes as the radius grows from zero to infinity. Since growing r grows the regions Ru(r), the nerve can only get bigger. In other words, A(r) ⊆ A(s) whenever r ≤ s. There are only finitely many subcomplexes of the Delaunay triangulation. Hence, we get a finite sequence of alpha complexes. Writing Ai for the i-th alpha complex, we get the following nested sequence,

    S = A1 ⊂ A2 ⊂ ... ⊂ Ak = D,

where D denotes the Delaunay triangulation of S. We call such a sequence of complexes a filtration. We illustrate this construction in Figure 106.

Figure 106: A finite sequence of unions of disks, all decomposed by the same Voronoi diagram.

The sequence of alpha complexes begins with a set of n isolated vertices, the points in S. To go from one complex to the next, we either add an edge, we add a triangle, or we add a pair consisting of a triangle with one of its edges. In Figure 106, we begin with eight vertices and get the following sequence of alpha complexes:

    A1 = {a, b, c, d, e, f, g, h};
    A2 = A1 ∪ {ah};
    A3 = A2 ∪ {bc};
    A4 = A3 ∪ {ab, ef};
    A5 = A4 ∪ {de};
    A6 = A5 ∪ {gh};
    A7 = A6 ∪ {cd};
    A8 = A7 ∪ {fg};
    A9 = A8 ∪ {cg}.

Going from A7 to A8, we get for the first time a 1-cycle, which bounds a hole in the embedding. In A9, this hole is cut into two. This is the alpha complex depicted in Figure 105. We continue:

    A10 = A9 ∪ {cf};
    A11 = A10 ∪ {abh, bh};
    A12 = A11 ∪ {cde, ce};
    A13 = A12 ∪ {cfg};
    A14 = A13 ∪ {cef};
    A15 = A14 ∪ {bch, ch};
    A16 = A15 ∪ {cgh}.

At this moment, we have a triangulated disk but not yet the entire Delaunay triangulation since the triangle bcd and the edge bd are still missing. Each step is generic except when we add two equally long edges to A3.

Compatible ordering of simplices. We can represent the entire filtration of alpha complexes compactly by sorting the simplices in the order they join the growing complex. An ordering σ1, σ2, ..., σm of the Delaunay simplices is compatible with the filtration if

1. the simplices in Ai precede the ones not in Ai for each i;
2. the faces of a simplex precede the simplex.

For example, the sequence

    a, b, c, d, e, f, g, h; ah; bc; ab, ef; de; gh; cd; fg; cg; cf; bh, abh; ce, cde; cfg; cef; ch, bch; cgh; bd; bcd

is compatible with the filtration in Figure 106. Every alpha complex is a prefix of the compatible sequence but not necessarily the other way round. Condition 2 guarantees that every prefix is a complex, whether an alpha complex or not. We thus get a finer filtration of complexes

    ∅ = K0 ⊂ K1 ⊂ ... ⊂ Km = D,

where Ki is the set of simplices from σ1 to σi. To construct the compatible ordering, we just need to compute for each Delaunay simplex the radius ri = r(σi) such that σi ∈ A(r) iff r ≥ ri. For a vertex, this radius is zero. For a triangle, this is the radius of the circumcircle. For an edge, we have two cases. Let ϕ and ψ be the angles opposite the edge σi inside the two incident triangles. We have ϕ + ψ > 180° because of the empty circle property.

CASE 1. ϕ < 90° and ψ < 90°. Then ri = r(σi) is half the length of the edge.

Figure 107: Left: the middle edge belongs to two acute triangles. Right: it belongs to an obtuse and an acute triangle.

CASE 2. ϕ ≥ 90°. Then ri = rj, where σj is the incident triangle with angle ϕ.

Both cases are illustrated in Figure 107. In Case 2, the edge σi enters the growing alpha complex together with the triangle σj. The total number of simplices in the Delaunay triangulation is m < 6n. The threshold radii can be computed in time O(n). Sorting the simplices into the compatible ordering can therefore be done in time O(n log n).

Betti numbers. In two dimensions, Betti numbers can be computed directly, without resorting to boundary matrices. The only two possibly non-zero Betti numbers are β0, the number of components, and β1, the number of holes. We compute the Betti numbers of Kj by adding the simplices in order.

  β0 = β1 = 0;
  for i = 1 to j do
    switch dim σi:
      case 0: β0 = β0 + 1;
      case 1: let u, v be the endpoints of σi;
              if FIND(u) = FIND(v) then β1 = β1 + 1
                else β0 = β0 − 1; UNION(u, v)
              endif
      case 2: β1 = β1 − 1
    endswitch
  endfor.

All we need is to tell apart the two cases when σi is an edge. This is done using a union-find data structure maintaining the components of the alpha complex in amortized time α(n) per simplex. The total running time of the algorithm for computing Betti numbers is therefore O(nα(n)).
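A Python sketch of this computation: the input is a compatible ordering with each simplex given as a tuple of vertex ids, and the union-find shown uses only path compression, a simplification of the full structure with union by rank.

  def betti_2d(simplices):
      parent = {}
      def find(u):
          while parent[u] != u:
              parent[u] = parent[parent[u]]    # path compression
              u = parent[u]
          return u
      beta0 = beta1 = 0
      for sigma in simplices:
          if len(sigma) == 1:                  # vertex: new component
              parent[sigma[0]] = sigma[0]
              beta0 += 1
          elif len(sigma) == 2:                # edge
              ru, rv = find(sigma[0]), find(sigma[1])
              if ru == rv:
                  beta1 += 1                   # the edge closes a 1-cycle
              else:
                  parent[ru] = rv              # the edge merges two components
                  beta0 -= 1
          else:                                # triangle: fills a hole
              beta1 -= 1
      return beta0, beta1

  # the prefix of the compatible ordering that spells out A9, which the
  # text says has one component and two holes
  order = [('a',), ('b',), ('c',), ('d',), ('e',), ('f',), ('g',), ('h',),
           ('a','h'), ('b','c'), ('a','b'), ('e','f'), ('d','e'),
           ('g','h'), ('c','d'), ('f','g'), ('c','g')]
  print(betti_2d(order))                       # (1, 2)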


Sixth Homework Assignment
Write the solution to each problem on a single page. The deadline for handing in solutions is November 25.

Problem 1. (20 points). Let S be a set of n unit disks in the Euclidean plane, each given by its center and radius, which is one. Give an algorithm that decides whether any two of the disks in S intersect.

Problem 2. (20 = 10 + 10 points). Let S be a set of n points in the Euclidean plane. The Gabriel graph connects points u, v ∈ S with a straight edge if

    ‖u − v‖² ≤ ‖u − p‖² + ‖v − p‖²

for every point p in S.
(a) Show that the Gabriel graph is a subgraph of the edge skeleton of the Delaunay triangulation.
(b) Is the Gabriel graph necessarily connected? Justify your answer.

Problem 3. (20 = 10 + 10 points). Consider a set of n ≥ 3 closed disks in the Euclidean plane. The disks are allowed to touch but no two of them have an interior point in common.
(a) Show that the number of touching pairs of disks is at most 3n − 6.
(b) Give a construction that achieves the upper bound in (a) for any n ≥ 3.

Problem 4. (20 = 10 + 10 points). Let K be a triangulation of a set of n ≥ 3 points in the plane. Let L be a line that avoids all the points.
(a) Prove that L intersects at most 2n − 4 of the edges in K.
(b) Give a construction for which L achieves the upper bound in (a) for any n ≥ 3.

Problem 5. (20 points). Let S be a set of n points in the Euclidean plane, consider its Delaunay triangulation and the corresponding filtration of alpha complexes, S = A1 ⊂ A2 ⊂ ... ⊂ Ak. Under what conditions is it true that Ai and Ai+1 differ by a single simplex for every 1 ≤ i ≤ k − 1?


VII

NP-COMPLETENESS

23 Easy and Hard Problems
24 NP-Complete Problems
25 Approximation Algorithms
Seventh Homework Assignment

23 Easy and Hard Problems
The theory of NP-completeness is an attempt to draw a line between tractable and intractable problems. The most important question is whether there is indeed a difference between the two, and this question is still unanswered. Typical results are therefore relative statements such as "if problem B has a polynomial-time algorithm then so does problem C" and its equivalent contra-positive "if problem C has no polynomial-time algorithm then neither has problem B". The second formulation suggests we remember hard problems C and for a new problem B we first see whether we can prove the implication. If we can then we may not want to even try to solve problem B efficiently. A good deal of formalism is necessary for a proper description of results of this kind, of which we will introduce only a modest amount.

What is a problem? An abstract decision problem is a function I → {0, 1}, where I is the set of problem instances and 0 and 1 are interpreted to mean FALSE and TRUE, as usual. To completely formalize the notion, we encode the problem instances in strings of zeros and ones: I → {0, 1}*. A concrete decision problem is then a function Q : {0, 1}* → {0, 1}. Following the usual convention, we map bit-strings that do not correspond to meaningful problem instances to 0.

As an example consider the shortest-path problem. A problem instance is a graph and a pair of vertices, u and v, in the graph. A solution is a shortest path from u to v, or the length of such a path. The decision problem version specifies an integer k and asks whether or not there exists a path from u to v whose length is at most k. The theory of NP-completeness really only deals with decision problems. Although this is a loss of generality, the loss is not dramatic. For example, given an algorithm for the decision version of the shortest-path problem, we can determine the length of the shortest path by repeated decisions for different values of k. Decision problems are always easier (or at least not harder) than the corresponding optimization problems. So in order to prove that an optimization problem is hard it suffices to prove that the corresponding decision problem is hard.

Polynomial time. An algorithm solves a concrete decision problem Q in time T(n) if for every instance x ∈ {0, 1}* of length n the algorithm produces Q(x) in time at most T(n). Note that this is the worst-case notion of time-complexity. The problem Q is polynomial-time solv-

able if T(n) = O(n^k) for some constant k independent of n. The first important complexity class of problems is

    P = set of concrete decision problems that are polynomial-time solvable.

The problems Q ∈ P are called tractable or easy and the problems Q ∉ P are called intractable or hard. Algorithms that take only polynomial time are called efficient and algorithms that require more than polynomial time are inefficient. In other words, until now in this course we only talked about efficient algorithms and about easy problems. This terminology is adopted because the rather fine-grained classification of algorithms by complexity we practiced until now is not very useful in gaining insights into the rather coarse distinction between polynomial and non-polynomial.

It is convenient to recast the scenario in a formal language framework. A language is a set L ⊆ {0, 1}*. We can think of it as the set of problem instances, x, that have an affirmative answer, Q(x) = 1. An algorithm A : {0, 1}* → {0, 1} accepts x ∈ {0, 1}* if A(x) = 1 and it rejects x if A(x) = 0. The language accepted by A is the set of strings x ∈ {0, 1}* with A(x) = 1. There is a subtle difference between accepting and deciding a language L. The latter means that A accepts every x ∈ L and rejects every x ∉ L. For example, there is an algorithm that accepts every program that halts, but there is no algorithm that decides the language of such programs. Within the formal language framework we redefine the class of polynomial-time solvable problems as

    P = {L ⊆ {0, 1}* | L is accepted by a polynomial-time algorithm}
      = {L ⊆ {0, 1}* | L is decided by a polynomial-time algorithm}.

Indeed, a language that can be accepted in polynomial time can also be decided in polynomial time: we keep track of the time and if too much goes by without x being accepted, we turn around and reject x. This is a non-constructive argument since we may not know the constants in the polynomial. However, we know such constants exist, which suffices to show that a simulation as sketched exists.

Hamiltonian cycles. We use a specific graph problem to introduce the notion of verifying a solution to a problem, as opposed to solving it. Let G = (V, E) be an undirected graph. A hamiltonian cycle contains every vertex


v ∈ V exactly once. The graph G is hamiltonian if it has a hamiltonian cycle. Figure 108 shows a hamiltonian cycle of the edge graph of a Platonic solid. How about the edge graphs of the other four Platonic solids? Define L =

{G | G is hamiltonian}. We can thus ask whether or not L ∈ P, that is, whether or not there is a polynomial-time algorithm that decides whether or not a graph is hamiltonian. The answer to this question is currently not known, but there is evidence that the answer might be negative. On the other hand, suppose y is a hamiltonian cycle of G. The language L′ = {(G, y) | y is a hamiltonian cycle of G} is certainly in P because we just need to make sure that y and G have the same number of vertices and every edge of y is also an edge of G.

Figure 108: The edge graph of the dodecahedron and one of its hamiltonian cycles.

Non-deterministic polynomial time. More generally, it seems easier to verify a given solution than to come up with one. In a nutshell, this is what NP-completeness is about, namely finding out whether this is indeed the case and whether the difference between accepting and verifying can be used to separate hard from easy problems. Call y ∈ {0, 1}* a certificate. An algorithm A verifies a problem instance x ∈ {0, 1}* if there exists a certificate y with A(x, y) = 1. The language verified by A is the set of strings x ∈ {0, 1}* verified by A. We now define a new class of problems,

    NP = {L ⊆ {0, 1}* | L is verified by a polynomial-time algorithm}.

More formally, L is in NP if for every problem instance x ∈ L there is a certificate y whose length is bounded from above by a polynomial in the length of x such that A(x, y) = 1 and A runs in polynomial time. For example, deciding whether or not G is hamiltonian is in NP. The name NP is an abbreviation for non-deterministic polynomial time, because a non-deterministic computer can guess a certificate and then verify that certificate. In a parallel emulation, the computer would generate all possible certificates and then verify them in parallel. Generating one certificate is easy, because it only has polynomial length, but generating all of them is hard, because there are exponentially many strings of polynomial length.
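To make the contrast concrete, here is a Python sketch of a verifier for the language L′ above: it checks a proposed certificate in time polynomial in the size of G. The graph encoding is a choice of ours.

  def verify_hamiltonian(vertices, edges, y):
      # y is the certificate: a proposed ordering of the vertices
      if sorted(y) != sorted(vertices):        # every vertex exactly once
          return False
      edge_set = {frozenset(e) for e in edges}
      return all(frozenset((y[i], y[(i + 1) % len(y)])) in edge_set
                 for i in range(len(y)))       # consecutive pairs are edges

  V = [0, 1, 2, 3]
  E = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
  print(verify_hamiltonian(V, E, [0, 1, 2, 3]))   # True
  print(verify_hamiltonian(V, E, [0, 2, 1, 3]))   # False: 1-3 is not an edge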

Non-deterministic machines are at least as powerful as deterministic machines. It follows that every problem in P is also in NP, P ⊆ NP. Define

    co-NP = {L | L̄ = {x | x ∉ L} ∈ NP},

which is the class of languages whose complement can be verified in non-deterministic polynomial time. It is not known whether or not NP = co-NP. For example, it seems easy to verify that a graph is hamiltonian but it seems hard to verify that a graph is not hamiltonian. We said earlier that if L ∈ P then L̄ ∈ P. Therefore, P ⊆ co-NP. Hence, only the four relationships between the three complexity classes shown in Figure 109 are possible, but at this time we do not know which one is correct.

Figure 109: Four possible relations between the complexity classes P, NP, and co-NP.

Problem reduction. We now develop the concept of reducing one problem to another, which is key in the construction of the class of NP-complete problems. The idea is to map or transform an instance of a first problem to an instance of a second problem and to map the solution to the second problem back to a solution to the first problem. For decision problems, the solutions are the same and need no transformation. Language L1 is polynomial-time reducible to language L2, denoted L1 ≤P L2, if there is a polynomial-time computable function f : {0, 1}* → {0, 1}* such that x ∈ L1 iff f(x) ∈ L2, for all x ∈ {0, 1}*. Now suppose that


L1 is polynomial-time reducible to L2 and that L2 has a polynomial-time algorithm A2 that decides L2 ,
    x  --f-->  f(x)  --A2-->  {0, 1}.

In fact, all truth assignments evaluate to 1, which means that ψ is really a tautology. More generally, a boolean formula, ϕ, is satisfyable iff ¬ϕ is not a tautology. S ATISFIABILITY T HEOREM . We have SAT ∈ NP and L′ ≤P SAT for every L′ ∈ NP. That SAT is in the class NP is easy to prove: just guess an assignment and verify that it satisfies. However, to prove that every L′ ∈ NP can be reduced to SAT in polynomial time is quite technical and we omit the proof. The main idea is to use the polynomial-time algorithm that verifies L′ and to construct a boolean formula from this algorithm. To formalize this idea, we would need a formal model of a computer, a Touring machine, which is beyond the scope of this course.

We can compose the two algorithms and obtain a polynomial-time algorithm A1 = A2 ◦ f that decides L1 . In other words, we gained an efficient algorithm for L1 just by reducing it to L2 . R EDUCTION L EMMA . If L1 ≤P L2 and L2 ∈ P then L1 ∈ P. In words, if L1 is polynomial-time reducible to L2 and L2 is easy then L1 is also easy. Conversely, if we know that L1 is hard then we can conclude that L2 is also hard. This motivates the following definition. A language L ⊆ {0, 1}∗ is NP-complete if (2) L′ ≤P L, for every L′ ∈ NP. Since every L′ ∈ NP is polynomial-time reducible to L, all L′ have to be easy for L to have a chance to be easy. The L′ thus only provide evidence that L might indeed be hard. We say L is NP-hard if it satisfies (2) but not necessarily (1). The problems that satisfy (1) and (2) form the complexity class NPC = {L | L is NP-complete}. (1) L ∈ NP;

All these definitions would not mean much if we could not find any problems in NPC. The first step is the most difficult one. Once we have one problem in NPC we can get others using reductions. Satisfying boolean formulas. Perhaps surprisingly, a first NP-complete problem has been found, namely the problem of satisfiability for logical expressions. A boolean formula, ϕ, consists of variables, x1 , x2 , . . ., operators, ¬, ∧, ∨, =⇒, . . ., and parentheses. A truth assignment maps each variable to a boolean value, 0 or 1. The truth assignment satisfies if the formula evaluates to 1. The formula is satisfiable if there exists a satisfying truth assignment. Define SAT = {ϕ | ϕ is satisfiable}. As an example consider the formula ψ = (x1 =⇒ x2 ) ⇐⇒ (x2 ∨ ¬x1 ).

If we set x1 = x2 = 1 we get (x1 =⇒ x2 ) = 1, (x2 ∨ ¬x1 ) = 1 and therefore ψ = 1. It follows that ψ ∈ SAT.
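The guess-and-verify step is easy to make concrete. Here is a minimal sketch that evaluates a boolean formula under a given truth assignment; the nested-tuple encoding of formulas is an assumption we make only for illustration:

    # A formula is either a variable name or a tuple (op, arg, ...).
    OPS = {
        'not':     lambda a: not a,
        'and':     lambda a, b: a and b,
        'or':      lambda a, b: a or b,
        'implies': lambda a, b: (not a) or b,
        'iff':     lambda a, b: a == b,
    }

    def evaluate(formula, assignment):
        # Evaluate a formula under a truth assignment (dict var -> bool).
        if isinstance(formula, str):          # a variable
            return assignment[formula]
        op, *args = formula
        return OPS[op](*(evaluate(a, assignment) for a in args))

    # The example formula psi evaluates to 1 under x1 = x2 = 1.
    psi = ('iff', ('implies', 'x1', 'x2'), ('or', 'x2', ('not', 'x1')))
    assert evaluate(psi, {'x1': True, 'x2': True})

Checking a certificate is thus a single evaluation; the hardness of SAT lies in finding a satisfying assignment among the 2^n candidates.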


24 NP-Complete Problems
In this section, we discuss a number of NP-complete problems, with the goal of developing a feeling for what hard problems look like. Recognizing hard problems is an important part of forming a reliable judgement of how difficult a problem is and of what the most promising approach to a solution might be. Of course, for NP-complete problems it seems futile to work toward polynomial-time algorithms, and instead we focus on finding approximations or on circumventing the problems altogether. We begin with a result on different ways to write boolean formulas.

Reduction to 3-satisfiability. We call a boolean variable or its negation a literal. A formula is in conjunctive normal form if it is a sequence of clauses connected by ∧s, and each clause is a sequence of literals connected by ∨s. A formula is in 3-CNF if it is in conjunctive normal form and each clause consists of three literals. It turns out that deciding the satisfiability of a boolean formula in 3-CNF is no easier than for a general boolean formula. Define 3-SAT = {ϕ ∈ SAT | ϕ is in 3-CNF}. We prove the above claim by reducing SAT to 3-SAT.

SATISFIABILITY LEMMA. SAT ≤P 3-SAT.

PROOF. We take a boolean formula ϕ and transform it into 3-CNF in three steps.

Step 1. Think of ϕ as an expression and represent it as a binary tree. Each node is an operation that gets its input from its two children and forwards its output to its parent. Introduce a new variable for the output of each node and define a new formula for each node, relating the two input edges with the one output edge. Figure 110 shows the tree representation of the formula ϕ = (x1 =⇒ x2) ⇐⇒ (x2 ∨ ¬x1). The new formula is

    ϕ′ = (y2 ⇐⇒ (x1 =⇒ x2)) ∧ (y3 ⇐⇒ (x2 ∨ ¬x1)) ∧ (y1 ⇐⇒ (y2 ⇐⇒ y3)) ∧ y1.

It should be clear that there is a satisfying assignment for ϕ iff there is one for ϕ′.

Figure 110: The tree representation of the formula ϕ. The root y1 stands for y2 ⇐⇒ y3, its children y2 and y3 stand for x1 =⇒ x2 and x2 ∨ ¬x1, and the leaves are the variables x1 and x2. Incidentally, ϕ is a tautology, which means it is satisfied by every truth assignment. Equivalently, ¬ϕ is not satisfiable.

Step 2. Convert the formula of each node into conjunctive normal form, going through the disjunctive normal form of its negation. The most mechanical way uses the truth table of the formula, as illustrated in Table 6; note that each such formula has at most three variables. For example, the negation of y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the disjunction of the conjunctions in the rightmost column of Table 6. It follows that y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the negation of that disjunction, which by de Morgan's law is

    (y2 ∨ x1 ∨ x2) ∧ (y2 ∨ x1 ∨ ¬x2) ∧ (y2 ∨ ¬x1 ∨ ¬x2) ∧ (¬y2 ∨ ¬x1 ∨ x2).

    y2  x1  x2 | y2 ⇔ (x1 ⇒ x2) | prohibited
     0   0   0 |       0        | ¬y2 ∧ ¬x1 ∧ ¬x2
     0   0   1 |       0        | ¬y2 ∧ ¬x1 ∧ x2
     0   1   0 |       1        |
     0   1   1 |       0        | ¬y2 ∧ x1 ∧ x2
     1   0   0 |       1        |
     1   0   1 |       1        |
     1   1   0 |       0        | y2 ∧ x1 ∧ ¬x2
     1   1   1 |       1        |

Table 6: Conversion of a clause into a disjunction of conjunctions of at most three literals each.

Step 3. The clauses with fewer than three literals can be expanded by adding new variables. For example, a ∨ b is expanded to (a ∨ b ∨ p) ∧ (a ∨ b ∨ ¬p), and (a) is expanded to (a ∨ p ∨ q) ∧ (a ∨ p ∨ ¬q) ∧ (a ∨ ¬p ∨ q) ∧ (a ∨ ¬p ∨ ¬q); a sketch of this step in code follows below.

Each step takes only polynomial time. At the end, we get an equivalent formula in 3-conjunctive normal form. We note that clauses of length three are necessary to make the satisfiability problem hard. Indeed, there is a polynomial-time algorithm that decides the satisfiability of a formula in 2-CNF.
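As referenced in Step 3 above, here is a minimal sketch of the expansion, assuming the common encoding of CNF clauses as lists of nonzero integers, where −v denotes the negation of variable v; the encoding and the names are our own:

    from itertools import count

    def expand_to_3cnf(clauses, num_vars):
        # Pad clauses with one or two literals to exactly three (Step 3);
        # new variables are numbered num_vars + 1, num_vars + 2, ...
        fresh = count(num_vars + 1)
        result = []
        for c in clauses:
            if len(c) == 3:
                result.append(c)
            elif len(c) == 2:                  # (a or b)
                p = next(fresh)
                result += [c + [p], c + [-p]]
            elif len(c) == 1:                  # (a)
                p, q = next(fresh), next(fresh)
                result += [c + [p, q], c + [p, -q],
                           c + [-p, q], c + [-p, -q]]
            else:
                raise ValueError("expected at most three literals")
        return result

Since p and q range over both truth values in the expanded clauses, the expansion is satisfiable iff the original clause is, and the formula grows by at most a constant factor.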

NP-completeness proofs. Using polynomial-time reductions, we can show fairly mechanically that problems are NP-complete, if they are. A key property is the transitivity of ≤P, that is, if L′ ≤P L1 and L1 ≤P L2 then L′ ≤P L2, as can be seen by composing the two polynomial-time computable functions to get a third one.

REDUCTION LEMMA. Let L1, L2 ⊆ {0, 1}∗ and assume L1 ≤P L2. If L1 is NP-hard and L2 ∈ NP then L2 ∈ NPC.


A generic NP-completeness proof thus follows the two steps outlined below.

Step 1. Prove that L2 ∈ NP.
Step 2. Select a known NP-hard problem, L1, and find a polynomial-time computable function, f, with x ∈ L1 iff f (x) ∈ L2.

This is what we did for L2 = 3-SAT and L1 = SAT. Therefore 3-SAT ∈ NPC. Currently, there are thousands of problems known to be NP-complete. This is often considered evidence that P ≠ NP, which can be the case only if P ∩ NPC = ∅, as drawn in Figure 111.

Figure 111: Possible relation between P, NPC, and NP.

Cliques and independent sets. There are many NP-complete problems on graphs. A typical such problem asks for the largest complete subgraph. Define a clique in an undirected graph G = (V, E) as a subgraph (W, F) in which every two vertices are adjacent, that is, F consists of all pairs of distinct vertices in W. Given G and an integer k, the CLIQUE problem asks whether or not there is a clique of k or more vertices.

CLAIM. CLIQUE ∈ NPC.

PROOF. Given k vertices in G, we can verify in polynomial time whether or not they form a complete subgraph. Thus CLIQUE ∈ NP. To prove property (2), we show that 3-SAT ≤P CLIQUE. Let ϕ be a boolean formula in 3-CNF consisting of k clauses. We construct a graph as follows: (i) each clause is replaced by three vertices, one per literal; (ii) two vertices are connected by an edge if they do not belong to the same clause and they are not negations of each other. In a satisfying truth assignment, there is at least one true literal in each clause, and choosing one per clause gives k vertices that form a clique. Conversely, a clique of k or more vertices contains one vertex per clause, because vertices of the same clause are not adjacent; no two of the chosen literals are negations of each other, so setting them all to 1 is consistent and implies a satisfying truth assignment.

It is easy to decide in time O(k² n^{k+2}) whether or not a graph of n vertices has a clique of size k. If k is a constant, the running time of this algorithm is polynomial in n. For the CLIQUE problem to be NP-complete, it is therefore essential that k be a variable that can be arbitrarily large.

We use the NP-completeness of finding large cliques to prove the NP-completeness of finding large sets of pairwise non-adjacent vertices. Let G = (V, E) be an undirected graph. A subset W ⊆ V is independent if none of the vertices in W are adjacent or, equivalently, if no edge in E connects two vertices in W. Given G and an integer k, the INDEPENDENT SET problem asks whether or not there is an independent set of k or more vertices.

CLAIM. INDEPENDENT SET ∈ NPC.

PROOF. It is easy to verify that there is an independent set of size k: just guess a subset of k vertices and verify that no two are adjacent. Thus INDEPENDENT SET ∈ NP.

Figure 112: The four shaded vertices form an independent set in the graph on the left and a clique in the complement graph on the right.

We complete the proof by reducing the CLIQUE problem to the INDEPENDENT SET problem. As illustrated in Figure 112, W ⊆ V is independent iff W defines a clique in the complement graph, in which two vertices are adjacent iff they are not adjacent in G. To prove CLIQUE ≤P INDEPENDENT SET, we transform an instance (H, k) of the CLIQUE problem to the instance (G, k) of the INDEPENDENT SET problem, where G is the complement of H. Then G has an independent set of size k or larger iff H has a clique of size k or larger.
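A minimal sketch of this reduction, assuming graphs are given as dictionaries mapping each vertex to its set of neighbors (the representation and the names are ours):

    def complement(graph):
        # Two distinct vertices are adjacent in the complement graph
        # iff they are not adjacent in the input graph.
        vertices = set(graph)
        return {u: vertices - graph[u] - {u} for u in graph}

    def is_independent(graph, W):
        # Certificate check: no two vertices of W are adjacent.
        return all(v not in graph[u] for u in W for v in W if u != v)

    # An instance (H, k) of CLIQUE becomes the instance
    # (complement(H), k) of INDEPENDENT SET; the answers agree.

The transformation clearly takes polynomial time, which is all the reduction requires.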


Various NP-complete graph problems. We now describe a few NP-complete problems for graphs, without proving that they are indeed NP-complete. Let G = (V, E) be an undirected graph with n vertices and k a positive integer, as before. The following problems defined for G and k are NP-complete.

An ℓ-coloring of G is a function χ : V → [ℓ] with χ(u) ≠ χ(v) whenever u and v are adjacent. The CHROMATIC NUMBER problem asks whether or not G has an ℓ-coloring with ℓ ≤ k. The problem remains NP-complete for every fixed k ≥ 3. For k = 2, the CHROMATIC NUMBER problem asks whether or not G is bipartite, for which there is a polynomial-time algorithm.

The bandwidth of G is the minimum ℓ such that there is a bijection β : V → [n] with |β(u) − β(v)| ≤ ℓ for all adjacent vertices u and v. The BANDWIDTH problem asks whether or not the bandwidth of G is k or less. The problem arises in linear algebra, where we permute rows and columns of a square matrix to move all non-zero elements as close to the diagonal as possible. For example, if the graph is a simple path then the bandwidth is 1, as can be seen in Figure 113. We can transform the adjacency matrix of G such that all non-zero diagonals are at most the bandwidth of G away from the main diagonal.

Figure 113: A simple path on eight vertices and its adjacency matrix, with rows and columns ordered along the path; all non-zero entries lie on the two diagonals next to the main diagonal.

Assume now that the graph G is complete, so that E contains all pairs of distinct vertices, and that each edge, uv, has a positive integer weight, w(uv). The TRAVELING SALESMAN problem asks whether there is a permutation u0, u1, . . . , un−1 of the vertices such that the sum of the edges connecting contiguous vertices (and the last vertex to the first) is k or less,

    ∑_{i=0}^{n−1} w(u_i u_{i+1}) ≤ k,

where indices are taken modulo n. The problem remains NP-complete if w : E → {1, 2} (by reduction from the HAMILTONIAN CYCLE problem), and also if the vertices are points in the plane and the weight of an edge is the Euclidean distance between its two endpoints.

Set systems. Simple graphs are set systems in which the sets contain only two elements. We now list a few NP-complete problems for more general set systems. Letting V be a finite set, C ⊆ 2^V a set system, and k a positive integer, the following problems are NP-complete.

The PACKING problem asks whether or not C has k or more mutually disjoint sets. The problem remains NP-complete if no set in C contains more than three elements, and there is a polynomial-time algorithm if every set contains two elements. In the latter case, the set system is a graph and a maximum packing is a maximum matching.

The COVERING problem asks whether or not C has k or fewer subsets whose union is V. The problem remains NP-complete if no set in C contains more than three elements, and there is a polynomial-time algorithm if every set contains two elements. In the latter case, the set system is a graph and the minimum cover can be constructed in polynomial time from a maximum matching.

Suppose every element v ∈ V has a positive integer weight, w(v). The PARTITION problem asks whether there is a subset U ⊆ V with

    ∑_{u∈U} w(u) = ∑_{v∈V−U} w(v).

The problem remains NP-complete if we require that U and V − U have the same number of elements.
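Membership of these problems in NP again comes down to simple certificate checks. For PARTITION, for example, a minimal sketch (all names are ours) is:

    def verifies_partition(w, U):
        # w: dict mapping each element of V to its positive integer weight
        # U: proposed certificate, a subset of V
        side = sum(w[u] for u in U)
        return 2 * side == sum(w.values())

The certificate has linear size and the check runs in linear time; the hardness of PARTITION lies in finding U, not in checking it.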


25 Approximation Algorithms
Many important problems are NP-hard, and just ignoring them is not an option. There are indeed many things one can do. For problems of small size, even exponential-time algorithms can be effective, and special subclasses of hard problems sometimes have polynomial-time algorithms. We consider a third coping strategy appropriate for optimization problems, which is computing almost optimal solutions in polynomial time. In case the aim is to maximize a positive cost, a ̺(n)-approximation algorithm is one that guarantees to find a solution with cost C ≥ C∗/̺(n), where C∗ is the maximum cost. For minimization problems, we would require C ≤ C∗ · ̺(n). Note that ̺(n) ≥ 1, and if ̺(n) = 1 then the algorithm produces optimal solutions. Ideally, ̺ is a constant, but sometimes even this is not achievable in polynomial time.

Vertex cover. The first problem we consider is finding the minimum set of vertices in a graph G = (V, E) that covers all edges. Formally, a subset V′ ⊆ V is a vertex cover if every edge has at least one endpoint in V′. Observe that V′ is a vertex cover iff V − V′ is an independent set. Finding a minimum vertex cover is therefore equivalent to finding a maximum independent set. Since the latter problem is NP-complete, we conclude that finding a minimum vertex cover is also NP-complete. Here is a straightforward algorithm that achieves approximation ratio ̺(n) = 2, for all n = |V|.

    V′ = ∅; E′ = E;
    while E′ ≠ ∅ do
      select an arbitrary edge uv in E′;
      add u and v to V′;
      remove all edges incident to u or v from E′
    endwhile.

Clearly, V′ is a vertex cover. Using adjacency lists with links between the two copies of an edge, the running time is O(n + m), where m is the number of edges. Furthermore, we have ̺ = 2 because every cover must pick at least one endpoint of each edge uv selected by the algorithm, hence C ≤ 2C∗.

Observe that this result does not imply a constant approximation ratio for the maximum independent set problem. We have |V − V′| = n − C ≥ n − 2C∗, which we have to compare with n − C∗, the size of the maximum independent set. For C∗ = n/2, the approximation ratio is unbounded.
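A minimal sketch of this algorithm, assuming the graph is given as a list of vertex pairs (our choice of representation): a single pass over the edges, skipping those already covered, is equivalent to selecting an uncovered edge and deleting all edges incident to its endpoints.

    def vertex_cover_2approx(edges):
        cover = set()
        for u, v in edges:
            # An edge with a previously chosen endpoint is already
            # covered and hence no longer in E'.
            if u not in cover and v not in cover:
                cover.add(u)   # select uv and add both endpoints
                cover.add(v)
        return cover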

Let us contemplate the argument we used to relate C and C∗. The set of edges uv selected by the algorithm is a matching, that is, a subset of the edges so that no two share a vertex. The size of the minimum vertex cover is at least the size of the largest possible matching. The algorithm finds a matching, and since it picks two vertices per edge, we are guaranteed at most twice as many vertices as needed. This pattern of bounding C∗ by the size of another quantity (in this case the size of the largest matching) is common in the analysis of approximation algorithms. Incidentally, for bipartite graphs, the size of the largest matching is equal to the size of the smallest vertex cover. Furthermore, there is a polynomial-time algorithm for computing them.

Traveling salesman. Second, we consider the traveling salesman problem, which is formulated for a complete graph G = (V, E) with a positive integer cost function c : E → Z+. A tour in this graph is a Hamiltonian cycle, and the problem is finding a tour, A, with minimum total cost, c(A) = ∑_{uv∈A} c(uv). Let us first assume that the cost function satisfies the triangle inequality, c(uw) ≤ c(uv) + c(vw) for all u, v, w ∈ V. It can be shown that the problem of finding the shortest tour remains NP-complete even if we restrict it to weighted graphs that satisfy this inequality. We formulate an algorithm based on the observation that the cost of every tour is at least the cost of the minimum spanning tree, C∗ ≥ c(T).

    1  Construct the minimum spanning tree T of G.
    2  Return the preorder sequence of vertices in T.

Using Prim's algorithm for the minimum spanning tree, the running time is O(n²). Figure 114 illustrates the algorithm.

Figure 114: The solid minimum spanning tree, the dotted traversal using each edge of the tree twice, and the solid tour obtained by taking short-cuts.

The preorder sequence is only defined if we have a root and the neighbors of each vertex are ordered, but we may choose both arbitrarily.


The cost of the returned tour is at most twice the cost of the minimum spanning tree. To see this, consider traversing each edge of the minimum spanning tree twice, once in each direction. Whenever a vertex is visited more than once, we take the direct edge connecting the two neighbors of the second copy as a short-cut. By the triangle inequality, this substitution can only decrease the overall cost of the traversal. It follows that C ≤ 2c(T) ≤ 2C∗.

The triangle inequality is essential in finding a constant approximation. Indeed, without it we can construct instances of the problem for which finding a constant approximation is NP-hard. To see this, transform an unweighted graph G′ = (V′, E′) to the complete weighted graph G = (V, E) with

    c(uv) = 1 if uv ∈ E′, and c(uv) = ̺n + 1 otherwise.


Any ̺-approximation algorithm must return the Hamiltonian cycle of G′, if there is one: such a cycle is a tour of cost n, while every tour that uses even one of the heavy edges costs more than ̺n. The approximation algorithm could therefore be used to decide whether or not G′ is hamiltonian.
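A minimal sketch of the two steps of the tour construction, assuming the complete weighted graph is given as a symmetric dictionary of dictionaries with w[u][v] the weight of edge uv (the representation and the names are ours; the O(n²) version of Prim's algorithm quoted above needs extra bookkeeping that we omit):

    def mst_preorder_tour(w):
        vertices = list(w)
        root = vertices[0]
        in_tree = {root}
        children = {u: [] for u in vertices}
        # Step 1, Prim: repeatedly add the cheapest edge leaving the tree.
        while len(in_tree) < len(vertices):
            u, v = min(((a, b) for a in in_tree for b in w[a]
                        if b not in in_tree),
                       key=lambda e: w[e[0]][e[1]])
            children[u].append(v)
            in_tree.add(v)
        # Step 2: the preorder sequence of the rooted tree is the tour.
        tour, stack = [], [root]
        while stack:
            u = stack.pop()
            tour.append(u)
            stack.extend(reversed(children[u]))
        return tour

The short-cuts of the analysis are implicit in the preorder sequence: whenever the traversal of the tree would revisit a vertex, the sequence jumps directly to the next unvisited one.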

Set cover. Third, we consider the problem of covering a set X with sets chosen from a set system F. We assume that X is the union of the sets in the system, X = ∪F. More precisely, we are looking for a smallest subsystem F′ ⊆ F with X = ∪F′. The cost of this subsystem is the number of sets it contains, |F′|. See Figure 115 for an illustration of the problem.

Figure 115: The set X of twelve dots can be covered with four of the five sets in the system.

The vertex cover problem is a special case: X = E and F contains all subsets of edges incident to a common vertex. It is special because each element (edge) belongs to exactly two sets. Since we no longer have a bound on the number of sets containing a single element, it is not surprising that the algorithm for vertex covers does not extend to a constant-approximation algorithm for set covers. Instead, we consider the following greedy approach, which selects, at each step, the set containing the maximum number of yet uncovered elements; a sketch in code follows at the end of the section.

    F′ = ∅; X′ = X;
    while X′ ≠ ∅ do
      select S ∈ F maximizing |S ∩ X′|;
      F′ = F′ ∪ {S}; X′ = X′ − S
    endwhile.

Using a sparse matrix representation of the set system (similar to an adjacency list representation of a graph), we can run the algorithm in time proportional to the total size of the sets in the system, n = ∑_{S∈F} |S|. We omit the details.

Analysis. More interesting than the running time is the analysis of the approximation ratio the greedy algorithm achieves. It is convenient to have short notation for the d-th harmonic number, H_d = ∑_{i=1}^{d} 1/i for d ≥ 0. Recall that H_d ≤ 1 + ln d for d ≥ 1. Let the size of the largest set in the system be m = max{|S| | S ∈ F}.

CLAIM. The greedy method is an H_m-approximation algorithm for the set cover problem.

PROOF. For each set S selected by the algorithm, we distribute $1 over the |S ∩ X′| elements covered for the first time. Let c_x be the cost allocated this way to x ∈ X. We have |F′| = ∑_{x∈X} c_x. If x is covered for the first time by the i-th selected set, S_i, then

    c_x = 1 / |S_i − (S_1 ∪ . . . ∪ S_{i−1})|.

We have |F′| ≤ ∑_{S∈F∗} ∑_{x∈S} c_x because the optimal cover, F∗, contains each element x at least once. We will prove shortly that ∑_{x∈S} c_x ≤ H_{|S|} for every set S ∈ F. It follows that

    |F′| ≤ ∑_{S∈F∗} H_{|S|} ≤ H_m · |F∗|,

as claimed.

For m = 3, we get ̺ = H_3 = 11/6. This implies that for graphs with vertex-degrees at most 3, the greedy algorithm guarantees a vertex cover of size at most 11/6 times the optimum, which is better than the ratio 2 guaranteed by our first algorithm. We still need to prove that the sum of the costs c_x over the elements of a set S in the system is bounded from above by H_{|S|}.


Let u_i be the number of elements in S that are not covered by the first i selected sets, u_i = |S − (S_1 ∪ . . . ∪ S_i)|, and observe that the numbers do not increase. Let u_{k−1} be the last non-zero number in the sequence, so |S| = u_0 ≥ . . . ≥ u_{k−1} > u_k = 0. Since u_{i−1} − u_i is the number of elements in S covered for the first time by S_i, we have

    ∑_{x∈S} c_x = ∑_{i=1}^{k} (u_{i−1} − u_i) / |S_i − (S_1 ∪ . . . ∪ S_{i−1})|.

We also have u_{i−1} ≤ |S_i − (S_1 ∪ . . . ∪ S_{i−1})|, for all i ≤ k, because of the greedy choice of S_i. If this were not the case, the algorithm would have chosen S instead of S_i in the construction of F′. The problem thus reduces to bounding the sum of the ratios (u_{i−1} − u_i)/u_{i−1}. It is not difficult to see that this sum can be at least logarithmic in the size of S. Indeed, if we choose u_i about half the size of u_{i−1}, for all i ≥ 1, then we have logarithmically many terms, each roughly 1/2. We use a sequence of simple arithmetic manipulations to prove that this lower bound is asymptotically tight:

    ∑_{x∈S} c_x ≤ ∑_{i=1}^{k} (u_{i−1} − u_i) / u_{i−1}
                = ∑_{i=1}^{k} ∑_{j=u_i+1}^{u_{i−1}} 1/u_{i−1}.

We now replace the denominator by j ≤ u_{i−1} to form a telescoping series of harmonic numbers and get

    ∑_{x∈S} c_x ≤ ∑_{i=1}^{k} ∑_{j=u_i+1}^{u_{i−1}} 1/j
                = ∑_{i=1}^{k} ( ∑_{j=1}^{u_{i−1}} 1/j − ∑_{j=1}^{u_i} 1/j )
                = ∑_{i=1}^{k} (H_{u_{i−1}} − H_{u_i}).

This is equal to H_{u_0} − H_{u_k} = H_{|S|}, which fills the gap left in the analysis of the greedy algorithm.
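Finally, the promised sketch of the greedy cover in code, with X a set and F a list of sets (the straightforward version; only the sparse-matrix implementation sketched earlier achieves the stated running time):

    def greedy_set_cover(X, F):
        uncovered = set(X)
        cover = []
        while uncovered:
            # Select the set covering the most yet-uncovered elements.
            best = max(F, key=lambda S: len(S & uncovered))
            if not (best & uncovered):
                raise ValueError("the sets in F do not cover X")
            cover.append(best)
            uncovered -= best
        return cover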


Seventh Homework Assignment
The purpose of this assignment is to help you prepare for the final exam. Solutions will neither be graded nor even collected.

Problem 1. (20 = 5 + 15 points). Consider the class of satisfiable boolean formulas in conjunctive normal form in which each clause contains two literals, 2-SAT = {ϕ ∈ SAT | ϕ is in 2-CNF}.
(a) Is 2-SAT ∈ NP?
(b) Is there a polynomial-time algorithm for deciding whether or not a boolean formula in 2-CNF is satisfiable? If your answer is yes, then describe and analyze your algorithm. If your answer is no, then show that 2-SAT ∈ NPC.

Problem 2. (20 points). Let A be a finite set and f a function that maps every a ∈ A to a positive integer f(a). The PARTITION problem asks whether or not there is a subset B ⊆ A such that

    ∑_{b∈B} f(b) = ∑_{a∈A−B} f(a).

We have learned that the PARTITION problem is NP-complete. Given positive integers j and k, the SUM OF SQUARES problem asks whether or not A can be partitioned into j disjoint subsets, A = B1 ∪ B2 ∪ . . . ∪ Bj, such that

    ∑_{i=1}^{j} ( ∑_{a∈Bi} f(a) )² ≤ k.

Prove that the SUM OF SQUARES problem is NP-complete.

Problem 3. (20 = 10 + 10 points). Let G be an undirected graph. A path in G is simple if it contains each vertex at most once. Given two vertices u, v and a positive integer k, the LONGEST PATH problem asks whether or not there is a simple path connecting u and v whose length is k or longer.
(a) Give a polynomial-time algorithm for the LONGEST PATH problem or show that it is NP-hard.
(b) Revisit (a) under the assumption that G is directed and acyclic.

Problem 4. (20 = 10 + 10 points). Let A ⊆ 2^V be an abstract simplicial complex over the finite set V and let k be a positive integer.
(a) Is it NP-hard to decide whether A has k or more disjoint simplices?
(b) Is it NP-hard to decide whether A has k or fewer simplices whose union is V?

Problem 5. (20 points). Let G = (V, E) be an undirected, bipartite graph, and recall that there is a polynomial-time algorithm for constructing a maximum matching. We are interested in computing a minimum set of matchings such that every edge of the graph is a member of at least one of the selected matchings. Give a polynomial-time algorithm that constructs an O(log n)-approximation for this problem.

