Problem 1.

First, generate $N$ random points uniformly distributed within the unit square $[0,1] \times[0,1]$. These points will form the vertex set $V$ of your graph $G=(V, E)$. Then randomly generate edges so that the probability of an edge connecting two vertices $v$ and $v^{\prime}$ is given as:
$$P\left(\left\{v, v^{\prime}\right\} \in E\right)= \begin{cases}H, & \text { when } \rho\left(v, v^{\prime}\right)<D \\ H e^{A\left(D-\rho\left(v, v^{\prime}\right)\right)} & \text { when } \rho\left(v, v^{\prime}\right) \geq D\end{cases}$$
Here $\rho\left(v, v^{\prime}\right)$ denotes the Euclidean distance between the points $v$ and $v^{\prime}$. $0 \leq H \leq 1, A>0$, and $D>0$ are constants.

Submit the printout of your algorithm and the graph plots with parameters set to $N=1000, H=0.8, A=25$ and different values of $D=0, D=0.1, D=0.5, D=1$. You may use gnuplot for the visualization.

Implement Dijkstra algorithm from Section $6.4$ (pp.189-202) with $k$-ary heap (see Problem $6.16$ for the details). You will need to implement your own $k$-ary heap functionality (see Problem 5.23). The input graphs for the Dijkstra algorithm will come from the first part of this homework: you will use undirected graphs with edge lengths generated randomly so that they are distributed uniformly in $[0,1]$.

Submit a printout of your program together with two of example runs on some simple graphs: assume that the first vertex is the starting one, then the result will be the distance assignment from that vertex.

Analyze the asymptotic runtime of your algorithm both theoretically and experimentally. Plot the runtime of the algorithm varying the number of vertices and edges in your graph (one way to leave the number of edges approximately constant is to double the number of vertices and at the same time divide the constant $H$ by four in your graph generation procedure). Change the value of the heap parameter $k$ between 2 and $k^{\prime}=\max (2,\lfloor a / n\rfloor)$ (do it gradually in ten increments of the size $\left\lfloor\left(k^{\prime}-2\right) / 10\right\rfloor .$ Do you see the performance boost as expected from the description in Problem 6.16? Plot the resulting performance plot (runtime against $k$ ) for a graph with the number of vertices greater than 5,000 and the number of edges greater than 100,000 . Try to match your asymptotic analysis and the observed performance.

Problem 2.

$\mathrm{D}(5 \mathrm{pts})$ Let $C_{4}=\max \left(C_{2}, C_{3}\right)$. Prove by induction that $T^{\prime}(n, n) \leq C_{4}(n+1)$.
$\mathrm{E}(5 \mathrm{pts})$ Prove by induction that
$$T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c} n+1 \\ k \end{array}\right)$$
$\mathrm{F}(5 \mathrm{pts})$ Prove that for $k \leq\lfloor n / 2\rfloor$ we have
$$T^{\prime}(n, k) \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)$$
Conclusion Thus, we have proven that
$$T(n, k) \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)-C_{1} \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)$$
that is the time per one generated $k$-subset is constant (when $k \leq$ $\lfloor n / 2\rfloor) .$
EXTRA(10pts) What happens when $\lfloor n / 2\rfloor<k<n ?$ Find an upper bound on the time per one generated $k$-subset. Is it $O(1) ? O(n) ? O(k)$ ?

Proof .

$\mathrm{D}(5 \mathrm{pts})$ Let $C_{4}=\max \left(C_{2}, C_{3}\right)$. Prove by induction that $T^{\prime}(n, n) \leq C_{4}(n+1)$.
Solution: Base case is $n=0$, for which we get $T^{\prime}(0,0)=C_{3} \leq C_{4}$. Suppose that $T^{\prime}(n, n) \leq C_{4}(n+1)$. From recurrence we get $T^{\prime}(n+$ $1, n+1)=T^{\prime}(n, n)+T^{\prime}(n, n+1)=T^{\prime}(n, n)+C_{2} \leq C_{4}(n+1)+C_{2} \leq$
$C_{4}(n+2)$, which is exactly what we need. $\mathrm{E}(5 \mathrm{pts})$ Prove by induction that
$$T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c} n+1 \\ k \end{array}\right) .$$
Solution: Let’s be careful here. The statement $M(n)$ will be: for all $k$ such that $0 \leq k \leq n+1$ we have $T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c}n+1 \\ k\end{array}\right)$.
Base case is $M(0)$, so that $n=0$ for which we get
$$T^{\prime}(0,0) \leq C_{4}=C_{4}\left(\begin{array}{l} 1 \\ 0 \end{array}\right) .$$
Induction step assumes that $M(n)$ is true. Let’s prove $M(n+$
1) based on that. For $k=0$ we have $T^{\prime}(n, 0)=C_{3} \leq C_{4}\left(\begin{array}{c}n+2 \\ 0\end{array}\right)$.
Similarly, for $k=n+2$ we have $T^{\prime}(n+1, n+2)=C_{2} \leq C_{4}\left(\begin{array}{c}n+2 \\ n+2\end{array}\right)$ as
required. Consider $0<k<n+2$ now. We have:
$T^{\prime}(n+1, k)=T^{\prime}(n, k-1)+T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c}n+1 \\ k-1\end{array}\right)+C_{4}\left(\begin{array}{c}n+1 \\ k\end{array}\right)=C_{4}\left(\begin{array}{c}n+2 \\ k\end{array}\right)$
Here we used the inductive assumption and the binomial coefficients property. Thus, $M(n+1)$ is proven. $\mathrm{F}$ (5pts) Prove that for $k \leq\lfloor n / 2\rfloor$ we have
$$T^{\prime}(n, k) \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)$$
Solution: We know that
$$T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c} n+1 \\ k \end{array}\right) .$$
But $\left(\begin{array}{c}n+1 \\ k\end{array}\right)=\left(\begin{array}{c}n \\ k\end{array}\right)+\left(\begin{array}{c}n \\ k-1\end{array}\right)$. Moreover, $\left(\begin{array}{c}n \\ k-1\end{array}\right) \leq\left(\begin{array}{l}n \\ k\end{array}\right)$ when $k \leq\lfloor n / 2\rfloor$.
It follows that
$$T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c} n+1 \\ k \end{array}\right)=C_{4}\left(\left(\begin{array}{c} n \\ k \end{array}\right)+\left(\begin{array}{c} n \\ k-1 \end{array}\right)\right) \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right) .$$

$$T(n, k) \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)-C_{1} \leq 2 C_{4}\left(\begin{array}{l} n \\ k \end{array}\right)$$
that is the time per one generated $k$-subset is constant (when $k \leq$ $\lfloor n / 2\rfloor)$
EXTRA(10pts) What happens when $\lfloor n / 2\rfloor<k<n ?$ Find an upper bound on the time per one generated $k$-subset. Is it $O(1) ? O(n) ? O(k) ?$

Solution: We know that
$$T^{\prime}(n, k) \leq C_{4}\left(\begin{array}{c} n+1 \\ k \end{array}\right) .$$
We can use the fact that
$$\left(\begin{array}{c} n+1 \\ k \end{array}\right)=\frac{n+1}{n+1-k}\left(\begin{array}{l} n \\ k \end{array}\right)$$
The factor $(n+1) /(n+1-k)$ is the upper bound asymptotics of runtime per single generated subset. We can say that $(n+1) /(n+$ $1-k)=O(n)$