# RA.PS2

My solutions for Randomized Algorithms, Fall 2015 semester, Problem Set 2. Please keep your academic integrity.

# Randomized Algorithms: Problem Set 2

Himemiya

## 1 Problem 1

### 1.1

Karger's paper [1] suggests that we should use Karger's algorithm to compute an $\alpha$-min-cut directly, rather than calculate the probability that Karger's algorithm returns an $\alpha$-min-cut exactly.

Our algorithm runs as follows: before and after each contraction, record a vertex of minimum degree (choosing an arbitrary one if multiple vertices attain the minimum) together with the edges incident to it; finally, return the recorded edge set of the vertex whose degree was minimum over all $n-1$ records.
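
A minimal Python sketch of this variant (the multigraph representation and helper names are our own, not prescribed by the problem):

```python
import random
from collections import defaultdict

def karger_min_degree_cut(edges):
    """Karger's contraction, additionally recording the minimum vertex
    degree (= size of a star cut) before and after every contraction;
    the smallest record is the alpha-min-cut candidate."""
    # multigraph: adj[u][v] = multiplicity of edge {u, v}
    adj = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        adj[u][v] += 1
        adj[v][u] += 1
    best = min(sum(adj[u].values()) for u in adj)  # record for the start graph
    while len(adj) > 2:
        # contract a uniformly random remaining edge {u, v}
        pool = [(u, v) for u in adj for v, m in adj[u].items() for _ in range(m)]
        u, v = random.choice(pool)
        for w, m in list(adj[v].items()):  # merge v into u, dropping self-loops
            if w != u:
                adj[u][w] += m
                adj[w][u] += m
            del adj[w][v]
        del adj[v]
        best = min(best, min(sum(adj[x].values()) for x in adj))
    return best  # size of the best star cut observed (its edges could be recorded too)

# toy usage: a 4-cycle has min cut 2
print(karger_min_degree_cut([(0, 1), (1, 2), (2, 3), (3, 0)]))
```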

Let $c$ be the size of the min-cut. Without loss of generality, we can assume the minimum degree of every intermediate graph during the contraction procedure is at least $\alpha c$ (otherwise an $\alpha$-min-cut is returned trivially). Since $\Pr[\text{Karger's algorithm returns a min-cut}]\leq\Pr[\text{Karger's algorithm returns an }\alpha\text{-min-cut}]$, we calculate the former to lower bound the latter. Every vertex of the current $n$-vertex graph has degree at least $\alpha c$, thus the graph has at least $\alpha cn/2$ edges, and the probability that a min-cut edge is contracted is at most $c/(\alpha cn/2)=2/(\alpha n)$. Therefore

$$\begin{aligned}
\Pr[\text{Karger's algorithm returns a min-cut}]&\geq\prod_{i=3}^{n}\left(1-\frac{2}{\alpha i}\right)\\
&=\exp\left(\sum_{i=3}^{n}\ln\left(1-\frac{2}{\alpha i}\right)\right)\\
&=\Omega\left(\exp\left(-\sum_{i=3}^{n}\frac{2}{\alpha i}\right)\right)\\
&=\Omega\left(\exp\left(-\frac{2}{\alpha}\ln n\right)\right)\\
&=\Omega(n^{-2/\alpha}).
\end{aligned}$$
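
As a quick numerical sanity check of this bound (not part of the argument itself), the following snippet compares the product against $n^{-2/\alpha}$:

```python
def survival_lower_bound(n, alpha):
    """prod_{i=3}^{n} (1 - 2/(alpha*i)), the bound derived above."""
    p = 1.0
    for i in range(3, n + 1):
        p *= 1 - 2 / (alpha * i)
    return p

for n in (100, 1000, 10000):
    alpha = 2.0
    # the ratio stays bounded away from 0, as Omega(n^{-2/alpha}) predicts
    print(n, survival_lower_bound(n, alpha) / n ** (-2 / alpha))
```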

### 1.2

For any $\alpha$-min-cut, the probability that the $i$-th contraction contracts an edge of the cut is at most $2\alpha/(n-i+1)$. Hence the probability that all the edges of the cut survive the first $m$ contractions is at least

$$\prod_{i=1}^{m}\left(1-\frac{2\alpha}{n-i+1}\right)=\exp\left(\sum_{i=1}^{m}\ln\left(1-\frac{2\alpha}{n-i+1}\right)\right).$$

Notice that the $\ln$ factor becomes undefined once $n-i+1\leq 2\alpha$, i.e. once the factor $1-\frac{2\alpha}{n-i+1}$ is no longer positive. This implies we should stop contracting when $2\alpha$ vertices remain. The probability that all the edges of the cut survive until that moment is at least

$$\prod_{i=2\alpha+1}^{n}\left(1-\frac{2\alpha}{i}\right)=\binom{n}{2\alpha}^{-1}.$$

We then partition the remaining vertices into two parts uniformly at random. Since there are $2^{2\alpha}/2$ ways to partition the vertices, the probability that the cut is returned by this step is $2^{1-2\alpha}$. In conclusion, any $\alpha$-min-cut is returned with probability at least $2^{1-2\alpha}\binom{n}{2\alpha}^{-1}$, so there are at most $O(2^{2\alpha-1}\binom{n}{2\alpha})$ $\alpha$-min-cuts.
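
The telescoping identity above is easy to verify exactly with rationals (a check we add for illustration):

```python
from fractions import Fraction
from math import comb, prod

def survival(n, a):
    # prod_{i=2a+1}^{n} (1 - 2a/i), computed exactly
    return prod(Fraction(i - 2 * a, i) for i in range(2 * a + 1, n + 1))

assert survival(20, 3) == Fraction(1, comb(20, 6))  # equals binom(n, 2a)^{-1}
```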

## 2 Problem 2

### 2.1

Let $X_{1},X_{2},\cdots,X_{n}$ be the coin-flipping results. For any three of them, $X_{i},X_{j},X_{k}$, let $Y_{a}=X_{i}\oplus X_{j}$, $Y_{b}=X_{j}\oplus X_{k}$, $Y_{c}=X_{i}\oplus X_{k}$. Then

$$\begin{aligned}
&\Pr[Y_{a}=0]=\Pr[Y_{b}=0]=\Pr[Y_{c}=0]=\frac{1}{2},\\
&\Pr[Y_{a}=0\land Y_{b}=0\land Y_{c}=0]=\Pr[X_{i}=X_{j}=X_{k}]=\frac{2}{2^{3}}\neq\frac{1}{2^{3}}.
\end{aligned}$$

Therefore $Y_{a},Y_{b},Y_{c}$ (and hence the $Y_{i}$ collectively) are not mutually independent.
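
This counterexample can be checked by exhaustive enumeration (a small verification sketch of our own):

```python
from itertools import product

# all 8 outcomes of three fair coins X_i, X_j, X_k
outcomes = list(product((0, 1), repeat=3))
triples = [(xi ^ xj, xj ^ xk, xi ^ xk) for xi, xj, xk in outcomes]

p = sum(t == (0, 0, 0) for t in triples) / 8
print(p)  # 0.25 = 2/8, not 1/8: Y_a, Y_b, Y_c are not mutually independent
```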

### 2.2

Let $Y_{i}=X_{a}\oplus X_{b}$ and $Y_{j}=X_{c}\oplus X_{d}$. If $\{a,b\}\cap\{c,d\}=\emptyset$, then $Y_{i}$ is trivially independent of $Y_{j}$. Otherwise the two pairs share exactly one index; assuming $a=d$ w.l.o.g.,

$$\begin{aligned}
&\Pr[Y_{i}=0]=\Pr[X_{a}=X_{b}]=\frac{1}{2}=\Pr[X_{a}=X_{c}]=\Pr[Y_{j}=0],\\
&\Pr[Y_{i}=0\land Y_{j}=0]=\Pr[X_{a}=X_{b}=X_{c}]=\frac{2}{2^{3}}=\frac{1}{4}=\Pr[Y_{i}=0]\times\Pr[Y_{j}=0].
\end{aligned}$$

Similarly we can prove $\forall p,q\in\{0,1\},\Pr[Y_{i}=p\land Y_{j}=q]=\Pr[Y_{i}=p]\times\Pr[Y_{j}=q]$, therefore the $Y_{i}$ are pairwise independent. Note

$$\begin{aligned}
\mathbf{E}[Y_{i}Y_{j}]&=\Pr[Y_{i}=Y_{j}=1],\\
\mathbf{E}[Y_{i}]=\mathbf{E}[Y_{j}]&=\Pr[Y_{i}=1]=\Pr[Y_{j}=1].
\end{aligned}$$

Since $Y_{i}$ and $Y_{j}$ are independent, $\Pr[Y_{i}=Y_{j}=1]=\Pr[Y_{i}=1]\times\Pr[Y_{j}=1]$, i.e. $\mathbf{E}[Y_{i}Y_{j}]=\mathbf{E}[Y_{i}]\mathbf{E}[Y_{j}]$.
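
Pairwise independence of all the XOR variables can likewise be verified exhaustively for a small number of coins (an illustrative check, with $n=4$ chosen arbitrarily):

```python
from itertools import combinations, product

n = 4  # number of fair coins; all 2^n outcomes are enumerated
pairs = list(combinations(range(n), 2))

for (a, b), (c, d) in combinations(pairs, 2):
    for p, q in product((0, 1), repeat=2):
        joint = sum((x[a] ^ x[b]) == p and (x[c] ^ x[d]) == q
                    for x in product((0, 1), repeat=n)) / 2 ** n
        assert joint == 0.25  # = Pr[Y_i = p] * Pr[Y_j = q] = 1/2 * 1/2
```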

### 2.3

$$\begin{aligned}
\mathbf{Var}[Y]&=\sum_{i=1}^{m}\mathbf{Var}[Y_{i}]+\sum_{i\neq j}\mathbf{Cov}(Y_{i},Y_{j})\\
&=\sum_{i=1}^{m}\left(\mathbf{E}[Y_{i}^{2}]-\mathbf{E}[Y_{i}]^{2}\right)+0&&(\text{pairwise independence: }\mathbf{E}[Y_{i}Y_{j}]=\mathbf{E}[Y_{i}]\mathbf{E}[Y_{j}])\\
&=\sum_{i=1}^{m}\left(\frac{1}{2}-\frac{1}{4}\right)\\
&=m/4.
\end{aligned}$$
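
The $m/4$ value can be confirmed by the same exhaustive enumeration (a small check of ours, with $n=5$ coins so that $m=\binom{5}{2}=10$):

```python
from itertools import combinations, product
from statistics import pvariance

n = 5
m = n * (n - 1) // 2  # number of XOR variables Y_i
samples = [sum(x[a] ^ x[b] for a, b in combinations(range(n), 2))
           for x in product((0, 1), repeat=n)]
print(pvariance(samples), m / 4)  # both print 2.5
```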

### 2.4

By Chebyshev's inequality,

$$\forall t>0,\quad\Pr[|X-\mathbf{E}[X]|\geq t]\leq\frac{\mathbf{Var}[X]}{t^{2}}.$$

Setting $t=n$ and $X=Y$, with $m=\binom{n}{2}$ pairwise XOR variables,

$$\begin{aligned}
\Pr[|Y-\mathbf{E}[Y]|\geq n]&\leq\frac{\mathbf{Var}[Y]}{n^{2}}\\
&=\frac{m/4}{n^{2}}\\
&=\frac{n-1}{8n}.
\end{aligned}$$

## 3 Problem 3

### 3.1

Let random variable $X_{i}$ denote the bin the $i$-th ball is hashed into, let indicator random variable $Y_{i,j}$ denote whether the $i$-th and $j$-th balls are hashed into the same bin, and let $Y=\sum_{i<j}Y_{i,j}$ be the number of collisions. Then, by the linearity of expectation, $\mathbf{E}[Y]=\sum_{i<j}\mathbf{E}[Y_{i,j}]$. Since the hash function is pairwise independent, $\mathbf{E}[Y_{i,j}]=1/n$, so

$$\mathbf{E}[Y]=\binom{n}{2}\Big/n\leq n/2.$$

If the maximum load exceeds $C\sqrt{n}$, the fullest bin alone contributes at least $\binom{C\sqrt{n}}{2}\geq C^{2}n/4$ collisions (using $\binom{\ell}{2}\geq\ell^{2}/4$ for $\ell\geq 2$). Therefore

$$\begin{aligned}
\Pr[\text{the maximum load is larger than }C\sqrt{n}]&\leq\Pr\left[Y\geq\frac{C^{2}n}{4}\right]\\
&\leq\frac{4\,\mathbf{E}[Y]}{C^{2}n}&&(\text{Markov's inequality})\\
&\leq\frac{2}{C^{2}}.
\end{aligned}$$

Letting $C=\sqrt{200}$, we have $\Pr[\text{the maximum load is larger than }\sqrt{200n}]\leq 0.01$, i.e. $\Pr[\text{the maximum load is }O(\sqrt{n})]\geq 0.99$.
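
To see the bound in action, one can simulate with a concrete pairwise-independent family; we use $h(x)=((ax+b)\bmod p)\bmod n$ here, which is our own choice of family for illustration (the final $\bmod\, n$ makes it only approximately uniform, which is fine for a sketch):

```python
import random
from collections import Counter

P = 2_147_483_647  # a prime larger than the key universe

def max_load(n_balls, n_bins):
    # h(x) = ((a*x + b) mod P) mod n_bins is (approximately) pairwise independent
    a, b = random.randrange(1, P), random.randrange(P)
    loads = Counter(((a * x + b) % P) % n_bins for x in range(n_balls))
    return max(loads.values())

n = 10_000
print(max(max_load(n, n) for _ in range(20)), int((200 * n) ** 0.5))
# observed maximum load vs. the sqrt(200 n) bound derived above
```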

To generalize this argument, define an indicator random variable $Y_{I}$ for each $I\in\binom{[n]}{k}$ denoting whether the $k$ balls in $I$ all land in the same bin, and let $Y=\sum_{I\in\binom{[n]}{k}}Y_{I}$ be the number of $k$-collisions. Assuming now that the hash function is $k$-wise independent, $\mathbf{E}[Y_{I}]=1/n^{k-1}$, and by the linearity of expectation, $\mathbf{E}[Y]=\sum_{I\in\binom{[n]}{k}}\mathbf{E}[Y_{I}]=\binom{n}{k}/n^{k-1}$.

$$\begin{aligned}
\Pr[\text{the maximum load is larger than }m]&\leq\Pr\left[Y\geq\binom{m}{k}\right]\\
&\leq\frac{\mathbf{E}[Y]}{\binom{m}{k}}&&(\text{Markov's inequality})\\
&=\frac{\binom{n}{k}}{n^{k-1}\binom{m}{k}}\\
&\leq\frac{(ne/k)^{k}}{n^{k-1}(m/k)^{k}}&&(\text{Stirling's approximation})\\
&=n\left(\frac{e}{m}\right)^{k}.
\end{aligned}$$

Setting $m=e\sqrt[k]{100n}$, we have $\Pr[\text{the maximum load is larger than }e\sqrt[k]{100n}]\leq 0.01$, i.e. $\Pr[\text{the maximum load is }O(\sqrt[k]{n})]\geq 0.99$.

### 3.2

The $O(\sqrt{n})$ bound cannot be improved if all we assume is $2$-universality and we consider only the number of collisions, since then the only quantity under our control is the expected number of collisions.

Setting the collision count aside, the only other information available is the marginal distribution of the load of an arbitrary single bin, which follows from Chebyshev's inequality; we still cannot characterize the correlations of the loads between different bins. Incidentally, a union bound over the bins yields the same $O(\sqrt{n})$ bound.

Therefore I believe the bound cannot be improved if the only thing we assume about the hash function is $2$-universality.

## 4 Problem 4

### 4.1

We assign each vertex to $S$ or $\bar{S}$ uniformly and independently at random. For any edge $(i,j)$, the probability that $i$ is assigned to $S$ and $j$ is assigned to $\bar{S}$, i.e. the probability that $w_{ij}$ is counted in the solution, is $1/4$. Then, by the linearity of expectation,

$$\mathbf{E}\left[\sum_{\substack{(u,v)\in E\\ u\in S,\,v\notin S}}w_{uv}\right]=\sum_{(u,v)\in E}w_{uv}\Pr[u\in S\land v\notin S]=\frac{1}{4}\sum_{(u,v)\in E}w_{uv}\geq\frac{1}{4}\mathrm{OPT}.$$

Therefore our random sampling algorithm is a $\frac{1}{4}$-approximation.
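
A sketch of this sampling scheme (the graph encoding and names are illustrative):

```python
import random

def random_dicut_value(n, edges):
    """One round of random sampling: w_{uv} is counted when u lands in S
    and v lands outside S."""
    S = {v for v in range(n) if random.random() < 0.5}
    return sum(w for u, v, w in edges if u in S and v not in S)

# toy instance: total weight 6, so the expectation is 6/4 = 1.5 >= OPT/4
edges = [(0, 1, 3.0), (1, 2, 2.0), (2, 0, 1.0)]
print(sum(random_dicut_value(3, edges) for _ in range(10_000)) / 10_000)
```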

### 4.2

Consider $x_{i}$ as a 0-1 indicator of whether vertex $i$ is assigned to $S$. Since $z_{ij}\leq x_{i}$, $z_{ij}\leq 1-x_{j}$, and $0\leq z_{ij}\leq 1$ for every $(i,j)\in E$, and maximization pushes each $z_{ij}$ as high as possible, we get $z_{ij}=\min\{x_{i},1-x_{j}\}\in\{0,1\}$, with $z_{ij}=1$ if and only if $x_{i}=1$ and $x_{j}=0$. Therefore $z_{ij}$ is a 0-1 indicator of whether edge $(i,j)$ goes from $S$ to $\bar{S}$, and maximizing the objective function achieves the optimal value.

### 4.3

To construct an LP relaxation of the above integer programming, the constraint $x_{i}\in\{0,1\},\forall i\in V$ is relaxed to $0\leq x_{i}\leq 1,\forall i\in V$. Let $x_{i}^{*}$ and $z_{ij}^{*}$ be the fractional optimal solutions to the linear programming. Observe that

• $x_{i}^{*}\in[0,1],\forall i\in V$;

• $z_{ij}^{*}=\min\{x_{i}^{*},1-x_{j}^{*}\}$.

Notice that the function $z(x,y)=\min\{x,1-y\}$ is symmetric about the line $x+y=1$, so we try to make our function $g(x,y)=f(x)(1-f(y))=(ax+b)(1-ay-b)$ have this property too (each vertex $i$ joins $S$ with probability $f(x_{i}^{*})=ax_{i}^{*}+b$, so $g(x_{i}^{*},x_{j}^{*})$ is the probability that edge $(i,j)$ goes from $S$ to $\bar{S}$). This requires $g(x,y)=g(1-y,1-x)$, i.e.

$$(ax+b)(1-ay-b)=(a(1-y)+b)(1-a(1-x)-b).$$

Then we have $a=1-2b$, and

$$g(x,y)=b+x-3bx-by-xy+2b^{2}x+2b^{2}y-b^{2}-4b^{2}xy+4bxy.$$

Due to the symmetry, we focus on the area $x\in(0,1),y\in(0,1-x)$ only. Note that $z(x,y)=x$ in this area. Our goal, i.e. the approximation ratio function, becomes

$$\max_{0\leq b\leq 1/2}\ \min_{\substack{0<x<1\\ 0<y<1-x}}\frac{g(x,y)}{z(x,y)}=\max_{0\leq b\leq 1/2}\ \min_{\substack{0<x<1\\ 0<y<1-x}}\frac{g(x,y)}{x}.$$

By brute-force enumeration over the value of $b$, we find that when $b=1/4$ the approximation ratio is exactly $1/2$. Therefore we take $a=1/2$, $b=1/4$ and obtain an approximation ratio of $1/2$.
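
Both the brute-force search over $b$ and the final rounding step are easy to reproduce; below is a sketch (grid resolutions are arbitrary choices of ours):

```python
import random

def g(b, x, y):
    # probability that edge (i, j) goes from S to S-bar, with a = 1 - 2b
    a = 1 - 2 * b
    return (a * x + b) * (1 - a * y - b)

# approximate max_b min_{0<x<1, 0<y<1-x} g(x, y) / x on a grid
best = max(
    (min(g(b, x, y) / x
         for x in (i / 100 for i in range(1, 100))
         for y in (j / 100 * (1 - x) for j in range(1, 100))), b)
    for b in (k / 100 for k in range(51))
)
print(best)  # ratio close to 0.5, attained near b = 0.25

def round_lp(x_star, b=0.25):
    """Put vertex i into S with probability f(x_i*) = (1 - 2b) x_i* + b."""
    return {i for i, x in enumerate(x_star) if random.random() < (1 - 2 * b) * x + b}
```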

## 5 Problem 5

### 5.1

Let $x_{i}$ be an indicator of whether $u_{i}\in V$; the constraint $\sum_{i:u_{i}\in S_{j}}x_{i}\geq 1$ for $1\leq j\leq m$ guarantees that every $S_{j}\in S$ intersects $V$. Finally, the objective function $\sum_{i=1}^{n}w_{i}x_{i}$ is minimized to achieve the optimal value.

### 5.2

Relax the constraint $x_{i}\in\{0,1\},1\leq i\leq n$ to $x_{i}\in[0,1],1\leq i\leq n$. Let $x_{i}^{*}$ be the fractional optimal solution of the linear program. We independently choose each element $u_{i}$ at random with probability $x_{i}^{*}$. Consider some subset $S_{j}$ and assume w.l.o.g. that the fractional values of its $k$ elements are $y_{1},y_{2},\cdots,y_{k}$. Then $1-\prod_{i=1}^{k}(1-y_{i})$ is the probability that the subset $S_{j}$ is hit. By the linear programming constraints,

$$\sum_{i=1}^{k}y_{i}\geq 1.$$

Subject to this constraint, the value of $1-\prod_{i=1}^{k}(1-y_{i})$ is minimized when all $y_{i}=1/k$ (by the AM-GM inequality). Thus the probability that $S_{j}$ is hit is

$$1-\prod_{i=1}^{k}(1-y_{i})\geq 1-(1-1/k)^{k}.$$

Note that the R.H.S. is monotonically decreasing in $k\geq 1$, and $\lim_{k\to+\infty}1-(1-1/k)^{k}=(e-1)/e>1/2$. Therefore every subset is hit with probability at least $1/2$ in each round. This implies we may repeat the random choosing procedure $O(\log m)$ times:

$$\Pr\left[\bigvee_{i}S_{i}\text{ was not hit after }\log_{2}(2m)\text{ rounds}\right]\leq m\cdot\left(\frac{1}{2}\right)^{\log_{2}(2m)}=\frac{1}{2}.\qquad(\text{union bound})$$

Therefore, after $\log_{2}(2m)$ rounds of the random choosing procedure, all subsets are hit with probability at least $1/2$. Besides, the expected weight added in each round is $\sum_{i=1}^{n}w_{i}x_{i}^{*}\leq\mathrm{OPT}$ (the LP optimum is at most the integral optimum). In conclusion, our algorithm succeeds with probability at least $1/2$ and returns a hitting set of expected weight $O(\log m)\cdot\mathrm{OPT}$.
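
A sketch of the whole rounding procedure (the toy instance and names are illustrative only):

```python
import random
from math import ceil, log2

def round_hitting_set(x_star, sets, weights):
    """Run ceil(log2(2m)) independent rounds; each round picks element i
    with probability x_i*; return the union, its weight, and whether all
    subsets were hit."""
    chosen = set()
    for _ in range(ceil(log2(2 * len(sets)))):
        chosen |= {i for i, x in enumerate(x_star) if random.random() < x}
    hit_all = all(chosen & s for s in sets)
    return chosen, sum(weights[i] for i in chosen), hit_all

# toy instance with a feasible fractional solution: each set's values sum to 1
sets = [{0, 1}, {1, 2}, {0, 2}]
x_star = [0.5, 0.5, 0.5]
weights = [1.0, 1.0, 1.0]
print(round_hitting_set(x_star, sets, weights))
```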

## References

1. Karger, David R. "Global Min-cuts in RNC, and Other Ramifications of a Simple Min-Cut Algorithm." Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1993.