Commit 1587a3c

How am I supposed to pass this subject?

1 parent a466832 commit 1587a3c

6 files changed: +262 −1 lines

.vscode/settings.json (+3)

```diff
@@ -50,6 +50,7 @@
   "CGHA",
   "cheatsheet",
   "chebyshev's",
+  "chernoff",
   "Ciphertext",
   "ciphertexts",
   "Civita",
@@ -107,6 +108,7 @@
   "GNFS",
   "grayscale",
   "Greibach",
+  "gries",
   "Grzegorz",
   "halftoning",
   "hasse",
@@ -185,6 +187,7 @@
   "Michal",
   "Michał",
   "microkernel",
+  "misra",
   "mkaleta",
   "Mobius",
   "monic",
```

New file (+109)

# approximation algorithms

Algorithms that relax the requirement of finding the optimal solution while preserving reliability and efficiency.

An $\alpha$-approximation algorithm is one that outputs a solution $S$ in polynomial time where

- $\frac{\text{cost}(S)}{\text{cost}(\text{OPT})} \le \alpha$ for minimization problems
- $\frac{\text{profit}(S)}{\text{profit}(\text{OPT})} \ge \alpha$ for maximization problems

The goal is to get $\alpha$ as close to $1$ as possible.

## min-weight vertex cover

The LP relaxation gives some fractional optimum $x^*$. We round by picking the vertex cover $C = \{v \in V : x_v^* \ge \frac{1}{2}\}$.

$C$ is a feasible solution and its weight is at most twice that of the optimal one, so this is a 2-approximation.
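
A minimal sketch of this rounding, assuming a small made-up graph and using scipy's `linprog` for the relaxation:

```python
# A minimal sketch of LP rounding for min-weight vertex cover; the graph and
# weights are made-up, and scipy's linprog solves the relaxation.
from scipy.optimize import linprog

weights = [1.0, 2.0, 1.0, 3.0]            # w_v for vertices 0..3
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

n = len(weights)
# Relaxation: minimize sum_v w_v x_v s.t. x_u + x_v >= 1 per edge, 0 <= x <= 1.
# linprog expects A_ub @ x <= b_ub, so the covering constraints are negated.
A_ub = [[-1.0 if v in e else 0.0 for v in range(n)] for e in edges]
b_ub = [-1.0] * len(edges)
res = linprog(c=weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)

# Round: keep every vertex with fractional value at least 1/2.
C = [v for v in range(n) if res.x[v] >= 0.5]
print(C)  # a feasible cover of weight at most 2 * OPT
```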

## integrality gap

The ratio describing the quality of the LP relaxation. Let $\mathcal I$ be the set of all instances of a problem. Then for a minimization problem the integrality gap is

$$
g = \max_{I \in \mathcal I}\frac{\text{OPT}(I)}{\text{OPT}_{LP}(I)}
$$

## set cover

Given a universe $U = \{e_1, \cdots, e_n\}$, a family of subsets $T = \{S_1, \cdots, S_m\}$, and a cost function $c : T \to \R^+$, find a collection $C \subseteq T$ such that $\bigcup C = U$ that minimizes the cost.

### deterministic

Suppose each element belongs to at most $f$ sets. Then picking the solution $C = \{S_i : x_i^* \ge \frac{1}{f}\}$ from the LP optimum $x^*$ is an $f$-approximation.

### randomized

We repeat the following $d \cdot \ln(n)$ times: for each $i \in [m]$, add set $S_i$ to the solution $C$ with probability $x^*_i$.

The expected cost after $d \cdot \ln(n)$ repetitions is at most $d \cdot \ln(n) \cdot \text{OPT}$.

The output is a feasible solution with probability at least $1 - \frac{1}{n^{d - 1}}$.
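
A minimal sketch of this randomized rounding, assuming the LP optimum $x^*$ is already given (the instance and fractional values below are made-up):

```python
# A minimal sketch of randomized rounding for set cover; x_star is an assumed
# fractional LP solution for a made-up instance.
import math
import random

n = 4                                  # universe {0, 1, 2, 3}
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
x_star = [0.5, 0.5, 0.5, 0.5]          # fractional LP optimum (assumed given)
d = 2

chosen = set()
for _ in range(int(d * math.log(n)) + 1):
    # One rounding pass: include each set independently with probability x*_i.
    for i, prob in enumerate(x_star):
        if random.random() < prob:
            chosen.add(i)

covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
print(sorted(chosen), covered == set(range(n)))  # feasible w.h.p.
```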

## prediction with expert advice

Given $N$ experts which individually advise either $0$ or $1$, we predict an answer. Then the adversary, knowing the experts' advice and our answer, reveals the outcome. The goal is to minimize the number of mistakes relative to the best expert. We consider $T$ trials.

### majority vote

If a perfect expert exists, we can always take the majority vote of those experts that have not yet made a mistake. Each mistake at least halves the pool of candidate experts, so we make at most $\log N$ mistakes.

Without a perfect expert we can use the same strategy, restarting whenever we run out of experts. If the best expert has made $M$ mistakes by time $T$, we make at most $(M+1) \log N$ mistakes.

### weighted majority

We take the weighted majority vote of all experts. Weights are initialized to $1$, and a mistake is penalized by halving that expert's weight. The number of mistakes we make satisfies

$$
M \le \frac{1}{\log{4 \over 3}}(M_i + \log N)
$$

where $M_i$ is the number of mistakes made by expert $i$.
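
A minimal sketch of weighted majority on a made-up advice stream (the halving penalty matches the bound above):

```python
# A minimal sketch of the weighted majority algorithm; the advice stream and
# outcomes below are made-up, and predictions are 0/1.
def weighted_majority(advice_rounds, outcomes):
    """advice_rounds[t][i] is expert i's advice (0 or 1) at trial t."""
    n = len(advice_rounds[0])
    weights = [1.0] * n
    mistakes = 0
    for advice, outcome in zip(advice_rounds, outcomes):
        # Weighted vote: compare the total weight behind 1 against half of all weight.
        weight_for_one = sum(w for w, a in zip(weights, advice) if a == 1)
        prediction = 1 if weight_for_one >= sum(weights) / 2 else 0
        if prediction != outcome:
            mistakes += 1
        # Halve the weight of every expert that was wrong this trial.
        weights = [w / 2 if a != outcome else w for w, a in zip(weights, advice)]
    return mistakes

# tiny made-up example: 3 experts, 4 trials
print(weighted_majority([[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]],
                        [1, 0, 1, 0]))
```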

### Hedge

The game is changed: we produce a distribution $p$ over the experts, while the adversary produces a cost vector $m \in [-1, 1]^N$.

Let $\Phi(t) = \sum_{i \in [N]} w_i^{(t)}$ be the sum of all weights at time $t$. Then $p_i^{(t)} = \frac{w_i^{(t)}}{\Phi(t)}$, and we update the weights according to $w_i^{(t+1)} = w_i^{(t)} \cdot e^{-\epsilon \cdot m_i^{(t)}}$.

For $\epsilon \le 1$, Hedge guarantees

$$
\sum_{t = 1}^T p^{(t)} \cdot m^{(t)} \le \sum_{t=1}^T m_i^{(t)} + \frac{\ln N}{\epsilon} + \epsilon T
$$

for any expert $i$.
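
A minimal sketch of the Hedge update on made-up cost vectors, with $\epsilon$ a free parameter:

```python
# A minimal sketch of one Hedge run; the cost vectors are made-up and
# epsilon <= 1 is a free parameter.
import math

def hedge(cost_vectors, epsilon=0.5):
    """cost_vectors[t][i] in [-1, 1] is expert i's cost at time t."""
    n = len(cost_vectors[0])
    weights = [1.0] * n
    total_cost = 0.0
    for m in cost_vectors:
        phi = sum(weights)
        p = [w / phi for w in weights]           # play the normalized weights
        total_cost += sum(pi * mi for pi, mi in zip(p, m))
        # Multiplicative update: cheap experts gain weight, costly ones lose it.
        weights = [w * math.exp(-epsilon * mi) for w, mi in zip(weights, m)]
    return total_cost

print(hedge([[1, -1, 0], [0.5, 1, -0.5], [1, 0, 0]]))
```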

## covering LPs

A covering linear program is one where $A \in \R^{m \times n}_+$, $b \in \R^m_+$, and $c \in \R^n_+$:

$$
\begin{align*}
\text{minimize} \quad c^Tx& \\
\text{subject to} \quad Ax &\ge b \\
\quad 1 &\ge x \ge 0
\end{align*}
$$

### Hedge

The number of experts equals $m$, one per constraint.

1. Initialize weights to $1$
2. Pick the distribution to be $p_i^{(t)} = \frac{w_i^{(t)}}{\Phi(t)}$
3. Let $x^{(t)}$ be the solution to the reduced LP below
4. Let $m_i^{(t)} = A_i x^{(t)} - b_i$
5. Update weights per Hedge
6. Output the solution $\frac{1}{T}\sum_{t=1}^T x^{(t)}$

The reduced LP is

$$
\begin{align*}
\text{minimize} \quad c^Tx& \\
\text{subject to} \quad \left(\sum_{i=1}^m p_i A_i\right) \cdot x &\ge \sum_{i=1}^m p_i b_i \\
\quad 1 &\ge x \ge 0
\end{align*}
$$

The output is almost feasible while having a cost at most that of the optimal solution.
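
A minimal sketch of the whole loop, with a made-up covering instance, scipy's `linprog` for the reduced LP, and costs assumed to already lie in $[-1, 1]$:

```python
# A minimal sketch of solving a covering LP with Hedge: each constraint is an
# expert, and each round solves the single-constraint reduced LP with scipy.
# A, b, c are made-up; costs are assumed to already lie in [-1, 1].
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.5], [0.25, 1.0]])
b = np.array([0.75, 0.75])
c = np.array([1.0, 1.0])
m, n = A.shape
epsilon, T = 0.1, 200

w = np.ones(m)
xs = []
for _ in range(T):
    p = w / w.sum()
    # Reduced LP: one aggregated covering constraint (p^T A) x >= p^T b.
    res = linprog(c, A_ub=-(p @ A)[None, :], b_ub=[-(p @ b)],
                  bounds=[(0, 1)] * n)
    x = res.x
    xs.append(x)
    cost = A @ x - b                  # expert i's cost: slack of constraint i
    w *= np.exp(-epsilon * cost)      # Hedge update

x_bar = np.mean(xs, axis=0)
print(x_bar, A @ x_bar - b)           # almost feasible on every constraint
```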

masters/algorithms_2/hashing.md (new file, +19)

# hashing

We consider hash functions from hash families $h \in \mathcal H$ which map elements from a universe $U$ into $[N]$.

## 2-universal hash families

A family $\mathcal H$ is 2-universal if for any $x \ne y \in U$

$$
P_{h \in \mathcal H}[h(x) = h(y)] \le \frac{1}{N}
$$

## 2-wise independent

A family $\mathcal H$ is 2-wise independent if for any $x \ne y \in U$ and any $s, t \in [N]$

$$
P_{h \in \mathcal H}[h(x) = s, h(y) = t] = \frac{1}{N^2}
$$
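
A minimal sketch of the classic construction $h_{a,b}(x) = ((ax + b) \bmod p) \bmod N$, which gives a 2-universal family for a prime $p \ge |U|$ (the values of $p$ and $N$ below are made-up):

```python
# A minimal sketch of a 2-universal hash family; p and N are made-up examples.
import random

p = 10_007          # a prime at least as large as the universe
N = 100             # number of buckets

def random_hash():
    """Draw h uniformly from the family {h_{a,b} : a in [1, p), b in [0, p)}."""
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % N

h = random_hash()
print(h(42), h(1337))  # distinct inputs collide with probability <= ~1/N
```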

masters/algorithms_2/linear_programing.md (+5 −1)

```diff
@@ -54,7 +54,7 @@ $$
 $$
 \begin{align*}
 \text{maximize} \quad& \sum_{e \in E} w(e) \cdot x_e \\
-\text{subject to} \quad& \sum_{e \in \delta(v)} x_e = 1 \qquad&\forall_{e \in E}\\
+\text{subject to} \quad& \sum_{e \in \delta(v)} x_e = 1 \qquad&\forall_{v \in V}\\
 & x_e \ge 0 \qquad&\forall_{e \in E}
 \end{align*}
 $$
@@ -140,3 +140,7 @@ x \text{ and } y \text{ are both optimal} \iff \begin{cases}
 \forall_j y_j > 0 \implies b_j = \sum_i A_{j,i} x_i \\
 \end{cases}
 $$
+
+## Hall's theorem
+
+A bipartite graph $G = (A \cup B, E)$ with $|A| = |B| = n$ has a perfect matching iff $|S| \le |N(S)|$ for all $S \subseteq A$.
```

New file (+101)

# randomized algorithms

An algorithm with access to coin flips.

## min-cut

Given a graph $G = (V, E)$, find a non-trivial $S \subseteq V$ such that $|E(S, \bar S)|$ is minimized, where $E(S, \bar S)$ is the set of edges between $S$ and $\bar S$.
### Karger's min-cut

If the graph has two vertices, output the only possibility. Otherwise,

1. select an edge $e$ uniformly at random
2. contract its endpoints $\{u, v\} = e$ to obtain a graph $G'$ with $|V| - 1$ vertices
3. recurse on $G'$

This returns any fixed min-cut with probability at least $\frac{1}{\binom{n}{2}}$.

If we run the algorithm $T = c \cdot n^2 \cdot \ln n$ times and choose the best result, the failure probability drops to $\frac{1}{n^c}$, for a total runtime of $O(n^4 \cdot \ln n)$.
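
A minimal sketch of one contraction run, using union-find to track the super-vertices (the example graph is made-up):

```python
# A minimal sketch of one run of Karger's contraction algorithm on a graph
# given as an edge list; the example graph is made-up.
import random

def karger_once(num_vertices, edges):
    """Contract random edges until two super-vertices remain; return cut size."""
    parent = list(range(num_vertices))

    def find(v):                       # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    remaining = num_vertices
    while remaining > 2:
        u, v = random.choice(edges)    # uniform over edges; self-loops skipped
        ru, rv = find(u), find(v)
        if ru != rv:                   # contract only if not already merged
            parent[ru] = rv
            remaining -= 1
    # The cut consists of the surviving edges between the two super-vertices.
    return sum(1 for u, v in edges if find(u) != find(v))

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (2, 4)]
print(min(karger_once(5, edges) for _ in range(100)))  # repeat to amplify
```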

### Karger-Stein min-cut

If the graph has two vertices, output the only possibility. Otherwise,

1. contract uniformly random edges until $\frac{n}{\sqrt{2}}$ vertices remain
2. let $G'$ be the resulting contracted graph
3. recurse on $G'$ twice, and choose the better result

This returns a min-cut with probability at least $\frac{1}{2\log n}$. Its runtime is $O(n^2 \cdot \log n)$.

## polynomial identity testing (PIT)

Given polynomials $p(x)$ and $q(x)$, we want to know whether they are equal for every $x$, in other words whether $p(x) - q(x) \equiv 0$. We do not know what $p$ or $q$ look like; we can only evaluate them.

### Schwartz-Zippel lemma

Let $p(x_1, \cdots, x_n)$ be a non-zero polynomial of degree $d$. Let $S$ be a finite subset of $\R$ with at least $d$ elements. If the $x_i \sim U(S)$ are drawn independently then

$$
P[p(x_1, \cdots, x_n) = 0] \le \frac{d}{|S|}
$$
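
A minimal sketch of identity testing for two black-box univariate polynomials with known degree bound $d$ (the polynomials below are made-up):

```python
# A minimal sketch of randomized identity testing for two black-box
# polynomials, here made-up lambdas with degree bound d.
import random

def probably_equal(p, q, d, trials=20):
    """Return False iff p and q were caught differing; by Schwartz-Zippel,
    each trial misses a real difference with probability <= d / |S|."""
    S = range(100 * d)                 # sample set much larger than the degree
    for _ in range(trials):
        x = random.choice(S)
        if p(x) != q(x):
            return False               # a witness: definitely not identical
    return True                        # identical with high probability

p = lambda x: (x + 1) ** 2
q = lambda x: x ** 2 + 2 * x + 1
print(probably_equal(p, q, d=2))       # True: these expand to the same poly
```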

## sampling

Given a distribution $D$, we want to estimate its mean. Draw independent samples from $D$ and return the empirical average. How close should we expect this average to be to the real mean, and how often?

### markov's inequality

Given a random variable $X \ge 0$, for all $k > 0$

$$
Pr[X \ge k] \le \frac{\mathbb E[X]}{k}
$$

or equivalently

$$
Pr[X \ge k \cdot \mathbb E[X]] \le \frac{1}{k}
$$

### chebyshev's inequality

$$
Pr[|X - \mathbb E[X]| > \epsilon] \le \frac{Var[X]}{\epsilon^2}
$$

or equivalently

$$
Pr[|X - \mathbb E[X]| > k \cdot \sigma] \le \frac{1}{k^2}
$$

where $\sigma = \sqrt{Var[X]}$

### chernoff bounds

#### bernoulli

Let $X = \sum_{i=1}^n X_i$ where the $X_i$ are independent, with $X_i = 1$ with probability $p_i$ and $X_i = 0$ otherwise. Let $\mu = \mathbb E[X]$ and $\delta > 0$. Then

$$
P[X \ge (1 + \delta) \mu] \le e^{-\frac{\delta^2}{2 + \delta}\mu}
$$

$$
P[X \le (1 - \delta) \mu] \le e^{-\frac{\delta^2}{2}\mu}
$$
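
A made-up numeric check of the upper-tail bound for $n$ fair coins, comparing the bound against an empirical estimate:

```python
# A made-up numeric check of the Bernoulli Chernoff bound: n fair coins,
# upper tail P[X >= (1 + delta) * mu] versus the closed-form bound.
import math
import random

n, delta = 1000, 0.2
mu = n * 0.5
bound = math.exp(-(delta ** 2) / (2 + delta) * mu)

trials = 2000
hits = sum(
    sum(random.random() < 0.5 for _ in range(n)) >= (1 + delta) * mu
    for _ in range(trials)
)
print(f"empirical tail {hits / trials:.4f} <= bound {bound:.6f}")
```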

#### bounded

Let $X = \sum_{i=1}^n X_i$ where the $X_i$ are independent and satisfy $a \le X_i \le b$. Let $\mu = \mathbb E[X]$ and $\delta > 0$. Then

$$
P[X \ge (1 + \delta) \mu] \le e^{-\frac{2\delta^2\mu^2}{n(b-a)^2}}
$$

$$
P[X \le (1 - \delta) \mu] \le e^{-\frac{\delta^2\mu^2}{n(b-a)^2}}
$$

masters/algorithms_2/streaming.md (new file, +25)

# streaming

The input is a long stream $\sigma = \langle a_1, a_2, \cdots, a_m \rangle$ consisting of $m$ elements, where each element takes a value from the universe $[n]$.

The goal is to approximate some value while using a small amount of space.

## misra-gries

We start by initializing an empty associative array $A$. To process element $i$ (see the runnable sketch below):

1. if $i \in \text{keys}(A)$ then $A[i]$ += 1
2. else if $|\text{keys}(A)| < k - 1$ then $A[i] = 1$
3. else for each $j \in \text{keys}(A)$
   1. $A[j]$ -= 1
   2. if $A[j] = 0$ then remove $j$ from $A$

Given an element $a$, the output is $\hat f_a = A[a]$ (or zero if $a \notin \text{keys}(A)$).

Given a parameter $k$, this algorithm uses a single pass and $O(k(\log m + \log n))$ space to return an estimate for any $a$ satisfying

$$
f_a - \frac{m}{k} \le \hat f_a \le f_a
$$
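
A minimal runnable sketch of the procedure above (the stream and $k$ are made-up):

```python
# A minimal runnable version of the Misra-Gries summary; the stream is a
# made-up example and k is the accuracy parameter.
def misra_gries(stream, k):
    """One pass; keeps at most k - 1 counters."""
    A = {}
    for i in stream:
        if i in A:
            A[i] += 1
        elif len(A) < k - 1:
            A[i] = 1
        else:
            # Decrement every counter; drop the ones that reach zero.
            for j in list(A):
                A[j] -= 1
                if A[j] == 0:
                    del A[j]
    return A

stream = [1, 1, 2, 3, 1, 2, 1, 4, 1, 5]
A = misra_gries(stream, k=3)
f_hat = lambda a: A.get(a, 0)   # estimate: f_a - m/k <= f_hat(a) <= f_a
print(f_hat(1), f_hat(4))       # element 1 is frequent, 4 is not
```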
