Commit 1587a3c

How am I supposed to pass this subject?

1 parent a466832 commit 1587a3c

6 files changed: +262 −1 lines

.vscode/settings.json (+3)

```diff
@@ -50,6 +50,7 @@
   "CGHA",
   "cheatsheet",
   "chebyshev's",
+  "chernoff",
   "Ciphertext",
   "ciphertexts",
   "Civita",
@@ -107,6 +108,7 @@
   "GNFS",
   "grayscale",
   "Greibach",
+  "gries",
   "Grzegorz",
   "halftoning",
   "hasse",
@@ -185,6 +187,7 @@
   "Michal",
   "Michał",
   "microkernel",
+  "misra",
   "mkaleta",
   "Mobius",
   "monic",
```

New file (+109)

# approximation algorithms

Algorithms that relax the requirement of finding the optimal solution while preserving reliability and efficiency.

An $\alpha$-approximation algorithm is one that outputs a solution $S$ in polynomial time where

- $\frac{\text{cost}(S)}{\text{cost}(\text{OPT})} \le \alpha$ for minimization problems
- $\frac{\text{profit}(S)}{\text{profit}(\text{OPT})} \ge \alpha$ for maximization problems

The goal is to get $\alpha$ as close to $1$ as possible.

## min-weight vertex cover

The LP relaxation gives some fractional optimum $x^*$. We round by picking the vertex cover $C = \{v \in V : x_v^* \ge \frac{1}{2}\}$.

$C$ is a feasible solution and its weight is at most twice that of the optimal one, so this is a 2-approximation.
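
A minimal sketch of this rounding, assuming a small made-up graph and using scipy's `linprog` for the relaxation:

```python
# A minimal sketch of LP rounding for min-weight vertex cover; the graph and
# weights are made-up, and scipy's linprog solves the relaxation.
from scipy.optimize import linprog

weights = [1.0, 2.0, 1.0, 3.0]            # w_v for vertices 0..3
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

n = len(weights)
# Relaxation: minimize sum_v w_v x_v s.t. x_u + x_v >= 1 per edge, 0 <= x <= 1.
# linprog expects A_ub @ x <= b_ub, so the covering constraints are negated.
A_ub = [[-1.0 if v in e else 0.0 for v in range(n)] for e in edges]
b_ub = [-1.0] * len(edges)
res = linprog(c=weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)

# Round: keep every vertex with fractional value at least 1/2.
C = [v for v in range(n) if res.x[v] >= 0.5]
print(C)  # a feasible cover of weight at most 2 * OPT
```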

## integrality gap

The ratio describing the quality of the LP relaxation. Let $\mathcal I$ be the set of all instances of a problem. Then for a minimization problem the integrality gap is

$$
g = \max_{I \in \mathcal I}\frac{\text{OPT}(I)}{\text{OPT}_{LP}(I)}
$$

## set cover

Given a universe $U = \{e_1, \cdots, e_n\}$, a family of subsets $T = \{S_1, \cdots, S_m\}$, and a cost function $c : T \to \R^+$, find a collection $C \subseteq T$ such that $\bigcup C = U$ that minimizes the cost.

### deterministic

Suppose each element belongs to at most $f$ sets. Then picking the solution $C = \{S_i : x_i^* \ge \frac{1}{f}\}$ from the LP optimum $x^*$ is an $f$-approximation.

### randomized

We repeat the following $d \cdot \ln(n)$ times: for each $i \in [m]$, add set $S_i$ to the solution $C$ with probability $x^*_i$.

The expected cost after $d \cdot \ln(n)$ repetitions is at most $d \cdot \ln(n) \cdot \text{OPT}$.

The output is a feasible solution with probability at least $1 - \frac{1}{n^{d - 1}}$.
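
A minimal sketch of this randomized rounding, assuming the LP optimum $x^*$ is already given (the instance and fractional values below are made-up):

```python
# A minimal sketch of randomized rounding for set cover; x_star is an assumed
# fractional LP solution for a made-up instance.
import math
import random

n = 4                                  # universe {0, 1, 2, 3}
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
x_star = [0.5, 0.5, 0.5, 0.5]          # fractional LP optimum (assumed given)
d = 2

chosen = set()
for _ in range(int(d * math.log(n)) + 1):
    # One rounding pass: include each set independently with probability x*_i.
    for i, prob in enumerate(x_star):
        if random.random() < prob:
            chosen.add(i)

covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
print(sorted(chosen), covered == set(range(n)))  # feasible w.h.p.
```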

## prediction with expert advice

Given $N$ experts which individually advise either $0$ or $1$, we predict an answer. Then the adversary, knowing the experts' advice and our answer, reveals the outcome. The goal is to minimize the number of mistakes relative to the best expert. We consider $T$ trials.

### majority vote

If a perfect expert exists, we can always take the majority vote of those experts that have not yet made a mistake. Each mistake at least halves the pool of candidate experts, so we make at most $\log N$ mistakes.

Without a perfect expert we can use the same strategy, restarting whenever we run out of experts. If the best expert has made $M$ mistakes by time $T$, we make at most $(M+1) \log N$ mistakes.

### weighted majority

We take the weighted majority vote of all experts. Weights are initialized to $1$, and a mistake is penalized by halving that expert's weight. The number of mistakes we make satisfies

$$
M \le \frac{1}{\log{4 \over 3}}(M_i + \log N)
$$

where $M_i$ is the number of mistakes made by expert $i$.
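
A minimal sketch of weighted majority on a made-up advice stream (the halving penalty matches the bound above):

```python
# A minimal sketch of the weighted majority algorithm; the advice stream and
# outcomes below are made-up, and predictions are 0/1.
def weighted_majority(advice_rounds, outcomes):
    """advice_rounds[t][i] is expert i's advice (0 or 1) at trial t."""
    n = len(advice_rounds[0])
    weights = [1.0] * n
    mistakes = 0
    for advice, outcome in zip(advice_rounds, outcomes):
        # Weighted vote: compare the total weight behind 1 against half of all weight.
        weight_for_one = sum(w for w, a in zip(weights, advice) if a == 1)
        prediction = 1 if weight_for_one >= sum(weights) / 2 else 0
        if prediction != outcome:
            mistakes += 1
        # Halve the weight of every expert that was wrong this trial.
        weights = [w / 2 if a != outcome else w for w, a in zip(weights, advice)]
    return mistakes

# tiny made-up example: 3 experts, 4 trials
print(weighted_majority([[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]],
                        [1, 0, 1, 0]))
```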

### Hedge

The game is changed: we produce a distribution $p$ over the experts, while the adversary produces a cost vector $m \in [-1, 1]^N$.

Let $\Phi(t) = \sum_{i \in [N]} w_i^{(t)}$ be the sum of all weights at time $t$. Then $p_i^{(t)} = \frac{w_i^{(t)}}{\Phi(t)}$, and we update the weights according to $w_i^{(t+1)} = w_i^{(t)} \cdot e^{-\epsilon \cdot m_i^{(t)}}$.

For $\epsilon \le 1$, Hedge guarantees

$$
\sum_{t = 1}^T p^{(t)} \cdot m^{(t)} \le \sum_{t=1}^T m_i^{(t)} + \frac{\ln N}{\epsilon} + \epsilon T
$$

for any expert $i$.
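
A minimal sketch of the Hedge update on made-up cost vectors, with $\epsilon$ a free parameter:

```python
# A minimal sketch of one Hedge run; the cost vectors are made-up and
# epsilon <= 1 is a free parameter.
import math

def hedge(cost_vectors, epsilon=0.5):
    """cost_vectors[t][i] in [-1, 1] is expert i's cost at time t."""
    n = len(cost_vectors[0])
    weights = [1.0] * n
    total_cost = 0.0
    for m in cost_vectors:
        phi = sum(weights)
        p = [w / phi for w in weights]           # play the normalized weights
        total_cost += sum(pi * mi for pi, mi in zip(p, m))
        # Multiplicative update: cheap experts gain weight, costly ones lose it.
        weights = [w * math.exp(-epsilon * mi) for w, mi in zip(weights, m)]
    return total_cost

print(hedge([[1, -1, 0], [0.5, 1, -0.5], [1, 0, 0]]))
```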

## covering LPs

A covering linear program is one where $A \in \R^{m \times n}_+$, $b \in \R^m_+$, and $c \in \R^n_+$:

$$
\begin{align*}
\text{minimize} \quad c^Tx& \\
\text{subject to} \quad Ax &\ge b \\
\quad 1 &\ge x \ge 0
\end{align*}
$$

### Hedge

The number of experts equals $m$, one per constraint.

1. Initialize weights to $1$
2. Pick the distribution to be $p_i^{(t)} = \frac{w_i^{(t)}}{\Phi(t)}$
3. Let $x^{(t)}$ be the solution to the reduced LP below
4. Let $m_i^{(t)} = A_i x^{(t)} - b_i$
5. Update weights per Hedge
6. Output the solution $\frac{1}{T}\sum_{t=1}^T x^{(t)}$

The reduced LP is

$$
\begin{align*}
\text{minimize} \quad c^Tx& \\
\text{subject to} \quad \left(\sum_{i=1}^m p_i A_i\right) \cdot x &\ge \sum_{i=1}^m p_i b_i \\
\quad 1 &\ge x \ge 0
\end{align*}
$$

The output is almost feasible while having a cost at most that of the optimal solution.
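
A minimal sketch of the whole loop, with a made-up covering instance, scipy's `linprog` for the reduced LP, and costs assumed to already lie in $[-1, 1]$:

```python
# A minimal sketch of solving a covering LP with Hedge: each constraint is an
# expert, and each round solves the single-constraint reduced LP with scipy.
# A, b, c are made-up; costs are assumed to already lie in [-1, 1].
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.5], [0.25, 1.0]])
b = np.array([0.75, 0.75])
c = np.array([1.0, 1.0])
m, n = A.shape
epsilon, T = 0.1, 200

w = np.ones(m)
xs = []
for _ in range(T):
    p = w / w.sum()
    # Reduced LP: one aggregated covering constraint (p^T A) x >= p^T b.
    res = linprog(c, A_ub=-(p @ A)[None, :], b_ub=[-(p @ b)],
                  bounds=[(0, 1)] * n)
    x = res.x
    xs.append(x)
    cost = A @ x - b                  # expert i's cost: slack of constraint i
    w *= np.exp(-epsilon * cost)      # Hedge update

x_bar = np.mean(xs, axis=0)
print(x_bar, A @ x_bar - b)           # almost feasible on every constraint
```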

masters/algorithms_2/hashing.md (new file, +19)

# hashing

We consider hash functions from hash families $h \in \mathcal H$ which map elements from a universe $U$ into $[N]$.

## 2-universal hash families

A family $\mathcal H$ is 2-universal if for any $x \ne y \in U$

$$
P_{h \in \mathcal H}[h(x) = h(y)] \le \frac{1}{N}
$$

## 2-wise independent

A family $\mathcal H$ is 2-wise independent if for any $x \ne y \in U$ and any $s, t \in [N]$

$$
P_{h \in \mathcal H}[h(x) = s, h(y) = t] = \frac{1}{N^2}
$$
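
A minimal sketch of the classic construction $h_{a,b}(x) = ((ax + b) \bmod p) \bmod N$, which gives a 2-universal family for a prime $p \ge |U|$ (the values of $p$ and $N$ below are made-up):

```python
# A minimal sketch of a 2-universal hash family; p and N are made-up examples.
import random

p = 10_007          # a prime at least as large as the universe
N = 100             # number of buckets

def random_hash():
    """Draw h uniformly from the family {h_{a,b} : a in [1, p), b in [0, p)}."""
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % N

h = random_hash()
print(h(42), h(1337))  # distinct inputs collide with probability <= ~1/N
```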

masters/algorithms_2/linear_programing.md (+5 −1)

```diff
@@ -54,7 +54,7 @@ $$
 $$
 \begin{align*}
 \text{maximize} \quad& \sum_{e \in E} w(e) \cdot x_e \\
-\text{subject to} \quad& \sum_{e \in \delta(v)} x_e = 1 \qquad&\forall_{e \in E}\\
+\text{subject to} \quad& \sum_{e \in \delta(v)} x_e = 1 \qquad&\forall_{v \in V}\\
 & x_e \ge 0 \qquad&\forall_{e \in E}
 \end{align*}
 $$
@@ -140,3 +140,7 @@ x \text{ and } y \text{ are both optimal} \iff \begin{cases}
 \forall_j y_j > 0 \implies b_j = \sum_i A_{j,i} x_i \\
 \end{cases}
 $$
+
+## Hall's theorem
+
+A bipartite graph $G = (A \cup B, E)$ with $|A| = |B| = n$ has a perfect matching iff $|S| \le |N(S)|$ for all $S \subseteq A$.
```

New file (+101)

# randomized algorithms

An algorithm with access to coin flips.

## min-cut

Given a graph $G = (V, E)$, find a non-trivial $S \subseteq V$ such that $|E(S, \bar S)|$ is minimized, where $E(S, \bar S)$ is the set of edges between $S$ and $\bar S$.
### Karger's min-cut

If the graph has two vertices, output the only possibility. Otherwise,

1. select an edge $e$ uniformly at random
2. contract its endpoints $\{u, v\} = e$ to obtain a graph $G'$ with $|V| - 1$ vertices
3. recurse on $G'$

This returns any fixed min-cut with probability at least $\frac{1}{\binom{n}{2}}$.

If we run the algorithm $T = c \cdot n^2 \cdot \ln n$ times and choose the best result, the failure probability drops to $\frac{1}{n^c}$, for a total runtime of $O(n^4 \cdot \ln n)$.
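
A minimal sketch of one contraction run, using union-find to track the super-vertices (the example graph is made-up):

```python
# A minimal sketch of one run of Karger's contraction algorithm on a graph
# given as an edge list; the example graph is made-up.
import random

def karger_once(num_vertices, edges):
    """Contract random edges until two super-vertices remain; return cut size."""
    parent = list(range(num_vertices))

    def find(v):                       # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    remaining = num_vertices
    while remaining > 2:
        u, v = random.choice(edges)    # uniform over edges; self-loops skipped
        ru, rv = find(u), find(v)
        if ru != rv:                   # contract only if not already merged
            parent[ru] = rv
            remaining -= 1
    # The cut consists of the surviving edges between the two super-vertices.
    return sum(1 for u, v in edges if find(u) != find(v))

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (2, 4)]
print(min(karger_once(5, edges) for _ in range(100)))  # repeat to amplify
```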

### Karger-Stein min-cut

If the graph has two vertices, output the only possibility. Otherwise,

1. contract uniformly random edges until $\frac{n}{\sqrt{2}}$ vertices remain
2. let $G'$ be the resulting contracted graph
3. recurse on $G'$ twice, and choose the better result

This returns a min-cut with probability at least $\frac{1}{2\log n}$. Its runtime is $O(n^2 \cdot \log n)$.

## polynomial identity testing (PIT)

Given polynomials $p(x)$ and $q(x)$, we want to know whether they are equal for every $x$, in other words whether $p(x) - q(x) \equiv 0$. We do not know what $p$ or $q$ look like; we can only evaluate them.

### Schwartz-Zippel lemma

Let $p(x_1, \cdots, x_n)$ be a non-zero polynomial of degree $d$. Let $S$ be a finite subset of $\R$ with at least $d$ elements. If the $x_i \sim U(S)$ are drawn independently then

$$
P[p(x_1, \cdots, x_n) = 0] \le \frac{d}{|S|}
$$
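
A minimal sketch of identity testing for two black-box univariate polynomials with known degree bound $d$ (the polynomials below are made-up):

```python
# A minimal sketch of randomized identity testing for two black-box
# polynomials, here made-up lambdas with degree bound d.
import random

def probably_equal(p, q, d, trials=20):
    """Return False iff p and q were caught differing; by Schwartz-Zippel,
    each trial misses a real difference with probability <= d / |S|."""
    S = range(100 * d)                 # sample set much larger than the degree
    for _ in range(trials):
        x = random.choice(S)
        if p(x) != q(x):
            return False               # a witness: definitely not identical
    return True                        # identical with high probability

p = lambda x: (x + 1) ** 2
q = lambda x: x ** 2 + 2 * x + 1
print(probably_equal(p, q, d=2))       # True: these expand to the same poly
```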

## sampling

Given a distribution $D$, we want to estimate its mean. Draw independent samples from $D$ and return the empirical average. How close should we expect this average to be to the real mean, and how often?

### markov's inequality

Given a random variable $X \ge 0$, for all $k > 0$

$$
Pr[X \ge k] \le \frac{\mathbb E[X]}{k}
$$

or equivalently

$$
Pr[X \ge k \cdot \mathbb E[X]] \le \frac{1}{k}
$$

### chebyshev's inequality

$$
Pr[|X - \mathbb E[X]| > \epsilon] \le \frac{Var[X]}{\epsilon^2}
$$

or equivalently

$$
Pr[|X - \mathbb E[X]| > k \cdot \sigma] \le \frac{1}{k^2}
$$

where $\sigma = \sqrt{Var[X]}$

### chernoff bounds

#### bernoulli

Let $X = \sum_{i=1}^n X_i$ where the $X_i$ are independent, with $X_i = 1$ with probability $p_i$ and $X_i = 0$ otherwise. Let $\mu = \mathbb E[X]$ and $\delta > 0$. Then

$$
P[X \ge (1 + \delta) \mu] \le e^{-\frac{\delta^2}{2 + \delta}\mu}
$$

$$
P[X \le (1 - \delta) \mu] \le e^{-\frac{\delta^2}{2}\mu}
$$
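
A made-up numeric check of the upper-tail bound for $n$ fair coins, comparing the bound against an empirical estimate:

```python
# A made-up numeric check of the Bernoulli Chernoff bound: n fair coins,
# upper tail P[X >= (1 + delta) * mu] versus the closed-form bound.
import math
import random

n, delta = 1000, 0.2
mu = n * 0.5
bound = math.exp(-(delta ** 2) / (2 + delta) * mu)

trials = 2000
hits = sum(
    sum(random.random() < 0.5 for _ in range(n)) >= (1 + delta) * mu
    for _ in range(trials)
)
print(f"empirical tail {hits / trials:.4f} <= bound {bound:.6f}")
```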

#### bounded

Let $X = \sum_{i=1}^n X_i$ where the $X_i$ are independent and satisfy $a \le X_i \le b$. Let $\mu = \mathbb E[X]$ and $\delta > 0$. Then

$$
P[X \ge (1 + \delta) \mu] \le e^{-\frac{2\delta^2\mu^2}{n(b-a)^2}}
$$

$$
P[X \le (1 - \delta) \mu] \le e^{-\frac{\delta^2\mu^2}{n(b-a)^2}}
$$

masters/algorithms_2/streaming.md (new file, +25)

# streaming

The input is a long stream $\sigma = \langle a_1, a_2, \cdots, a_m \rangle$ consisting of $m$ elements, where each element takes a value from the universe $[n]$.

The goal is to approximate some value while using a small amount of space.

## misra-gries

We start by initializing an empty associative array $A$. To process element $i$ (see the runnable sketch below):

1. if $i \in \text{keys}(A)$ then $A[i]$ += 1
2. else if $|\text{keys}(A)| < k - 1$ then $A[i] = 1$
3. else for each $j \in \text{keys}(A)$
   1. $A[j]$ -= 1
   2. if $A[j] = 0$ then remove $j$ from $A$

Given an element $a$, the output is $\hat f_a = A[a]$ (or zero if $a \notin \text{keys}(A)$).

Given a parameter $k$, this algorithm uses a single pass and $O(k(\log m + \log n))$ space to return an estimate for any $a$ satisfying

$$
f_a - \frac{m}{k} \le \hat f_a \le f_a
$$
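
A minimal runnable sketch of the procedure above (the stream and $k$ are made-up):

```python
# A minimal runnable version of the Misra-Gries summary; the stream is a
# made-up example and k is the accuracy parameter.
def misra_gries(stream, k):
    """One pass; keeps at most k - 1 counters."""
    A = {}
    for i in stream:
        if i in A:
            A[i] += 1
        elif len(A) < k - 1:
            A[i] = 1
        else:
            # Decrement every counter; drop the ones that reach zero.
            for j in list(A):
                A[j] -= 1
                if A[j] == 0:
                    del A[j]
    return A

stream = [1, 1, 2, 3, 1, 2, 1, 4, 1, 5]
A = misra_gries(stream, k=3)
f_hat = lambda a: A.get(a, 0)   # estimate: f_a - m/k <= f_hat(a) <= f_a
print(f_hat(1), f_hat(4))       # element 1 is frequent, 4 is not
```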
