| Name: | Arushi Srinivasan |
|---|---|
| Email: | asrin708@terpmail.umd.edu |
| Office: | CoRE 417 |
| Home Institution: | University of Maryland, College Park |
| Project: | Error-Correcting Codes: Between Algebraic and Combinatorial |
| Mentor: | Dr. Shashank Srivastava |
I'm a rising junior majoring in Computer Science and Math at the University of Maryland, College Park: I'm broadly interested in theory, but some of my favorite topics include randomized algorithms/derandomization, probabilistic methods, and graph algorithms. I've also been recently learning more about complexity in the context of pseudorandom generators and extractors and enjoy theoretical ML as well. Within math, I enjoy combinatorics, probability, real analysis, and algebra and hope to learn a lot more in the coming years. Some of my interests outside of CS + Math include classical piano, hiking, exploring coffee shops (luckily there are several cute ones near Rutgers :D), and playing board games. Excited to be at DIMACS this summer!
I have been going through and reconstructing the proofs in Chapters 4 (standard bounds on error-correcting codes: the Singleton, Plotkin, and Gilbert-Varshamov bounds), 5 (Reed-Solomon -- RS -- codes), 6 (the Shannon model of error correction), 7 (the Johnson bound), 12 (RS codes and list/unique decoding), and 17 (folded RS codes and list-decoding algorithms) (the latter two are still in progress) of Essential Coding Theory by Guruswami, Rudra, and Sudan, in order to better understand Reed-Solomon codes as well as the probabilistic error bounds associated with them. Initially, I spent a lot of time studying the Shannon model of coding theory, where every component of the message vector independently stays the same with probability $1 - p$ and is changed to any other element $\beta \in F_q$ with probability $\frac{p}{q - 1}$: this model gives us a probability distribution over ${F_q}^n$ describing what the original message is corrupted to, and it has many strong concentration properties. However, as the Shannon model is well-studied, we consider sending RS codes over the adversarial model, where at most $e$ positions of any message can be corrupted. An interesting observation is that the generator matrix of an RS code is a Vandermonde matrix, and many proofs of error bounds for RS codes rely on the fact that this matrix has full rank, together with some simple but useful properties of polynomials over fields (e.g. a nonzero polynomial cannot have more distinct roots than its degree, so a polynomial with too many roots is forced to be 0: this is really useful when we want to show that one polynomial divides another). I also met with Dr. Srivastava and discussed the following paper with him: Omar Alrabiah, Venkatesan Guruswami, and Ray Li (AGL). Randomly Punctured Reed-Solomon Codes Achieve List-Decoding Capacity over Linear-Sized Fields.
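To make the Shannon model concrete, here is a minimal simulation sketch of the $q$-ary symmetric channel it describes, assuming $q$ is prime so $F_q$ can be represented as integers mod $q$; the function name `qsc` is my own.

```python
import random

def qsc(msg, p, q):
    """Shannon model / q-ary symmetric channel: each symbol of msg
    (a list of integers in {0, ..., q-1}) independently stays the
    same with probability 1 - p, and otherwise is replaced by one
    of the other q - 1 field elements, chosen uniformly."""
    out = []
    for a in msg:
        if random.random() < p:
            b = random.randrange(q - 1)      # uniform over q - 1 values
            out.append(b if b < a else b + 1)  # skip over a itself
        else:
            out.append(a)
    return out

# Each symbol is corrupted with probability p, so ~pn symbols change
# in expectation; Chernoff-style concentration keeps the error count
# tightly clustered around pn, which drives the bounds in Chapter 6.
random.seed(0)
received = qsc([3, 1, 4, 1, 5, 9, 2, 6], p=0.3, q=11)
```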
RS codes rely on the evaluation of bounded-degree polynomials over finite fields: we select distinct points $\alpha_1, \alpha_2, ..., \alpha_n$ in $F_q$. To send a message (a polynomial $f(x)$ with coefficients in $F_q$ and degree $< k$), we send the vector $\textbf{v} = (f(\alpha_1), f(\alpha_2), ..., f(\alpha_n))$ across a channel Ch, which may change $\leq e$ of the components of $\textbf{v}$ to other elements of $F_q$. Given the corrupted vector $\textbf{y} = (y_1, y_2, ..., y_n)$, we must solve a linear system to recover the coefficients of $f$. Note that we want $n \gg k$: if $n$ is too close to $k$, two distinct polynomials of degree $< k$ can agree with $\textbf{y}$ on many evaluation points, making it difficult (or impossible) to recover the coefficients of the original polynomial. Alrabiah, Guruswami, and Li prove that by selecting $\alpha_1, \alpha_2, ..., \alpha_n$ independently and uniformly at random from $F_q$, good list-decoding properties are achieved with high probability (see week 2 for more info on this). One potential research direction that we have been exploring is derandomizing parts of the AGL argument: that is, achieving the same list-decoding guarantees with an explicit, deterministic choice of evaluation points rather than a random one. I also spent time making the presentation for my project proposal.
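As a toy illustration of this setup (not the AGL construction itself), here is a sketch of RS encoding and adversarial corruption over a small prime field; `rs_encode` and `corrupt` are hypothetical names of mine, and a real decoder (e.g. Berlekamp-Welch) would recover $f$ by solving the linear system mentioned above.

```python
import random

def rs_encode(coeffs, alphas, q):
    """Evaluate f(x) = coeffs[0] + coeffs[1] x + ... at each alpha
    via Horner's rule, with all arithmetic mod the prime q."""
    def f(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % q
        return acc
    return [f(a) for a in alphas]

def corrupt(codeword, e, q):
    """Adversarial model: change exactly e positions arbitrarily
    (adding a nonzero element guarantees each chosen symbol moves)."""
    y = list(codeword)
    for i in random.sample(range(len(y)), e):
        y[i] = (y[i] + random.randrange(1, q)) % q
    return y

q = 13                            # prime, so Z/qZ is the field F_q
alphas = [1, 2, 3, 4, 5, 6, 7, 8] # n = 8 distinct evaluation points
msg = [5, 0, 7]                   # f(x) = 5 + 7x^2, degree < k = 3
y = corrupt(rs_encode(msg, alphas, q), e=2, q=q)
```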
I finished Chapters 12 and 17 of Essential Coding Theory, and was able to reconstruct the proofs of correctness for their unique-decoding and list-decoding algorithms. The problem of $\rho$-unique decoding is the following: can we construct a linear code $C$ (a subspace of ${F_q}^n$) s.t. $\forall y \in {F_q}^n$, $|\{c \in C \mid \Delta(c, y) \leq \rho n\}| \leq 1$? It has been proven that RS codes are optimal for the unique-decoding problem (I reconstructed this proof from Essential Coding Theory and also reviewed it with Dr. Srivastava). We can also generalize many ideas from this proof to prove list-decodability of RS codes: $(\rho, L)$ list decoding asks -- can we construct a linear code $C$ (a subspace of ${F_q}^n$) s.t. $\forall y \in {F_q}^n$, $|\{c \in C \mid \Delta(c, y) \leq \rho n\}| \leq L$ (note that unique decoding sets $L = 1$)? Currently, we have algorithms that can list-decode (i.e. output the said list of codewords within Hamming distance $\rho n$) up to a $1 - \sqrt{R}$ error fraction (where $R = \frac{\dim(C)}{n} = \frac{\log_q |C|}{n}$ is the rate of the code), but it is an open question whether we can do better for RS codes. Moreover, Chapter 17 deals with folded RS codes: instead of sending the vector $(f(\alpha_1), ..., f(\alpha_n))$ over the channel, we send a vector of the form $((f(\alpha_1), f(\alpha_2), ..., f(\alpha_m)), (f(\alpha_{m+1}), f(\alpha_{m+2}), ..., f(\alpha_{2m})), ..., (f(\alpha_{n - m + 1}), ..., f(\alpha_n)))$. The idea is that the rate and distance of the code are preserved (we consider $m$ to be some fixed parameter), and the fraction of errors is essentially preserved as well (e.g. if an $\frac{e}{n}$ fraction of the components of the original vector are in error, then at least an $\frac{e}{n}$ fraction of the blocks of the folded codeword are in error). However, the alphabet size increases significantly: e.g. if there are $m = 2$ components in each block, then the alphabet size of the code increases to $|\Sigma|^2$ (where $\Sigma$ is the alphabet of the original code). It's also worth mentioning the utility of list decoding as opposed to unique decoding: when the fraction of errors is high, a unique closest codeword need not exist, so we instead output all codewords near the received vector and prune that list on the decoding side. Moreover, these list sizes are often not particularly large (usually some fixed constant), so pruning isn't too hard: the more worthwhile question is how to construct codes that are list-decodable in the first place.
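The definitions above can be phrased operationally; below is a brute-force sketch of $(\rho, L)$ list decoding (exponential in $k$, purely to illustrate the definition, not the actual algorithms from the book) together with the $m$-folding operation, over a prime field; all names are my own.

```python
from itertools import product

def hamming(u, v):
    """Hamming distance Delta(u, v) between equal-length words."""
    return sum(a != b for a, b in zip(u, v))

def brute_force_list_decode(y, alphas, k, q, rho):
    """Return every degree-<k polynomial (as a coefficient tuple)
    whose RS codeword lies within Hamming distance rho*n of y.
    |output| <= L for all y is exactly (rho, L) list-decodability."""
    n = len(alphas)
    out = []
    for coeffs in product(range(q), repeat=k):   # all q^k polynomials
        cw = [sum(c * pow(a, i, q) for i, c in enumerate(coeffs)) % q
              for a in alphas]
        if hamming(cw, y) <= rho * n:
            out.append(coeffs)
    return out

def fold(codeword, m):
    """m-fold a codeword: group consecutive symbols into blocks,
    giving n/m symbols over the alphabet Sigma^m (assumes m | n)."""
    return [tuple(codeword[i:i + m]) for i in range(0, len(codeword), m)]
```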
This all leads us to some interesting questions: very often, it is easier to prove properties of codes when we have a larger alphabet. Methods like folding or interleaving take an existing code and create a code over a larger alphabet; when we can prove properties related to linearity, rate, distance, and list-decodability for these newly constructed codes, it gives us more information about how good our original code is. One question/research direction that Dr. Srivastava posed is as follows (the definition of interleaving is below): if we construct a linear code $C$ over $F_q$ and interleave it (it suffices to consider the case where each component of a vector in the new code is an element of ${F_q}^2$, as this generalizes to higher dimensions), then, depending on how we define a bijection $F_{q^2} \rightarrow {F_q}^2$, can we get linearity over $F_{q^2}$ (the subtle part is closure under scalar multiplication by elements of $F_{q^2}$)? A recent paper by Brakensiek, Chen, Dhar, and Zhang (BCDZ) shows that we do end up with linearity over $F_{q^2}$ for RS codes, but we will look into whether this generalizes to other linear codes/maximum distance separable (MDS) codes. Interleaving is as follows: given $f = (\alpha_1, \alpha_2, ..., \alpha_n), g = (\beta_1, \beta_2, ..., \beta_n) \in C$, we interleave them to create $h = ((\alpha_1, \beta_1), ..., (\alpha_n, \beta_n))$ (see the sketch below).
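A minimal sketch of 2-interleaving and the coordinatewise operations, assuming codewords over a prime field; note that $F_q$-linearity of the interleaved code is immediate, while $F_{q^2}$-scalar multiplication is exactly the part that depends on the choice of bijection.

```python
def interleave(f, g):
    """2-interleave codewords f, g in C: the i-th symbol of the
    interleaved word is the pair (f_i, g_i)."""
    return [(a, b) for a, b in zip(f, g)]

def add(h1, h2, q):
    """Coordinatewise addition mod q: this exhibits F_q-linearity
    of the interleaved code directly. Scalar multiplication by an
    element of F_{q^2} would mix the two coordinates of each pair,
    which is where the bijection F_{q^2} -> F_q^2 enters."""
    return [((a1 + a2) % q, (b1 + b2) % q)
            for (a1, b1), (a2, b2) in zip(h1, h2)]
```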
We tried to investigate interleaved RS codes with respect to many of the properties AGL establishes for random linear codes: more results are known for folded RS codes than for interleaved RS codes, and this has led to the following question -- beyond the alphabet size alone, why is it easier to obtain list-decodability/certain linearity properties with folded RS codes or interleaved RS codes (which by definition can't be expressed as evaluations of a single polynomial over a finite field)? I also reconstructed all of the proofs from BCDZ concerning the equivalence between RS codes over a subfield and interleaved RS codes. We also tried to explore the following research direction: when you $m$-interleave RS codes over $F_q$, due to the linear bijection between ${F_q}^m$ and $F_{q^m}$ (where $(a_0, ..., a_{m-1})$ maps to $\sum_{i=0}^{m-1} a_i \gamma^i$, with $\gamma$ a primitive element of $F_{q^m}$), the rate and Hamming distance are preserved, but the alphabet size increases. What happens when you try to deconstruct this type of code -- when you want to go from $f$ in $F_{q^m}[X]$ to an $m$-tuple in $F_{q'}[X]$, how large can $q'$ possibly be/what properties do we want? Another question is in the other direction: can you interleave codes with different evaluation points that still lie in the same subfield? This is still a work in progress -- also, given that our goal is to derandomize AGL, we need to see what properties we actually want in the finite subfield we select points from so that we can make their algorithm deterministic.
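Here is the smallest concrete instance of that linear bijection, for $q = 2$, $m = 2$, assuming the representation $F_4 = F_2[x]/(x^2 + x + 1)$ with $\gamma$ the class of $x$ (a primitive element): the pair $(a_0, a_1) \in {F_2}^2$ is identified with $a_0 + a_1\gamma \in F_4$.

```python
# F_4 = F_2[x]/(x^2 + x + 1); gamma = class of x, with gamma^2 = gamma + 1.
# An element a0 + a1*gamma is stored as the bit pair (a0, a1), which is
# precisely the linear bijection (a0, a1) <-> a0 + a1*gamma.

def f4_add(u, v):
    # addition is componentwise XOR, so the bijection is F_2-linear
    return (u[0] ^ v[0], u[1] ^ v[1])

def f4_mul(u, v):
    # (a0 + a1 g)(b0 + b1 g) = a0 b0 + a1 b1 + (a0 b1 + a1 b0 + a1 b1) g,
    # using g^2 = g + 1; multiplication mixes the two coordinates.
    a0, a1 = u
    b0, b1 = v
    c0 = (a0 & b0) ^ (a1 & b1)
    c1 = (a0 & b1) ^ (a1 & b0) ^ (a1 & b1)
    return (c0, c1)

gamma = (0, 1)
assert f4_mul(gamma, gamma) == (1, 1)   # gamma^2 = 1 + gamma
```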
I also spent a lot of time reading about concatenated RS codes and met with Dr. Srivastava to revise the proofs of several error bounds currently known for them: we also went over how expander-based constructions give an infinite family of codes whose rate and distance (relative Hamming distance, i.e. normalized) are both bounded away from 0 independently of the blocklength (ideally, you want both rate and distance to be large, but they are in tension, constrained by the Singleton bound $\delta + \rho \leq 1 + \frac{1}{n}$, where $n$ is the blocklength). Lastly, in order to better understand expander-based constructions, I've been going through Professor Salil Vadhan's monograph on pseudorandomness (Chapter 2 is on random walks, Chapter 4 is on expander graphs). We hope that expander-based codes will significantly help us in the derandomization process.
We completely pivoted our line of research toward combinatorial error correction (as opposed to algebraic error correction): derandomizing AGL is a very interesting problem, but we realized that it may be too complex for a 9-week REU. We still focused on pseudorandomness, now via AEL (Alon-Edmonds-Luby) concatenated codes and spectral expander graphs, in order to find another interesting problem that is perhaps more approachable in nature. Concatenated codes are as follows: we are given an outer code (and a corresponding decoder) $C_{out} \subset \Sigma^n$ and an inner code $C_{in} \subset {\Sigma_0}^d$, along with $\Phi: \Sigma \rightarrow C_{in} \subseteq {\Sigma_0}^d$, an injective map that encodes each outer symbol as an inner codeword. We construct $C_{concat}$ by taking a vector in $\Sigma^n$ and applying $\Phi$ to each of its components, resulting in a new string over $\Sigma_0$ of length $nd$ as opposed to just $n$. Code concatenation is a very useful concept: when measuring the robustness of a code, we want its rate and distance to both be bounded away from 0. I was able to reconstruct the proofs of why, for multiple existing families of codes, either the (normalized) rate or the distance asymptotically tends to 0 as $n \rightarrow \infty$ (rate and distance are in tension: by the Singleton bound, $\rho + \delta \leq 1 + \frac{1}{n}$, so neither can tend to 1 without the other tending to 0), and why, in contrast, concatenated codes have rate $\rho_{out} \cdot \rho_{in}$ and distance $\geq \delta_{out} \cdot \delta_{in}$: both strictly positive and independent of the blocklength.
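A minimal sketch of the concatenation map, with a hypothetical toy inner code (the $[3, 2]$ binary parity code, rate $2/3$, relative distance $2/3$) standing in for $C_{in}$; `phi` here is just a lookup table for $\Phi$.

```python
def concatenate(outer_codeword, phi):
    """Apply the injective inner encoding phi : Sigma -> Sigma_0^d
    to every outer symbol and flatten: length goes from n to n*d,
    so the rate multiplies and the distance is at least the product."""
    out = []
    for symbol in outer_codeword:
        out.extend(phi[symbol])
    return out

# Toy example: Sigma = {0, 1, 2, 3} viewed as 2-bit symbols; the inner
# code appends a parity bit, giving a [3, 2] code over Sigma_0 = {0, 1}.
phi = {s: [s >> 1, s & 1, (s >> 1) ^ (s & 1)] for s in range(4)}
codeword = concatenate([2, 0, 3, 1], phi)   # length 4 -> 12
```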
AEL (Alon-Edmonds-Luby '96) constructions take this even further: now consider $d$ and $|\Sigma_0|$ to be constants. Say we have a bipartite spectral expander graph: let $G = (L \cup R, E)$ be a $d$-regular bipartite graph, where $L$ and $R$ are the two bipartition sets and $E$ is the set of edges, and assume $|L| = |R| = n$. For a parameter $\lambda \in (0, 1)$, $G$ is said to be a $\lambda$-spectral expander if $\forall X \subseteq L$, $Y \subseteq R$, $|E(X, Y) - \frac{d}{n} |X| |Y|| \leq \lambda d \sqrt{|X| |Y|}$ (this is the conclusion of the expander mixing lemma). Given a $d$-regular $\lambda$-spectral bipartite expander (we can choose this as needed, because such expander graphs have explicit constructions -- making them perfect for derandomization while still providing strong combinatorial/probabilistic properties via the expander mixing lemma), we can think of the outer code as a labeling of $L$ (a string in $\Sigma^n$), the inner code as a mapping from each left vertex to a labeling of its incident edges (inner codewords are in ${\Sigma_0}^d$), and the final AEL codewords as strings in ${\Sigma_0}^{dn}$, except that our last step is "folding" the edge labels at each right vertex (giving us $n$ symbols in ${\Sigma_0}^d$) and returning the final concatenated code. The rate is just the product of the outer and inner rates, but using the expander mixing lemma, we get a really nice property for the distance: $\delta_{AEL} > \delta_{in} - \frac{\lambda}{\delta_{out}}$, which is clearly better than $\delta_{AEL} \geq \delta_{out} \cdot \delta_{in}$. (We can also set $\lambda$ to be very small in terms of $\delta_{out}$: it's not difficult to prove that if $\lambda = 0$ then $d = n$, but we can make $\lambda > 0$ any small fixed constant with $d$ depending only on $\lambda$, not on $n$, so the alphabet size ${|\Sigma_0|}^d$ of our folded inner symbols doesn't blow up.)
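A sketch of the AEL shuffle-and-fold step under the assumptions above (a $d$-regular bipartite graph on $L = R = \{0, ..., n-1\}$, regular on both sides, given as left adjacency lists); `ael_fold` and its port-numbering convention are my own, and the outer encoding, inner encodings, and the explicit choice of expander are omitted.

```python
def ael_fold(inner_codewords, left_adj):
    """AEL's shuffle step: inner_codewords[u] is the length-d inner
    codeword sitting at left vertex u, and left_adj[u][j] is the right
    endpoint of u's j-th edge. The j-th symbol of u's codeword travels
    along that edge, and each right vertex folds the d symbols it
    receives into a single symbol of the big alphabet Sigma_0^d."""
    n = len(inner_codewords)
    d = len(left_adj[0])
    buckets = [[None] * d for _ in range(n)]
    slot = [0] * n          # next free port at each right vertex
    for u in range(n):
        for j in range(d):
            v = left_adj[u][j]
            buckets[v][slot[v]] = inner_codewords[u][j]
            slot[v] += 1
    # each right vertex received exactly d symbols by d-regularity
    return [tuple(b) for b in buckets]
```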
I also reconstructed the proofs of why AEL is optimal for unique decoding, as well as the distance-amplification argument for AEL: bounds, via the expander mixing lemma, on the number of edges between an error-inducing subset of $L$ and the subset of $R$ incident to it. In JMST '25 (insert link), the authors were able to generalize the unique-decoding proof for AEL codes to average-radius list-decodability, making AEL codes more robust. One research direction we started exploring was modifying the induction step/partition enumerations in this paper to get tighter bounds on average-radius list-decodability, but during weeks 6-7 we started working on a different problem that was more interesting to us.