It’s been almost a month since my last blog post and my excuse is that I’ve been quite busy! So as today marks the end of my first PhD research rotation with Dr. Netanel Raviv, this blog post will consist of my work during the rotation and some thoughts regarding it.
As I explained in my last blog post, my rotation with Dr. Raviv dealt with exploring whether there exists a better means of coding a perceptron, one with improved minimum distance and relative distance over a parity code. This can be formulated equivalently as proving the existence of an $\epsilon$-polarizing code: we pick a generator matrix at random and show that the resulting code is indeed $\epsilon$-polarizing, which is defined as follows: for every information word $x$ (with $w_H(\cdot)$ denoting Hamming weight),

$$w_H(x) < \frac{k}{2} \implies w_H(xG) < \left(\frac{1}{2} - \epsilon\right)n \qquad \text{and} \qquad w_H(x) > \frac{k}{2} \implies w_H(xG) > \left(\frac{1}{2} + \epsilon\right)n,$$
where $k$ is the size of the information word, $n$ is the size of the coded word, and $\epsilon$ determines how strictly heavy or light we want the resulting coded word to be.
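To make the definition concrete, here is a small Python sketch that brute-forces the property over all $2^k$ information words. It is only an illustration: the function name, the strict inequalities, and the choice to leave exactly-balanced words unconstrained are my own reading of the definition above.

```python
import itertools
import numpy as np

def is_polarizing(G, eps):
    """Brute-force check that a binary k x n generator matrix G is
    epsilon-polarizing: every light information word (weight < k/2) must
    encode to a strictly light codeword (weight < (1/2 - eps) * n), and
    every heavy one (weight > k/2) to a strictly heavy codeword."""
    k, n = G.shape
    for bits in itertools.product([0, 1], repeat=k):
        x = np.array(bits)
        w = x.sum()                  # Hamming weight of the information word
        wc = ((x @ G) % 2).sum()     # Hamming weight of the codeword x*G over GF(2)
        if w < k / 2 and not (wc < (0.5 - eps) * n):
            return False             # a light word mapped incorrectly
        if w > k / 2 and not (wc > (0.5 + eps) * n):
            return False             # a heavy word mapped incorrectly
    return True                      # exactly-balanced words are unconstrained here
```

Since the check enumerates every information word, it is exponential in $k$ and only practical for small information lengths.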
Our method of approaching this proof consisted of picking the generator matrix at random, computing the probability that a single weight-$w$ information word is mapped correctly, and then bounding the probability that all information words in $\mathbb{F}_2^k$ are mapped correctly using the Union Bound.
The generator matrix $G$ is a $k \times n$ matrix where each entry is a Bernoulli random variable with probability $p$ that $G_{ij} = 1$ and probability $1 - p$ that $G_{ij} = 0$. An information word $x$ is a $1 \times k$ vector where each entry is also binary, $x_i \in \{0, 1\}$. Each information word has a weight $w = w_H(x)$, which is the Hamming weight of the word.
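As an example, sampling such a random generator matrix and encoding a word might look like the following in Python (the sizes `k` and `n` and the parameter `p` are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, p = 8, 64, 0.5                        # arbitrary example parameters

G = (rng.random((k, n)) < p).astype(int)    # each entry is 1 with probability p
x = rng.integers(0, 2, size=k)              # a random binary information word
w = x.sum()                                 # its Hamming weight
c = (x @ G) % 2                             # the corresponding codeword over GF(2)
```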
My initial attempt at this proof was an utter failure, as I failed to take into consideration that we are working in $\mathbb{F}_2$, the Galois field of order 2. This means that arithmetic uses binary operations: addition of elements is actually done using the exclusive or (XOR) and multiplication using AND. Binary multiplication in $\mathbb{F}_2$ remains the same, but the main problem was with binary addition.
Addition (XOR):

| $\oplus$ | 0 | 1 |
|---|---|---|
| **0** | 0 | 1 |
| **1** | 1 | 0 |

Multiplication (AND):

| $\wedge$ | 0 | 1 |
|---|---|---|
| **0** | 0 | 0 |
| **1** | 0 | 1 |
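A quick Python sanity check of these tables, using the bitwise operators `^` (XOR) and `&` (AND):

```python
# Addition in GF(2) is the bitwise XOR (^); multiplication is the bitwise AND (&).
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} + {b} = {a ^ b}    {a} * {b} = {a & b}")
```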
Fixing this stupid error on my part, I found the probability that the $i^{th}$ entry of the coded word evaluates to 1. The $i^{th}$ entry of the coded word is

$$c_i = \bigoplus_{j=1}^{k} x_j G_{ji},$$

where $\oplus$ denotes XOR. The XOR of $w$ independent $\text{Bernoulli}(p)$ bits equals 1 with probability $\frac{1 - (1 - 2p)^w}{2}$, so the probability of a 1 at the $i^{th}$ entry is

$$\frac{1 - (1 - 2p)^w}{2}.$$

We will define this as $p^*$, such that the coded word $c = xG$ is a $1 \times n$ vector of $\text{Bernoulli}(p^*)$'s.
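As a sanity check on this formula, a quick Monte Carlo simulation in Python (with arbitrary example values for $p$ and $w$) should produce closely matching empirical and predicted values:

```python
import numpy as np

rng = np.random.default_rng(0)
p, w, trials = 0.3, 5, 200_000          # arbitrary example values

# XOR together w independent Bernoulli(p) bits, many times over.
bits = rng.random((trials, w)) < p
xor = bits.sum(axis=1) % 2

print(xor.mean())                        # empirical P(entry = 1)
print((1 - (1 - 2 * p) ** w) / 2)        # predicted p*
```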
Now, the coded word is strictly light when its Hamming weight is less than $\left(\frac{1}{2} - \epsilon\right)n$. Since the entries of $c$ depend on disjoint columns of $G$, they are independent, so this probability can be written as

$$\Pr\left[w_H(c) < \left(\tfrac{1}{2} - \epsilon\right)n\right] = \sum_{t=0}^{\left\lceil\left(\frac{1}{2} - \epsilon\right)n\right\rceil - 1} \binom{n}{t} (p^*)^t (1 - p^*)^{n - t},$$

where each entry of $c$ is a success with probability $p^*$. The probability of the coded word being strictly heavy is similar, with its appropriate bounds on the sum.
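Both tail probabilities can be computed numerically, e.g. with `scipy.stats.binom`; the values of `n`, `eps`, and `p_star` below are placeholders, and the handling of the strict inequalities follows my reading of the definition above:

```python
import math
from scipy.stats import binom

n, eps, p_star = 64, 0.1, 0.4     # placeholder values

# P(strictly light) = P(weight < (1/2 - eps) * n): a lower binomial tail.
p_light = binom.cdf(math.ceil((0.5 - eps) * n) - 1, n, p_star)

# P(strictly heavy) = P(weight > (1/2 + eps) * n): the matching upper tail.
p_heavy = binom.sf(math.floor((0.5 + eps) * n), n, p_star)

print(p_light, p_heavy)
```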
The “good” event for one information word $x$ is that $x$ is mapped correctly. Its complement (the “bad” event) is one where $x$ is mapped incorrectly. We will denote these events by $A_x$ and $A_x^c$, respectively. The overall “good” event consists of all information words in $\mathbb{F}_2^k$ being mapped correctly:

$$A = \bigcap_{x \in \mathbb{F}_2^k} A_x,$$

where we want to show $\Pr[A] > 0$, since a non-zero probability implies that such an encoding is possible with a randomly chosen generator matrix (code). However, since it is difficult to compute the probability of the intersection of these “good” events, we will instead compute its complement

$$A^c = \bigcup_{x \in \mathbb{F}_2^k} A_x^c$$

and show that $\Pr[A^c] < 1$. Here, applying the Union Bound, we can alternatively show that

$$\sum_{x \in \mathbb{F}_2^k} \Pr[A_x^c] < 1.$$
Expanding out $\sum_{x} \Pr[A_x^c]$, we get

$$\sum_{x \in \mathbb{F}_2^k} \Pr[A_x^c] = \sum_{w=0}^{k} \binom{k}{w} \Pr[A_w^c],$$

where $\Pr[A_w^c]$ is the probability that a single information word of weight $w$ is mapped incorrectly, i.e., one minus the appropriate binomial tail from above, and $\binom{k}{w}$ counts the information words of that weight.
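My MATLAB script is not reproduced here, but a Python sketch of the same kind of parameter sweep might look like this (the specific parameter settings, and the choice to skip exactly-balanced words, are my own assumptions):

```python
import math
from scipy.stats import binom

def union_bound(k, n, p, eps):
    """Union-bound estimate of P(some information word is mapped
    incorrectly) for a random Bernoulli(p) generator matrix."""
    total = 0.0
    for w in range(k + 1):
        if w == k / 2:
            continue                              # balanced words: no requirement assumed
        p_star = (1 - (1 - 2 * p) ** w) / 2       # per-entry probability of a 1
        if w < k / 2:                             # light word must land strictly light
            p_good = binom.cdf(math.ceil((0.5 - eps) * n) - 1, n, p_star)
        else:                                     # heavy word must land strictly heavy
            p_good = binom.sf(math.floor((0.5 + eps) * n), n, p_star)
        total += math.comb(k, w) * (1 - p_good)   # all C(k, w) words of weight w
    return total

# Sweep a few (arbitrary) parameter settings; any value below 1 would
# certify that an epsilon-polarizing code exists for those parameters.
for k, n, p, eps in [(8, 64, 0.5, 0.05), (10, 128, 0.3, 0.1), (6, 256, 0.5, 0.1)]:
    print(k, n, p, eps, union_bound(k, n, p, eps))
```

Any value below 1 would certify, via the probabilistic method, that an $\epsilon$-polarizing code exists for those parameters.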
I wrote a MATLAB script to test different values for the parameters, but the results showed that we in fact do not get any probabilities below 1 for the Union Bound sum. This is not a good sign, and adjusting the threshold $\epsilon$ for light/heavy categorization did not change this result either. Thus, my conclusion is that either the Union Bound is too weak or an $\epsilon$-polarizing code in fact does not exist for the parameters we considered.
I would have liked to continue investigating this disappointing result, but at this time, my rotation with Dr. Raviv has come to an end and I must prepare for my next rotation with Dr. Neal Patwari in the ESE department. Regarding my thoughts on this rotation, it was my first time dealing with such theory-heavy research, but I think I came to like this sort of work. It’s definitely a change from more practical and/or application-focused research, and it relies heavily on math and probability theory (especially so as this is ML research). At this point, I’m not entirely sure I would enjoy pursuing a theoretical research path for my PhD, as it somewhat limits the opportunities in industry after graduation. And at the moment, personally, I lean a bit more towards working in industry than staying in academia post-grad. Well, my next rotation with Dr. Patwari will be at the completely opposite end of the spectrum, where I will likely be developing an ML application, so at the end of my second rotation I will have a better sense of the kind of research I enjoy. 🙂