In the previous article, I talked about two nice things: modular arithmetic and Galois fields. I introduced them in anticipation of this day. Yes, because today I will talk about a cryptographic algorithm. An algorithm that is not only really cool, but also has an emotional value, since I first encountered it some time ago while discussing it with a colleague whom I greet here (hi Dylan) and who is no longer here. Don’t worry he’s not dead, he has “only”, to my great displeasure, changed company. The algorithm I will discuss is Shamir's Secret Sharing (abbreviated SSS, luckily there are three words otherwise try explaining it to readers), devised by Adi Shamir, who is a cryptography guru (the "S" in the RSA algorithm comes from him). Before starting, I strongly recommend, if you haven’t done so already, that you read here, the article on finite fields.

In Addition

I have already explained modular arithmetic and Galois fields, so my conscience is clear on that front. However, there is still one last tool missing in order to explain the SSS algorithm, and since I am a perfectionist, far be it from me to deprive you of everything you need. What we need now is Lagrange linear interpolation. Two notes on this:

  1. It bears his name, but he did not invent the method.
  2. He is a great ITALIAN mathematician and not French.

Let us resume. I know this is probably not your first problem in the morning, but what would you do if you wanted to determine a polynomial of degree \(n\) that passes through \(n+1\) given points:

$$ (x_0, \, y_0), \, (x_1, \, y_1), \, \dots, \, (x_n, \, y_n) $$

The problem may seem extremely complex, but in reality it is nothing abstruse. Let us look at a first solution method.

The "Naive" Addition

I do not want to start at full speed right away, so let us warm up first, otherwise we might get hurt, and look at a simple method. The goal of the problem is to find a polynomial of degree \(n\), right? So let us start by taking a generic one:

$$ y = a_0 + a_1x + a_2x^2 + \dots + a_nx^n $$

Formula 1. Generic Polynomial

Where \(x\) and \(y\) are generic coordinates, and \(a_i\) is a generic coefficient. What do we want from this polynomial? We want it to pass through \(n+1\) points. Let us start from the first point:

  • \((x_0, \, y_0)\): From this point we know the values of \(x\) and \(y\), which are precisely \(x_0\) and \(y_0\). Let us plug them into our formula.

    $$ y_0 = a_0 + a_1x_0 + a_2x_0^2 + \dots + a_nx_0^n $$

    What we want to do is compute the coefficients \(a_i\), and is this single formula enough to solve the problem? No. Because the \(a_i\) we are looking for are \(n+1\), and here we only have one equation. If there is one thing we learned in middle school by solving systems, it is that if we have \(m\) unknowns, we need \(m\) equations to determine them all. We have \(n+1\) \(a\)'s, but we also have \(n+1\) points, right? So we can write a second equality...

  • \((x_1, \, y_1)\): From this point we know the values of \(x\) and \(y\), which are precisely \(x_1\) and \(y_1\). Let us plug them into our formula.

    $$ y_1 = a_0 + a_1x_1 + a_2x_1^2 + \dots + a_nx_1^n $$

    I would say the pattern is quite clear. So let us skip straight to the last point.

  • \(...\)

  • \((x_n, \, y_n)\): From this point we know the values of \(x\) and \(y\), which are precisely \(x_n\) and \(y_n\). As usual, let us plug them into our formula.

    $$ y_n = a_0 + a_1x_n + a_2x_n^2 + \dots + a_nx_n^n $$

    At this point it is clear that we have \(n+1\) equations, and therefore we can be happy, since we can now compute the \(n+1\) unknowns \(a\).

What we obtained is a system of equations, that is, that set of equations which, once the unknown values are substituted, satisfy all the given equality conditions. Let us write it in a more mathematical form:

$$ \begin{cases} y_0 = a_0 + a_1 x_0 + a_2 x_0^2 + \dots + a_n x_0^n \\ y_1 = a_0 + a_1 x_1 + a_2 x_1^2 + \dots + a_n x_1^n \\ \vdots \\ y_n = a_0 + a_1 x_n + a_2 x_n^2 + \dots + a_n x_n^n \end{cases} $$

two paths are possible at this point: either we do it by hand, or we get clever and write the linear system in matrix form:

$$ V \cdot \vec{a} = \vec{y} $$

where:

  • \(V\) is a matrix called the Vandermonde matrix, and it is defined as follows:

    $$ V = \begin{bmatrix} 1 & x_0 & x_0^2 & \dots & x_0^n \\ 1 & x_1 & x_1^2 & \dots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \dots & x_n^n \end{bmatrix} $$

    I remind you that we already have all the elements of this matrix: they are the \(x\) coordinates (and their powers) through which we want the polynomial to pass.

  • \(\vec{a}\) is the vector of unknowns, that is, the polynomial coefficients we are looking for.

    $$ \vec{a} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} $$

  • \(\vec{y}\) is the vector of known terms, we know these as well since they come from the \(y\) coordinates through which we want the polynomial to pass.

    $$ \vec{y} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix} $$

By rewriting the system in full we have:

$$ \begin{bmatrix} 1 & x_0 & x_0^2 & \dots & x_0^n \\ 1 & x_1 & x_1^2 & \dots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \dots & x_n^n \end{bmatrix} \cdot \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix} $$

Formula 2. Matrix Form of Interpolation

Good, at this point there are two ways to solve the problem.

  1. The one that pisses off my professor of numerical calculus and complex analysis, which involves inverting \(V\):

    $$ \vec{a} = V^{-1} \cdot \vec{b} $$

  2. A smarter method, such as the Gauss method, LU factorization, and so on.

Now we will not look at either of them, because that would mean explaining what a determinant is, what the inverse of a matrix is, and what happens when it is not invertible (to put it very briefly: either there are no solutions or there are infinitely many), why it is not advisable to invert a matrix and to do a decomposition instead. Then I would also have to explain the decompositions. Too much stuff. Let us consider a small example to solve by hand, just to help you understand what we obtain.

Example

Let us assume we want to compute a polynomial of degree \(2\) that passes through the points:

$$ (x_0, \, y_0) = (0, \, 1) \\[6pt] (x_1, \, y_1) = (1, \, 2) \\[6pt] (x_2, \, y_2) = (2, \, 5) $$

The polynomial will have the form:

$$ y = a_0 + a_1 x + a_2 x^2 $$

Let us write our system of three equations, simply by substituting the points as shown previously:

$$ \begin{cases} 1 = a_0 + a_1 \cdot (0) + a_2 \cdot (0)^2 \\ 2 = a_0 + a_1 \cdot (1) + a_2 \cdot (1)^2 \\ 5 = a_0 + a_1 \cdot (2) + a_2 \cdot (2)^2 \end{cases} $$

from which:

$$ \begin{aligned} \begin{cases} a_0 = 1 \\ 1 = a_1 + a_2 \\ 4 = 2a_1 + 4a_2 \end{cases} \\[6pt] \Rightarrow a_1 = 0, \, a_2 = 1 \end{aligned} $$

We have found all the coefficients \(a_0, \, a_1, \, a_2\), so we can write our polynomial, which will be:

$$ \begin{aligned} y &= a_0 + a_1 x + a_2 x^2 \\[6pt] \Rightarrow y &= 1 + x^2 \end{aligned} $$

Let us look at its graphical behavior:

plot Figure 1. Plot of the Polynomial \(y = 1 + x^2\)

Look carefully at the graph to see where the red crosses are placed. It passes exactly through the required points.

The Lagrange Addition

Now that you know what we are trying to do, throw away the entire previous method. It was certainly useful because it allowed us to understand, in a fairly intuitive way, what the point was. But since there is another way to do it, and since you are dying to learn it, who am I to deprive you of it? The time has come for Lagrange interpolation, so let us start by defining our polynomial \(y\). This time we define it through a summation, in the following way:

$$ y = \sum_{i=0}^{n} y_i \cdot L_i(x) $$

Formula 3. Lagrange Interpolation

where the generic \(L_i(x)\) is a product defined as follows:

$$ L_i(x) = \prod_{j=0, \, j \neq i}^{n} \frac{x-x_j}{x_i-x_j} $$

Expanding it explicitly, we have that a generic polynomial has the form:

$$ \begin{aligned} y &= y_0 \cdot \frac{(x - x_1)(x - x_2)\cdots(x - x_n)} {(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)} \;+\; \\[6pt] &\quad+\; y_1 \cdot \frac{(x - x_0)(x - x_2)\cdots(x - x_n)} {(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} \;+\; \\[6pt] &\quad+\; \cdots \;+\; \\[6pt] &\quad+\; y_n \cdot \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})} {(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})} \end{aligned} $$

Here as well, it may look like something extremely complex, but in reality it is very simple. Let us see it with an example.

Example

Let us consider the same example as before and look for the polynomial that passes through the points:

$$ (x_0, \, y_0) = (0, \, 1) \\[6pt] (x_1, \, y_1) = (1, \, 2) \\[6pt] (x_2, \, y_2) = (2, \, 5) $$

Let us compute the required \(L_i\):

  • \(L_0\) uses the point \((x_0, \, y_0) = (0, \, 1)\):

    $$ \begin{aligned} L_0(x) &= \frac{x - x_1}{x_0 - x_1} \cdot \frac{x - x_2}{x_0 - x_2} \\ &= \frac{x - 1}{0 - 1} \cdot \frac{x - 2}{0 - 2} \\ &= \frac{x - 1}{-1} \cdot \frac{x - 2}{-2} \\[6pt] &= \frac{(x - 1)(x - 2)}{2} \end{aligned} $$

  • \(L_1\) uses the point \((x_1, \, y_1) = (1, \, 2)\):

    $$ \begin{aligned} L_1(x) &= \frac{x - x_0}{x_1 - x_0} \cdot \frac{x - x_2}{x_1 - x_2} \\ &= \frac{x - 0}{1 - 0} \cdot \frac{x - 2}{1 - 2} \\ &= x \cdot \frac{x - 2}{-1} \\[6pt] &= -x(x - 2) \end{aligned} $$

  • \(L_2\) uses the point \((x_2, \, y_2) = (2, \, 5)\):

    $$ \begin{aligned} L_2(x) &= \frac{x - x_0}{x_2 - x_0} \cdot \frac{x - x_1}{x_2 - x_1}\\ &= \frac{x - 0}{2 - 0} \cdot \frac{x - 1}{2 - 1}\\ &= \frac{x}{2}(x - 1)\\[6pt] &= \frac{x(x - 1)}{2} \end{aligned} $$

Note that we are not using the \(y\) values yet. They come into play exactly here:

$$ y = y_0 \cdot L_0(x) + y_1 \cdot L_1(x) + y_2 \cdot L_2(x) \\[10pt] \Rightarrow \; y = 1 \cdot \frac{(x - 1)(x - 2)}{2} \;+\; 2 \cdot (-x(x - 2)) \;+\; 5 \cdot \frac{x(x - 1)}{2} $$

By expanding this expression we obtain:

$$ \begin{aligned} y &= \frac{(x - 1)(x - 2)}{2} - 2x(x - 2) + 5 \cdot \frac{x(x - 1)}{2} \\[6pt] &= \frac{x^2 - 3x + 2}{2} - 2x^2 + 4x + \frac{5x^2 - 5x}{2} \\[6pt] &= \left( \frac{x^2 - 3x + 2}{2} + \frac{5x^2 - 5x}{2} \right) - 2x^2 + 4x \\[6pt] &= \frac{x^2 - 3x + 2 + 5x^2 - 5x}{2} - 2x^2 + 4x \\[6pt] &= \frac{6x^2 - 8x + 2}{2} - 2x^2 + 4x \\[6pt] &= 3x^2 - 4x + 1 - 2x^2 + 4x \\[10pt] &= x^2 + 1. \end{aligned} $$

And, as it turns out, this is the same result obtained previously. To conclude this paragraph, why should we prefer this method? Not so much for performance reasons, but because, for the use we will make of it, it is better suited to solving the problem.

It Is Time to Share

Now that we have all the necessary tools, we can finally get to the heart of today’s algorithm. The idea behind SSS is as simple as it is elegant: take a secret, hide it in the constant term of a carefully chosen polynomial, and distribute to the participants points that lie on that polynomial. Taken individually, those points reveal nothing, but when enough of them are combined, they allow us to reconstruct exactly the original polynomial via Lagrange interpolation, and therefore recover the constant term, i.e., the secret. In other words, mathematics becomes the mechanism of distributed trust: each participant holds a fragment of the information and only if there is a large enough group, it becomes possible to reveal the secret.

The steps are simple. We start by choosing two parameters:

  1. The number of participants \(s\), representing the number of points of the polynomial that we want to generate and distribute.
  2. The minimum threshold \(t\), representing the number of participants needed to reconstruct the secret. This value determines that we will use a polynomial of degree \(t-1\).

The relationship between these two variables is that the following constraint must be satisfied:

$$ t \leq s \leq Card(\mathbb{F}) - 1 $$

that is, the number of distributed points must be (obviously) greater than the number of participants needed to reconstruct the message, and it cannot be greater than the cardinality of the finite field from which the elements are taken (as I will show later), excluding the zero element. So, assuming we have \(s=3\) and \(t=2\), we are saying that we create and distribute \(s=3\) points, and to reconstruct the message it is necessary that at least \(t=2\) participants agree. The opposite, even intuitively, makes no sense: how can we create and distribute \(s=2\) but require the agreement of \(t=3\) people? Or, how can we distribute more elements than are available in the field? Obviously you cannot, this is not Italian public debt. This is mathematics.

Once the values of \(s\) and \(t\) have been established, we build our polynomial:

$$ f(x) = S + a_1x + a_2x^2 + \dots + a_{t-1}x^{t-1} $$

Formula 4. Sharing Polynomial

This polynomial is very similar to the one defined here, but there are two small differences:

  • \(S\) is the secret to hide in the polynomial
  • the \(a_i\) are coefficients taken from the Galois field

The polynomial built this way must never be revealed in full, each participant receives only a pair \((x_i, \, f(x_i))\), that is, the evaluation of the polynomial at a point. Taken individually, this value contains no information about the secret, nor about the other coefficients of the polynomial. The secret is reconstructed only when at least \(t\) participants put their points together. To reconstruct \(S\), it is enough to evaluate the polynomial \(f\) at the point \(0\), by computing \(f(0)\). In fact, precisely because of how the polynomial \(f(x)\) is constructed, by setting \(x=0\) all terms (except \(S\)) become zero. Therefore, what we know is:

  • The secret is at point \(0\) of the polynomial \(f(x)\).
  • We do not have access to the polynomial \(f(x)\).

In this context, Lagrange interpolation comes into play. It allows us to extract \(S\) (that is, the polynomial evaluated at \(0\)) without knowing the polynomial itself, but only the points that lie on it. In particular, what one does is compute:

$$ S = f(0) = \sum_{i \in I} y_i \cdot L_i(0) \\[10pt] L_i(0) = \prod_{j=0, \, j \neq i}^{n} \frac{0-x_j}{x_i-x_j} $$

Formula 5. Decryption Formulas

With \(I\) being the set of shares (and later we will see what those are). Up to this point, that is all, and if you think about it, it is not particularly complicated. But that is the beauty of cryptography: using simple tools that are terribly powerful.

Enough chatter, let us look at an example.

Encryption Example

I could use a simple Galois field such as \(\mathbb{F}_{11}\). Since it is composed only of numbers, given that \(p\) is a prime and has power \(1\), performing addition and multiplication operations is simple. Too simple. So why not make life more complicated? Why not use a field \(\mathbb{F}_{2^3}\) where the coefficients are polynomials? And above all, what would I be here for otherwise? Let us take a look.

Let us assume we have a dealer, that is, a key distributor, who uses the field \(\mathbb{F}_{2^3}\) with irreducible polynomial \(\alpha^3 + \alpha^2 + 1\). With this field we can hide at most 3 bits of information. If you have something larger to hide, either use larger fields or split the information into batches. I will only explain how to do it. In any case, as we know from here, the elements of the field in question are:

$$ \{ 0, \, 1, \, \alpha, \, \alpha + 1, \, \alpha^2, \, \alpha^2 + 1, \, \alpha^2 + \alpha, \, \alpha^2 + \alpha + 1 \} $$

Let us further assume that, for some reason, the dealer wants to hide the number of blasphemies I produce in the span of 1 minute, namely 6. Knowing that \(S\) and the coefficients \(a_i\) must belong to the field, the first thing the dealer does is convert \(6\) into a value of the field (note that I chose the number 6 because it is fully representable with 3 bits). Therefore we have:

$$ 6_{10} = (\alpha^2 + \alpha)_{\mathbb{F}_{2^3}} $$

If you are curious to know how I did it, read here. Now the dealer has to choose how many keys (or shares, which we mentioned earlier) to create. Let us assume they want to generate \(s=4\), and that they choose a minimum threshold for reconstructing the information \(t=3\) (note that the field \(\mathbb{F}_{2^3}\) has \(8\) elements and we distribute \(4\) of them, so everything is legitimate). The sharing polynomial will have the form:

$$ f(x) = S + a_1x + a_2x^2 $$

where \(S\) is the information to hide, namely

$$ S = \alpha^2 + \alpha $$

and \(a_1\) and \(a_2\) are field coefficients chosen at random. Let us assume:

$$ \begin{aligned} a_1 &= \alpha + 1 \\[6pt] a_2 &= \alpha^2 + 1 \end{aligned} $$

and therefore the polynomial is:

$$ f(x) = \alpha^2 + \alpha + (\alpha + 1)x + (\alpha^2 + 1)x^2 $$

where:

  • \(\alpha\) is the symbol used to represent the elements of the field \(\mathbb{F}_{2^3}\).
  • \(x\) is the variable of the sharing polynomial on which the evaluation will be performed.

Now that we have fixed the sharing polynomial, the dealer must generate the four shares. So what they do is choose four distinct \(x\) values and evaluate the function \(f(x)\) at the chosen points. Do you remember that in this section we imposed points in the plane through which the polynomial had to pass? Here it is exactly the same thing, but the domain is the Galois field excluding the point \(0\) (because that is the one that hides the information). Since I am as lazy as a sloth, and whoever knows me knows I am not exaggerating given that I often do not even take care of my primary needs (but then I go to the gym, go figure...), let us choose points that do not make the calculations (too) complicated:

$$ x_i = 1, \; \alpha, \; \alpha^2, \; 1 + \alpha $$

and let us start creating the shares to distribute:

  • Share 1:

    $$ \begin{align} (x_1, \, f(x_1)) &= (1, \, f(1)) \\[6pt] f(1) &= \alpha^2 + \alpha + (\alpha + 1)\cdot 1 + (\alpha^2 + 1)\cdot 1^2 \\[6pt] &= 2\alpha^2 + \alpha + 2 \\[10pt] \Rightarrow (1, \, f(1)) &= (1, \, 0) \end{align} $$

  • Share 2:

    $$ \begin{align} (x_2, \, f(x_2)) &= (\alpha, \, f(\alpha)) \\[6pt] f(\alpha) &= \alpha^2 + \alpha + (\alpha + 1) \cdot \alpha + (\alpha^2 + 1) \cdot \alpha^2 \\[6pt] &= \alpha^4 + 3\alpha^2 + 2\alpha \\[10pt] \Rightarrow (\alpha, \, f(\alpha)) &= (\alpha, \, \alpha + 1) \end{align} $$

  • Share 3:

    $$ \begin{align} (x_3, \, f(x_3)) &= (\alpha^2, \, f(\alpha^2)) \\[6pt] f(\alpha^2) &= \alpha^2 + \alpha + (\alpha + 1) \cdot \alpha^2 + (\alpha^2 + 1) \cdot \alpha^4 \\[6pt] &= \alpha^6 + \alpha^4 + \alpha^3 + 2\alpha^2 + \alpha \\[10pt] \Rightarrow (\alpha^2, \, f(\alpha^2)) &= (\alpha^2, \, \alpha) \end{align} $$

  • Share 4:

    $$ \begin{align} (x_4, \, f(x_4)) &= (1 + \alpha, \, f(1 + \alpha)) \\[6pt] f(1 + \alpha) &= \alpha^2 + \alpha + (\alpha + 1) \cdot (1 + \alpha) + (\alpha^2 + 1) \cdot (1 + \alpha)^2 \\[6pt] &= \alpha^3 + \alpha \\[10pt] \Rightarrow (1 + \alpha, \, f(1 + \alpha)) &= (1 + \alpha, \, \alpha^2 + 1) \end{align} $$

In expanding the polynomials, the cyclic property of powers has been applied. If you want to know more, you just need to look here. Let us get back to it. At this point the dealer is done. They have chosen what they needed to choose and generated what they needed to generate. They jealously keep the sharing polynomial, and distribute the shares (each person receives one and only one share) and the field that was used. Let us assume the dealer distributes the shares to Ludwig, Pluto, Donald, and Mickey, claiming that the top-secret information is the code to access Duckstein’s files (and who knows why the password is my number of blasphemies per minute) and discover the truth about the relationships between Mr. T. (president of Duckburg) and Mr. B. (any reference to facts and/or persons is purely coincidental). The dealer tells them that no one, alone, can access the files. It is necessary that at least \(3\) people agree.

Decryption Example

Since no one wants to undermine democracy the information is not used, after all, there is not much time left in the term. Then, however, T. organizes an assault on Duckburg’s town hall to stay in power and, as they say, live by the sword, die by the sword. The four start arguing among themselves and Pluto, who is a dog, is a fervent supporter of the 'Return Duckburg to Greatness' movement and does not want to reveal anything. Fortunately, Ludwig, Donald, and Mickey, who do not live off other people’s ignorance, agree to reveal the secrets that will open the eyes of Duckburg’s inhabitants and, hopefully, kick T. out. So they proceed as follows. Let us assume that three of them have the shares \((x_1, \, f(x_1))\), \((x_2, \, f(x_2))\), and \((x_4, \, f(x_4))\). They also know the finite field used:

$$ \mathbb{F}_{2^3} \pmod{\alpha^3 + \alpha^2 + 1} $$

From the decryption formulas, we know that we must compute

$$ L_1(0), \; L_2(0), \; L_4(0) $$

and therefore, given the shares:

  • \(L_1(0)\):

    $$ \begin{aligned} L_1(0) &= \frac{0 - x_2}{x_1 - x_2} \cdot \frac{0 - x_4}{x_1 - x_4}\\ &= \frac{0 - \alpha}{1 - \alpha} \cdot \frac{0 - (1 + \alpha)}{1 - (1 + \alpha)}\\[6pt] &= 1 \end{aligned} $$

  • \(L_2(0)\):

    $$ \begin{aligned} L_2(0) &= \frac{0 - x_1}{x_2 - x_1} \cdot \frac{0 - x_4}{x_2 - x_4}\\ &= \frac{0 - 1}{\alpha - 1} \cdot \frac{0 - (1 + \alpha)}{\alpha - (1 + \alpha)}\\[6pt] &= 1 \end{aligned} $$

  • \(L_4(0)\):

    $$ \begin{aligned} L_4(0) &= \frac{0 - x_1}{x_4 - x_1} \cdot \frac{0 - x_2}{x_4 - x_2}\\ &= \frac{0 - 1}{(1 + \alpha) - 1} \cdot \frac{0 - \alpha}{(1 + \alpha) - \alpha}\\[6pt] &= 1 \end{aligned} $$

I would like to point out that the values of the \(L_i\) are always \(1\) because I chose an extremely small field. In any case, we can now compute our \(S\). In fact, given the \(y_i\) which are nothing but the \(f(x_i)\) values of the shares, we apply again the little decryption formula:

$$ \begin{aligned} S = f(0) &= f(x_1) \cdot L_1(0) + f(x_2) \cdot L_2(0) + f(x_4) \cdot L_4(0) \\[6pt] &= 0 \cdot 1 + (\alpha + 1) \cdot 1 + (\alpha^2 + 1) \cdot 1 \\[10pt] &= \alpha^2 + \alpha \end{aligned} $$

So the secret is \(S=\alpha^2 + \alpha\) and, knowing that we are in the field \(\mathbb{F}_{2^3} \pmod{\alpha^3 + \alpha^2 + 1}\), we take its integer value using the now overused little table:

$$ (\alpha^2 + \alpha)_{\mathbb{F}_{2^3}} = (110)_{2} = 6_{10} $$

So now Ludwig, Donald, and Mickey together, and only together, not only know my blasphemy rate per minute but can access Duckstein’s files and bring Duckburg back to reason. Try it and you will see: use any subset of 3 elements and you will see that it works. Not one more (in the sense that one more is not necessary), not one less (in the sense that if you have just one less, you cannot reconstruct the secret correctly). This is thanks to the interpolation property that says that, to uniquely define a polynomial of degree \(k\), \(k + 1\) distinct points on the x-axis are necessary.

Considerations

The example I presented, mixing political satire, mathematics, and poetic license, is very simple. To recover the secret, a very trivial brute-force attack would be enough, since each generic \(L_i\) lies within the field \(\mathbb{F}_{2^3}\), which contains 8 elements. Since by design three shares are required, the number of attempts to be made is equal to all possible distinct triplets in a set of 8 elements. Using the formula of the binomial coefficient, we have:

$$ \binom{8}{3} = \frac{8!}{3!(8-3)!} = 56 $$

we have 56 possible triplets. Absolutely feasible. But let us look at the general case:

$$ \binom{n}{k} = \frac{n!}{k!(n-k)!} $$

From here we can proceed in two ways. Either we use Stirling’s approximation or we go with brutal simplifications. Since we are all tired, let us proceed with the latter.

By definition, the factorial is:

$$ n! = n \cdot (n-1) \cdot (n-2) \cdots (n-k+1) \cdot (n-k) \cdot (n-k-1) \cdot (n-k-2) \cdots 1 $$

Applying the definition recursively, we can write that:

$$ (n-k) \cdot (n-k-1) \cdot (n-k-2) \cdots 1 = (n-k)! $$

from which

$$ n! = n \cdot (n-1) \cdot (n-2) \cdots (n-k)! $$

Using this in the initial formula we have:

$$ \begin{aligned} \frac{n!}{k!(n-k)!} &= \frac{n \cdot (n-1) \cdot (n-2) \cdots (n-k+1) \cdot (n-k)!}{k!(n-k)!} \\[6pt] &= \frac{n \cdot (n-1) \cdot (n-2) \cdots (n-k+1)}{k!} \end{aligned} $$

By grouping by \(n\), it is easy to obtain:

$$ \frac{n \cdot (n-1) \cdot (n-2) \cdots (n-k+1)}{k!} = \frac{n^k}{k!} \cdot \left( 1 - \frac{1}{n} \right) \cdot \left( 1 - \frac{2}{n} \right) \cdots \left( 1 - \frac{k - 1}{n} \right) $$

If we consider \(n \to \infty\), what remains is \(\frac{n^k}{k!}\) and therefore, since \(k\) is negligible if we consider it finite, we have that the computational complexity of testing all possible values is:

$$ O(n^k) $$

where \(n\) is the cardinality of the finite field and \(k\) is the minimum threshold to reconstruct the information, and this is a monstrous complexity. Now consider that the fields used are of the type \(\mathbb{F}_{2^{256}}\), which contain about \(\sim 10^{77}\) elements. According to scientific estimates, the number of atoms in the universe is on the order of \(\sim 10^{80}\).

To Wrap Up

Prefacing that I liked all the articles I have written (otherwise I simply would not have written them, obviously), among them all this one was the most fun to write. Both because of the emotional value of its discovery, and because, come on, it is really awesome. I liked it so much that I even wrote a repo about it. You can find it here. You will forgive me if I did not implement Galois fields raw; they are so beautiful and so dear to me, but implementing polynomial reduction from scratch, no thanks at all. Luckily there is a library (galois, look at that creativity) that has implemented much of the dirty work. In any case, I want to conclude the article by inviting you to read Shamir’s original article, because it is very interesting. That said, I am going to share (in pieces, now that I can) my secrets with someone.

Until next time.


References

Published

Category

Encryption & Security

Tags

Contacts