Bulletproofs

In Layman's terms

Before we start, let's recall our goal with rangeproofs: prove that the amount that we are sending lies between $0$ and $2^{64}-1$ atomic units (piconeros) (and consequently it is not negative as well).

Explaining Bulletproofs (BP) in layman's terms is not an easy task so let's do justice to the saying "a picture is worth a thousand words" to get the motivations to do it. Below you will find an illustration of a standard one input and two outputs transaction using confidential transactions versus bulletproofs.

Fig. 1 - Comparison between Borromean and Bulletproof rangeproofs.

As you remember from Borromean signatures, we had to store one commitment $C_i$ for each bit $i$ of the amount, but what if we could prove that we know all the $C_i$ values without actually storing all of them? This is the main idea of Bulletproofs. We will do some math tricks to prove that we know a very long vector without actually storing it.

Math behind

Before I start, I would like to emphasize that it is only my understanding of how Bulletproofs work and it does not cover all its aspects and it may not have the depth that you are looking for. There are actually much better materials available, from which I built my understanding of BP. Please look at the Resources page to explore them. With that said, let's start just another text from someone trying to introduce BP.

If we look at the "ecdhInfo" of a transaction, we will see that the amount information is public stored at the blockchain (not explicitly but as a commitment). Let's start from there. Suppose we have a transaction with commitment $C = xG + vH$ where $x$ is the blinding factor and $v$ is the amount. Our objective with Bulletproofs (a form of rangeproofs) is to prove, in a very compact way, without revealing $v$, that it lies between $0$ and $2^{n}-1$ where $n=64$ in Monero.

Let's start structuring the problem in math terms. We want to prove that $0 < v < 2^{n}-1$ and in order to do so, it suffices to prove that the following three statements are true:

  1. $v ={\langle {\mathbf{a}_{L}}, {\mathbf{2}^{n}} \rangle}$ , where $\textbf{a}_L$ is the binary vector representing $v$ and $\textbf{2}^n$ is the power of two vector where $n=64$
  2. $\textbf{a}_{R} = \textbf{a}_{L} - \textbf{1}^{n}$
  3. $\textbf{a}_{L} \circ \textbf{a}_{R} = \textbf{0}^{n}$
For example, if $v = 5$ and $n=3$, we have the range of possible values going from 0 to $2^3 -1 = 7$ and the above equations can be easily checked for this case:
  • $v = 5$
  • $\textbf{2}^n = [2^0,2^1,2^2] = [1,2,4] $
  • $\textbf{a}_L = [1,0,1]$
  • $\textbf{a}_R = [0,-1,0]$

We observe that if the vector $\textbf{a}_L$ is not made of 0s and 1s, then the product $\textbf{a}_{L} \circ \textbf{a}_{R}$ won't be 0. Now we just need to find a way to combine these three statements into one so we can simply verify whether that new equation created holds. Let's start by slightly modifying 2) and 3) with the goal of letting it as a dot product that is equal to 0.

We will make use of the Schwartz-Zippel lemma which roughly states that the probability of finding a root for a polynome of order $n$ between all the possible numbers $N$ is roughly $n/N$. As we are playing with modular arithmetics, $N$ is pretty big and finding a root is very unlikely unless all the coefficients are zero. (For a random $y$ different from 0 and given the polynome $P(y) = a_0+a_1y^1+a_2y^2+...+a_ny^n$, $P(y)=0$ if $a_0=a_1=a_2=...=a_n=0$ with a negligible probability otherwise.)

  1. $v ={\langle {\mathbf{a}_{L}}, {\mathbf{2}^{n}} \rangle}$ , where $\textbf{a}_L$ is the binary vector representing $v$ and $\textbf{2}^n$ is the power of two vector where $n=64$
  2. $\textbf{a}_{R} = \textbf{a}_{L} - \textbf{1}^{n} \qquad \longrightarrow \qquad {\langle {\textbf{a}_{L} - \textbf{1}^n - \textbf{a}_{R}, \textbf{y}^n} \rangle} = 0 $
  3. $\textbf{a}_{L} \circ \textbf{a}_{R} = \textbf{0}^{n} \qquad \longrightarrow \qquad {\langle {\textbf{a}_{L} \circ \textbf{a}_{R}, \textbf{y}^n} \rangle} = 0 $
We did this manipulation to have all the terms equal to zero and as dot products. Now we can manipulate a bit this terms to find an equation that would only be valid if all these equations are true. Let's use again the polynome trick to represent this single equation as it is a valid assumption to make as we want to have linearly indepent (LI) terms. We can multiply each term by another random scalar, called $z$, and group them into one equation. As we want to have LI terms again and we have only three terms, we need a polynome of order 2. Let's do it:
  1. $vz^2 ={\langle {\mathbf{a}_{L}}, {\mathbf{2}^{n}} \rangle} z^2$
  2. ${\langle {\textbf{a}_{L} - \textbf{1}^n - \textbf{a}_{R}, \textbf{y}^n} \rangle} z = 0 z = 0 $
  3. ${\langle {\textbf{a}_{L} \circ \textbf{a}_{R}, \textbf{y}^n} \rangle} = 0 $

Now let's rewrite the terms as a polynomial equation: $vz^2 + 0 + 0 ={\langle {\mathbf{a}_{L}}, {\mathbf{2}^{n}} \rangle} z^2 + {\langle {\textbf{a}_{L} - \textbf{1}^n - \textbf{a}_{R}, \textbf{y}^n} \rangle} z + {\langle {\textbf{a}_{L} \circ \textbf{a}_{R}, \textbf{y}^n} \rangle}$

Applying some math manipulations, it is possible to achieve the independent term as we want (see Dalek's equations):

Finally, we want to combine these terms into a single inner product. Our goal is to rearrange the inner product above so that terms involving \({\mathbf{a}}_{L}\) appear only on the left-hand side, terms involving \({\mathbf{a}}_{R}\) appear only on the right-hand side, and non-secret terms (which the verifier can compute on its own) are factored out into a new term \(\delta(y, z) \).

First, break the statement into simpler terms, then rearrange: \[ \begin{aligned} z^2 v &= z^2 {\langle {\mathbf{a}}_{L}, {\mathbf{2}}^n \rangle} + z {\langle {\mathbf{a}}_{L}, {\mathbf{y}}^n \rangle} - z {\langle {\mathbf{a}}_{R}, {\mathbf{y}}^n \rangle} - z {\langle {\mathbf{1}}, {\mathbf{y}}^n \rangle} + {\langle {\mathbf{a}}_{L}, {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} \\ z^{2} v + z {\langle {\mathbf{1}}, {\mathbf{y}}^{n} \rangle} &= z^2 {\langle {\mathbf{a}}_{L}, {\mathbf{2}}^n \rangle} + z {\langle {\mathbf{a}}_{L}, {\mathbf{y}}^n \rangle} - z {\langle {\mathbf{1}} , {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} + {\langle {\mathbf{a}}_{L}, {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} \\ z^{2} v + z {\langle {\mathbf{1}}, {\mathbf{y}}^{n} \rangle} &= {\langle {\mathbf{a}}_{L}, z^{2} {\mathbf{2}}^n \rangle} + {\langle {\mathbf{a}}_{L}, z {\mathbf{y}}^n \rangle} + {\langle -z {\mathbf{1}} , {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} + {\langle {\mathbf{a}}_{L}, {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} \\ z^{2} v + z {\langle {\mathbf{1}}, {\mathbf{y}}^{n} \rangle} &= {\langle {\mathbf{a}}_{L}, z^{2} {\mathbf{2}}^n + z {\mathbf{y}}^{n} + {\mathbf{a}}_{R} \circ {\mathbf{y}}^{n} \rangle} + {\langle -z {\mathbf{1}} , {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} \end{aligned} \] To combine the terms on the right-hand side, add \({\langle -z {\mathbf{1}}, z^2 {\mathbf{2}}^n + z {\mathbf{y}}^n \rangle}\) to each side, then simplify: \[ \begin{aligned} z^{2} v + z {\langle {\mathbf{1}}, {\mathbf{y}}^{n} \rangle} - {\langle z {\mathbf{1}}, z^2 {\mathbf{2}}^n + z {\mathbf{y}}^n \rangle} &= {\langle {\mathbf{a}}_{L}, z^{2} {\mathbf{2}}^n + z {\mathbf{y}}^{n} + {\mathbf{a}}_{R} \circ {\mathbf{y}}^{n} \rangle} \\ &+ {\langle -z {\mathbf{1}} , z^2 {\mathbf{2}}^n + z {\mathbf{y}}^n + {\mathbf{a}}_{R} \circ {\mathbf{y}}^n \rangle} \\ z^2 v + (z - z^2) {\langle {\mathbf{1}}, {\mathbf{y}}^n \rangle} - z^3 {\langle {\mathbf{1}}, {\mathbf{2}}^n \rangle} &= {\langle {\mathbf{a}}_{L} - z{\mathbf{1}}, z^{2} {\mathbf{2}}^n + z {\mathbf{y}}^{n} + {\mathbf{a}}_{R} \circ {\mathbf{y}}^{n} \rangle} \end{aligned} \] Combining all non-secret terms outside the inner product \[ \delta(y,z) = (z - z^{2}) {\langle {\mathbf{1}}, {\mathbf{y}}^{n} \rangle} - z^{3} {\langle {\mathbf{1}}, {\mathbf{2}}^{n} \rangle}, \] we finally obtain \[ z^{2}v + \delta(y,z) = {\langle {\mathbf{a}}_{L} - z {\mathbf{1}}, {\mathbf{y}}^{n} \circ ({\mathbf{a}}_{R} + z {\mathbf{1}}) + z^{2} {\mathbf{2}}^{n} \rangle}. \] This is equivalent to the original inner-product equation, but has a single inner product with \({\mathbf{a}}_{L}\) on the left, \({\mathbf{a}}_{R}\) on the right, and non-secret terms factored out. Let's call the left-hand side of the single inner product equation "unblinded" \({\mathbf{l}(x)}\) and the right-hand side "unblinded" \({\mathbf{r}(x)}\), such that \[ \begin{aligned} \text{unblinded } \mathbf{l}(x) &= {\mathbf{a}}_{L} - z {\mathbf{1}} \\ \text{unblinded } \mathbf{r}(x) &= {\mathbf{y}}^{n} \circ ({\mathbf{a}}_{R} + z {\mathbf{1}}) + z^{2} {\mathbf{2}}^{n} \\ z^{2}v + \delta(y,z) &= {\langle \text{unblinded } \mathbf{l}(x), \text{unblinded } \mathbf{r}(x) \rangle} \end{aligned} \]

After some math and rearranging the terms in order to have a dot product between terms only dependent on $\textbf{a}_L$ and $\textbf{a}_R$, we achieve the following equation: $z^{2}v + \delta(y,z) = {\langle {\mathbf{a}}_{L} - z {\mathbf{1}}, {\mathbf{y}}^{n} \circ ({\mathbf{a}}_{R} + z {\mathbf{1}}) + z^{2} {\mathbf{2}}^{n} \rangle}$

Let's now name the left side of the equation $\text{unblinded }\textbf{l}(\textbf{a}_L,z)$ and observe that it is only a function of $\textbf{a}_L$ while the right side is now named $\text{unblinded }\textbf{r}(\textbf{a}_R,y,z)$ and is a function of $\textbf{a}_R$ (the result of the inner product is a scalar).

$z^{2}v + \delta(y,z) = {\langle \text{unblinded } \mathbf{l}, \text{unblinded } \mathbf{r} \rangle}$

We name these terms unblinded because we cannot send them to the verifier without exposing $\textbf{a}_L$ and $\textbf{a}_R$. If we did so, we would reveal the value $v$ that we are trying to hide. So, in order to make it zero-knowledge, we need to blind these terms and make commitments to them. Let's start by rewritting the left and right side of our dot product with a blinding vector defined as $\textbf{l}(x)$ and $\textbf{r}(x)$ as:

\[ {\mathbf{l}}(x) = {\mathbf{l}}_{0} + {\mathbf{l}}_{1} x = ({\mathbf{a}}_{L} + {\mathbf{s}}_{L} x) - z {\mathbf{1}} \\ \] \[ {\mathbf{r}}(x) = {\mathbf{r}}_{0} + {\mathbf{r}}_{1} x = {\mathbf{y}}^{n} \circ \left( ({\mathbf{a}}_{R} + {\mathbf{s}}_{R} x\right) + z {\mathbf{1}}) + z^{2} {\mathbf{2}}^{n} \]

Now we have our inner product as one equation and if we can prove that it is true than we would achieve our goal of proving that $v$ lies between $0$ and $2^{64} - 1$. Now the tricky and ingenious part comes to play. Our inner product is a function of $a_L$ and $a_R$ which are pretty big vectors (128 scalars each) and we don't want to store them in the blockchain, otherwise we would not get any advantage in relation to the previous Sigma protocol or Confidential Transactions that we had before. The idea here is to find a way to store the proofs in a manner that it scales under $O(n)$ in space (which is the case of Confidential Transactions). It turns out that it is possible to do it in $O(2log_2 n)$, which is a great improvement! For instance, for a vector of n=128 scalars, we could store only 14 scalars.

Let's see how Monero specifically does it by first considering the general ideas and equations and then a small example to understand it.

We will use the method 'Divide and Conquer' so at each step we will reduce our problem into a similar problem but with a smaller size until we reach a trivial solution and we can propagate back the final result to solve the big problem.

So, in the beginning, we have our vectors $\textbf{l}$ and $\textbf{r}$ and its inner product $t={\langle \mathbf{l}, \mathbf{r} \rangle}$.

Let's define $P$ as a function of $\textbf{l}$ , $\textbf{r}$ and $t={\langle \mathbf{l}, \mathbf{r} \rangle}$. Where $U$ is a point known by the verifier and prover.

\[ P = \textbf{a} \textbf{G} + \textbf{b} \textbf{H} + {\langle \mathbf{a}, \mathbf{b} \rangle}U \]

Notice that this is the main equation that we want to prove its validity. If this equation holds, then we know that the creator knows $\textbf{a}$ and $\textbf{b}$ and its inner product. (Moreover, we also know that $\textbf{a}_L$ is made of bits and $v$ lies on the desired range). Our goal will be to store less data than $\textbf{a}$ and $\textbf{b}$ themselves while letting the verifier know that this equation is true. Only the value P will be stored in the blockchain.

Now let's define the variables that will be in the recursion, where $x$ is a scalar that depends on $\textbf{L}$ and $\textbf{R}$ and the sub-indices $l$ and $h$ represent the lower and higher part of the vector when split into two parts, respectively:

\[ c_L = {\langle \textbf{a}_l , \textbf{b}_h \rangle} \] \[ c_R = {\langle \textbf{a}_h , \textbf{b}_l \rangle} \] \[ L = \textbf{a}_l \textbf{G}_h + \textbf{b}_h \textbf{H}_l + c_L U \] \[ R = \textbf{a}_h \textbf{G}_l + \textbf{b}_l \textbf{H}_h + c_R U \] \[ \textbf{G}' = \textbf{G}_l x^{-1} + \textbf{G}_h x \] \[ \textbf{H}' = \textbf{H}_l x + \textbf{H}_h x^{-1} \] \[ \textbf{a}' = \textbf{a}_l x + \textbf{a}_h x^{-1} \] \[ \textbf{b}' = \textbf{b}_l x^{-1} + \textbf{b}_h x \] Notice again that we are trying to find a way to store as less information as possible in the blockchain while maintaining a solid mathematical proof of the inner product $c={\langle \mathbf{a}, \mathbf{b} \rangle}$, where $\textbf{a}=\textbf{l}$, $\textbf{b}=\textbf{r}$ and $c=t$ initially. According to the above definitions, let's see how we can express $c'$ and $P'$: \[ c' = {\langle \mathbf{a}', \mathbf{b}' \rangle} = {\langle \textbf{a}_l x + \textbf{a}_h x^{-1}, \textbf{b}_l x^{-1} + \textbf{b}_h x \rangle} = \underbrace{ {\langle \textbf{a}_l, \textbf{b}_l \rangle} + {\langle \textbf{a}_h, \textbf{b}_h \rangle} }_{ {\langle \mathbf{a}, \mathbf{b} \rangle} } + \underbrace{ {\langle \textbf{a}_l , \textbf{b}_h \rangle} }_{c_L} x^2 + \underbrace{ {\langle \textbf{a}_h , \textbf{b}_l \rangle} }_{c_R} x^{-2} = c+ c_L x^2 + c_R x^{-2} \] \[ P' = \textbf{a}' \textbf{G}' + \textbf{b}' \textbf{H}' + {\langle \mathbf{a}', \mathbf{b}' \rangle}U \] Let's start by expressing $\textbf{a}' \textbf{G}'$ , $\textbf{b}' \textbf{H}'$ and ${\langle \mathbf{a}', \mathbf{b}' \rangle}$ separately. \[ \textbf{a}' \textbf{G}' = (\textbf{a}_l x + \textbf{a}_h x^{-1})(\textbf{G}_l x^{-1} + \textbf{G}_h x) = \textbf{a}_l \textbf{G}_l + \textbf{a}_h \textbf{G}_h + \textbf{a}_l \textbf{G}_h x^2 + \textbf{a}_h \textbf{G}_l x^{-2} \] \[ \textbf{b}' \textbf{H}' = (\textbf{b}_l x^{-1} + \textbf{b}_h x)(\textbf{H}_l x + \textbf{H}_h x^{-1}) = \textbf{b}_l \textbf{H}_l + \textbf{b}_h \textbf{H}_h + \textbf{b}_l \textbf{H}_h x^{-2} + \textbf{b}_h \textbf{H}_l x^2 \] \[ {\langle \mathbf{a}', \mathbf{b}' \rangle}U = (c+ c_L x^2 + c_R x^{-2}) U \] Now let's try to express everything together to see how $P'$ looks. \[ P' = \underbrace{\textbf{a}_l \textbf{G}_l + \textbf{a}_h \textbf{G}_h }_{\textbf{a} \textbf{G} } + \underbrace{ \textbf{b}_l \textbf{H}_l + \textbf{b}_h \textbf{H}_h }_{ \textbf{b} \textbf{H} } + \underbrace{ \textbf{a}_l \textbf{G}_h x^2 + \textbf{b}_h \textbf{H}_l x^2 + c_L x^2 U }_{ Lx^2 } + \underbrace{ \textbf{a}_h \textbf{G}_l x^{-2} + \textbf{b}_l \textbf{H}_h x^{-2} + c_R x^{-2}U }_{ Rx^{-2} } + \underbrace{ cU }_{ {\langle \mathbf{a}, \mathbf{b} \rangle} } \] \[ P' = \underbrace{ \textbf{a} \textbf{G} + \textbf{b} \textbf{H} + {\langle \mathbf{a}, \mathbf{b} \rangle}U }_{ P } + Lx^2 + Rx^{-2} \] \[ P' = P + Lx^2 + Rx^{-2} \]

Let's take a step back and look what we have accomplished.

  1. We started storing P in the blockchain (which is the commitment to $\textbf{a}$,$\textbf{b}$ and our inner product).
  2. At each round we stored the points $L$ and $R$ (typically for a vector containing 128 scalars, we will make 7 rounds and store 7 values for $L$ and $R$ each) and we cut by half the dimension of $\textbf{a}$,$\textbf{b}$,$\textbf{G}$,$\textbf{H}$ (without storing them in the blockchain).
  3. By the end of each round, we can compute $P'$ which depends on the previous $P$, $L$ and $R$ and known values of $x$.
  4. In the last round, $P'$ is trivially calculated as the last dot product $c' = {\langle \mathbf{a}', \mathbf{b}' \rangle}$ will be a number and we will therefore know its value.
  5. We store the last $a$ and $b$ scalars so the verifier can reconstruct the tests (as he knows how to construct $\textbf{G}$ and $\textbf{H}$) without knowing the whole scalar vectors $\textbf{a}$ and $\textbf{b}$.
  6. By knowing the last $P'$, we can reconstruct all previous $P$ and verify that our initial equation holds.

Finally, we can rewrite the above procedure into one equation in order to test if P is true as informed.

\[ L_{1} x_{1}^{2} + \cdots + L_{k} x_{k}^{2} + P' + R_{k} x_{k}^{-2} + \cdots + R_{1} x_{1}^{-2} \overset ? = aG + bH + abU \]

Where the scalars $a$ and $b$ and the point vectors $L$ and $R$ are stored in the blockchain. The values of $x_1 \cdots x_k$ can be retrieved as a function of $L$ and $R$ for each step as part of the non-interactive Fiat-Shamir protocol.

Finally, we store in the blockchain for the bulletproof range-proofs:

  • $V$: a vector of points with commitments to $v[i]$ and hiding values $\gamma[i]$.
  • $A$: a point which represents the commitment to $\textbf{a}_L$ and $\textbf{a}_R$ and hiding value $\alpha$.
  • $S$: a point which represents the commitment to $\textbf{s}_L$ and $\textbf{s}_R$ and hiding value $\rho$.
  • $T1$: a point which represents the commitment to $t1$ and hiding value $\tau_1$.
  • $T2$: a point which represents the commitment to $t2$ and hiding value $\tau_2$.
  • $taux$: a scalar, which represents the hiding value related to $T1$, $T2$, $V$ and $t$.
  • $mu$: a scalar, which represents the hiding value related to $A$ and $S$.
  • $\textbf{L}$: a vector of points of size $log_2(64 * \text{outputs})$ containing the proving steps for the inner product.
  • $\textbf{R}$: a vector of points of size $log_2(64 * \text{outputs})$ containing the proving steps for the inner product.
  • $a$: a scalar which represents the value of $\text{a}$ in the last step of the inner product.
  • $b$: a scalar which represents the value of $\text{b}$ in the last step of the inner product.
  • $t$: a scalar, which represents the inner product value to be tested.