Currently at Xord we are researching ZK-SNARKs in depth. Although Maksym Petkus does an excellent job of explaining why and how ZK-SNARKs works. I am still left with many questions at the time of reading the paper. In doing so, my purpose of writing is to make this research more accessible and take a deep dive into the low level concepts. This article will discuss each step of ZK-SNARKs at low level from polynomials to general purpose zero knowledge proofs.
Legacy blockchains such as Bitcoin and Ethereum are facing the huge problem of long queues of verification blocks due to their over-growing state size. With zero-knowledge we don’t need to replay each block for verification. We have tons of resources available in bits and pieces but the concrete picture is still indispensable.
Zero-knowledge succinct non-interactive arguments of knowledge or simply ZK-SNARK is a shrewd method of proving something is true or false without revealing any sacred information.
It was proposed in 1980’s by three MIT researchers Shafi Goldwasser, Silvio Micali and Charles Rackoff. Back then, they were working on an Interactive proof system where the prover interacts with the verifier to convince him about something without declaring any specific knowledge.
Here is the hierarchical research of zero-knowledge proofs since its inception in 1985.
A zk protocol must satisfy these three properties:
Let us assume that a verifier wants to verify certain transactions in a block. For simplicity, take 20 transactions in a block. The verifier can only check one transaction at a time. In order to verify the whole block the verifier can proceed with arbitrary positions of a block with 1/20 confidence. A verifier must proceed until he reaches certain confidence. The downside of such a proving method is that one must go through a number of checkpoints proportional to the number of transactions which is impractical if we include thousands of transactions in a block. Instead we use the moon math of zero-knowledge to get the assurity of the prover’s claim.
The first step in the journey of ZK-SNARKs is to create a function to write proofs. This should work in any language but realistically this won’t provide much good to ZK-SNARKs construction. ZK-SNARKs work with specific construction which is also known as Quadratic Arithmetic Programs.
I have written a basic guide to QAPs as well. You can read it from here.
To begin with, suppose we have a polynomial x3-5x2+5x-3 which represents some graph points. The degree of the polynomial is the greatest exponent of x, which in this case is 3. Note here, that for blockchain this degree is directly proportional to the amount of data need to represent.
Polynomials provide an advantageous property, which is if we have two different polynomials with degree d. They can not have the same intersection points more than the d points. Means if the prover slightly tries to change the original data he can not convince the verifier. To elaborate, let's slightly alter the above equation.
We can see that even the smallest change in the coefficients can produce a dramatically different result.
To mathematically evaluate the shared points we can equate both the equations.
x3-5x2+5x-2 = x3-5x2+8x-3
x=1/3 is the only point where the two polynomials intersect. Thus we can assure that the evaluation of any polynomial at any arbitrary position is compelled to represent a unique value. No two different polynomials could commit to the same data points. This property provides another great advantage which is:
If a prover is claiming to know some polynomial at some large degree d, the verifier can easily verify by evaluating that polynomial in different checkpoints. The verifier will simply:
This is the main reason why polynomials have high significance in ZK-snarks. We will see the working of above points in a short while but first we need to understand the basic concepts of polynomial and its representation.
A polynomial can be expressed as:
c0x0+c1x1+. . . .+cnxn
So if a person claimed that he or she knows the polynomial, what exactly they mean is that they have the knowledge of c0,c1,. . .cn. So basically these coefficients can store the account address and the corresponding values in two different polynomials which will then be represent as a tuple with their corresponding indexes. We can also represent this polynomial as a product of its factors.
(x-α0)(x-α1). . . (x-αn)=0
Take a polynomial x3-4x2+3x= (x-0)(x-1)(x-3)----------------------------------- Equation1
Now if a prover claims that he knows a polynomial with degree 3 and have roots 1 and 3 without disclosing the polynomial itself, he needs to prove that indeed his polynomial p(X)have the roots t(x)=(x-1)(x-3) also known as target polynomial and some arbitrary polynomial h(x). The target polynomial for now made public by the prover and can be use by both prover and verifier.
Meaning the prover is not revealing all the roots however, whatever the target roots he is revealing is going to form a polynomial after the multiplication of remaining roots.
such as in the above equation1 he is making the public roots and target polynomial-> (x-1)(x-3)
The h(x) here is the hidden root (x-0).
In simple words there must exist a polynomial h(x) which makes p(x)=t(x).h(x). so the most easiest way of finding h(x) is to take the ratio of p(X) and t(X).
If the prover is unable to find such h(x), this indicates that the cofactors t(x) are not the correct.
In our example if we divide the p(x) x3-4x2+3x to t(X)=(x-1)(x-3)=x2-4x+3
We have got the result h(X)=x with remainder 0 which is the other root of the equation.
Now, using this construction we can check the validity of a polynomial without learning the polynomial itself which kinda lies in the zone of zero-knowledge and succinctness. Nonetheless there are multiple issues with this scheme.
What if the prover does not know the polynomial p(x) or don't acquire the full blockchain data? so how can he convince the verifier about some false polynomial?
He can do this, because the prover knows x=r in advance he can calculate the t=t(r) by selecting some random h(r) which proves to be p(r)=t(r).h(r), which will be accepted by the verifier as the equation holds.
This means with public t(x) he can compute the t(r) and guess the polynomial h(r)such that it fullfills this condition.
The issue lies with r and t(r) which the prover knows in advance. We can solve this issue If the verifier, instead of providing the original r, gives something which can not be tempered and served as a blackbox such as the hash of r. However, we must not forget that we need to perform arithmetic calculations over r which in normal encryption is not possible with hashes.
So to combat this, researchers have introduced a new strong encryption method known as “Strong Homomorphic Encryption”, which allows arithmetic operations on hashes. We will discuss about these arithmetic operations in details.
Backed with such technology, we can evaluate a polynomial with an encrypted value of x or random r provides by the verifier. We now take a step by step approach of designing the SNARK protocol. From completeness to zero-knowledge to non-interactivity.
Before proceeding we must know that the homomorphic encryption doesn’t allow exponentiation of an encrypted value. hence we cant perform this action: E(x), E(x)2and E(x)3 instead the prover must be given the values of power of x from 1 to 3: E(x), E(x2 )and E(x3)so the evaluation for the equation x3-3x2+2x will be E(x3)1.E(x2)-3.E(x)2
So the protocol for polynomial with degree 3 will look like this:
E(ri)=gri such as ( gr0, gr1, . . grd) and provide these encrypted powers to the prover.
Now with taking the equation mentioned above we redesign our protocol.
Although this protocol has restricted the prover’s celerity, there are still few conditions exists where the prover can forge the proof without actually using the given encrypted power of random r.
The knowledge of a polynomial hides in its coefficients c0, c1, . . .cn , or in (exponential form) in its exponents P(xi)ci=gci. xi. A prover needs to compute gp, gh. However, he is not forced to do that. He, the prover can compute any arbitrary values such as zp, zh to satisfy the equation zp= (zh)t(x). For example for any random value r, the prover can compute zh=gr and zp=(gt(x))r where, gt(s) can be computed from the public encryption power of x. We get these public powers by initial setup ceremony or the preprocessing of SNARK.
To understand this let's consider our previous example of third degree polynomial f(x)=x3-3x2+2x . What we desire here is that only encryption of x is homomorphically multiplied by some arbitrary coefficient c and nothing else. Let's see how we could do this? We could do this by using alpha shift or knowledge of exponent assumption.
Suppose Eve has a value “α”, that she wants Jacob’s to exponentiate to any power. The only requirement here is that only this specific α should get exponentiated, nothing else. This is how we design a protocol to do that.
Now, Jacob can not retrieve the alpha shift from the provided tuple. The only way to produce a right response is to exponentiate the given tuple.
Having the response (x, x') and r Eve checks the equality.
Hence, this protocol provides a proof to Eve that Jacob has exponentiated α by some value only known to him, and indeed he has not performed any other operation such as addition or multiplication since doing this will disturb the alpha shift. This provides full confidence to the system.
To continue with our example of x3-3x2+2x . We can again re-design the protocol with a similar approach.
We now have a robust protocol which has restricted the prover to only use the provided encryption as without this the α-shift will not be preserved. We now have reached the completeness and few basic scenarios of soundness of the protocol but the drawback in zero-knowledgness is still there.
Maksym in his paper has provided a very strong argument which states that “while theoretically polynomial coefficients ci can have a vast range of values, in reality, it might be quite limited , which means that the verifier could brute-force limited range of coefficients combinations until the result is equal to the prover’s answer. For instance if we consider the range of 100 values for each coefficient, the degree 2 polynomial would total to 1 million of distinct combinations, which considering brute-force would require less than 1 million iterations. Moreover, the secure protocol should be secure even in cases where there is only one coefficient, and its value is 1”. Because the verifier can extract the polynomial knowledge using the data provided by the prover, we need to make certain implications in the protocol.
Suppose the proof provided is gp,gp',gh and they have to pass the following checks:
We can use the similar approach of shift property to alter the proof in a way that the validity still holds but no knowledge has been revealed. We can shift these values with some random .
After improvements the checks still hold.
g δ.p=g δ.t(s)h
g δ.αp=g δ.p'
In the above step we have made a protocol which can securely transmit the proof without revealing any knowledge.
The last step towards our writing is non-interactivity.
The question arises here is why we really need non-interactivity? There might be the following reasons for doing so.
So we need the secret parameters (t(s),α) to be reusable, public, trustworthy and infeasible to forge. We can do this by using the same encryption method the verifier has used for his secret s but to note here that we or the verifier needs to perform multiplication with these secret parameters and we can’t do that with homomorphic encryption. That's where pairings come in.
Parings leverage the elliptic curves to achieve the non-interactivity and soundness of the protocol. In Zero-knowledge we want to destroy most of the information or somehow design functions in way that they should produce an output which are completely irrelevant to the input. However, in order to prevent the prover from cheating and generating fake proofs the verifier must know what statement is being proven which is linked to the soundness of scheme. Pairing in itself is a deep topic which we are leaving for another post but below i have tried to give a gist of this blackbox.
Pairings or bilinear mappings is a cryptographic function which can be denoted as e(g*,g*) which have two encrypted inputs (g α,gb) from one set of numbers which map them to another set of numbers as their multiplication representation such as e(g α,g b)=e(g,g)αb.
Since the multiplication has a different set of numbers than the inputs therefore, we can not derive the input from it. We have a property in cryptography which swaps the base to the exponent while producing the same answer as before the swap.
Since the multiplication of ab will give a different g so the encrypted multiplication can not be guessed. We can even perform addition with pairings too.
Here are some of the properties of pairings.
How pairings helps us in constructing SNARK, we will see that in the next post.