Note: This is a draft article.

Idea of this series: We try to obtain a solid grasp on current zk-snark protocols by implementing groth16 from the ground up. The field has evolved a lot in the past years and for a part-time project it’s not feasible to come up with a production-grade implementation. The drawback is that we won’t learn much about possible side-channel attacks, but instead we focus more on the mathematical workings of the protocols.

1. Mathematical Review

1.1 Modular Arithmetic

Given an integer n>1n > 1, we say two a,bZa,b \in \mathbb{Z} are congruent modulo nn, if nn is a devisor of their difference, that is if there exists an integer kk, such that kn=abkn = a - b. This defines an equivalence relation nZ×Z\equiv_n \subseteq \mathbb{Z}\times\mathbb{Z}. A little more explicit, this means for any a,b,cZa,b,c \in \mathbb{Z}:

  • n\equiv_n is reflexive: anaa \equiv_n a,
  • n\equiv_n is symmetric: anba \equiv_n b if and only if bnab \equiv_n a,
  • n\equiv_n is transitive: If anba \equiv_n b and bncb \equiv_n c then anca \equiv_n c.

More commonly, congruence modulo nn is denoted as ab(modn)a \equiv b \pmod{n}. A few examples: Since 142=1214 - 2 = 12 which is a multiple of 44 we have 142(mod4)14 \equiv 2 \pmod{4} and since 821=638^2 - 1 = 63 is a multiple of 33 we have 821(mod3)8^2 \equiv 1 \pmod{3}. The definition applies to negative integers as well: Since 2(3)=152-(-3) = 1\cdot 5 we have 23(mod5)2 \equiv -3 \pmod{5} and from 87=15=35-8-7 = -15 = -3 \cdot 5 it follows that 872(mod5)-8 \equiv 7 \equiv 2 \pmod{5}.

The equivalence classes of the congruence relation n\equiv_n are called congruency classes and are comprised of integers congruent modulo nn. Sticking with the previous examples, here is the congruence class 2\overline{2} of 2 modulo 5: 25={,13,8,3,2,7,12,17,}.\overline{2}_5 = \lbrace …, -13, -8, -3, 2, 7, 12, 17, …\rbrace. More generally, for aZa \in \mathbb{Z} its congruence class modulo nn is the set an={a+knkZ}.\overline{a}_n = \lbrace a + k\cdot n\,|\,k \in \mathbb{Z} \rbrace.

1.2 Ring of Integers modulo n

For a fixed n>0n > 0 we collect the congruence classes modulo nn in a set Z/nZ:={anaZ}={0n,1n,2n,,n1n}\mathbb{Z}/n \mathbb{Z} := \lbrace \overline{a}_n\,|\,a \in \mathbb{Z} \rbrace = \lbrace \overline{0}_n, \overline{1}_n, \overline{2}_n, …, \overline{n-1}_n \rbrace and equip it with an addition operation an+bn:=a+bn\overline{a}_n + \overline{b}_n := \overline{a + b}_n and a multiplication operation anbn:=(ab)n\overline{a}_n \cdot \overline{b}_n := \overline{(a \cdot b)}_n. Additionally, multiplication is distributive with respect to addition. Together with these two operations the set Z/nZ\mathbb{Z}/n\mathbb{Z} is called the ring of integers modulo nn. Here (Z/nZ,+)(\mathbb{Z}/n\mathbb{Z}, +) is an abelian group and we can therefore define a subtraction operation as anbn:=(a+(b))n\overline{a}_n - \overline{b}_n := \overline{(a + (-b))}_n. However, in general (Z/nZ,)(\mathbb{Z}/n\mathbb{Z}, \cdot) doesn’t have a group structure. Multiplication is associative, commutative and we have a neutral element 1n\overline{1}_n, but not every element has a unique inverse. This is where the trouble starts.

It is customary to suppress the distinction of the congruence class and its elements in practice and we will therefore also relax our notation somewhat in the following. That is instead of an\overline{a}_n we are just going to write aa. An element aZ/nZa \in \mathbb{Z}/n\mathbb{Z} has a multiplicative inverse if and only if gcd(a,n)=1\gcd(a,n) = 1 (they are coprime). Elements that have a multiplicative inverse are called units, and they are denoted a1a^{-1}. The set of units is typically denoted as (Z/nZ)×:={aZ/nZa has an inverse mod n}.(\mathbb{Z}/n\mathbb{Z})^{\times} := \lbrace a \in \mathbb{Z}/n\mathbb{Z}\,|\, a \textrm{ has an inverse mod n} \rbrace. Since this set is closed under multiplication, that is the product of two units is again a unit (exercise), it is a group. On the group of units we can finally define a division operation as ab1a \cdot b^{-1}.

1.3 Finite Fields

From the condition for the existence of a multiplicative inverse follows that as soon as n=pn = p, for some prime pp, every element (except for 00) has such an inverse. That is (Z/pZ)×={1,2,3,,p1}. (\mathbb{Z}/p\mathbb{Z})^{\times} = \lbrace 1, 2, 3, …, p-1 \rbrace. Hence, for every prime pp the set Z/pZ\mathbb{Z}/p\mathbb{Z} together with addition and multiplication fulfils the definition of a (finite) field: A set FF equipped with an addition operation ++ and a multiplication operation \cdot is called a field, whenever (F,+)(F, +) is an abelian group (with neutral element 00), (F{0},)(F\setminus{\lbrace 0\rbrace}, \cdot) is an abelian group (with neutral element 11) and a(b+c)=ab+ac(a+b)c=ac+aca\cdot(b + c) = a\cdot b + a\cdot c \\ (a + b)\cdot c = a\cdot c + a\cdot c for any a,b,cFa,b,c \in F. So Z/pZ\mathbb{Z}/p\mathbb{Z} is a finite field with pp elements. We typically denote this field by Fp\mathbb{F}_p or GF(p)GF(p) for Galois field (after Évariste Galois, who first introduced finite fields and died in a duel during the French revolution of 1830).

In addition to the case described above, for each prime pp and every positive integer k>1k > 1 there exists a prime field of order pkp^k. Constructing a finite field of order pkp^k boils down to finding an irreducible polynomial PP of degree kk over Fp[x]\mathbb{F}_p[x] and forming the quotient Fp[x]/P\mathbb{F}_p[x]/P. We are going to describe this procedure more in depth in the next article. Here we are only concerned with the more boring fields obtained from integers.

2. Python Implementation

2.1 Field methods: Add, Sub, Mult, Div

In this article we are going to focus on implementing a convenience class for finite field arithmetic. While our implementation is going to be independent of the exact prime order of the field, we will be working over the prime field GF(q)GF(q) with qq set to

q = 21888242871839275222246405745257275088696311157297823662689037894645226208583.

We have chosen this modulus because the resulting finite field is the basis for the elliptic curves we will implement later. We start by implementing our class by defining:

class GFq():
    def __init__(self, n):
        self.n = n % q

The idea here is that by wrapping a number in the GFq class automatically reduces it modulo qq using Python’s built-in modulo function %. Next, we are going to implement the field operations, i.e. addition, subtraction, multiplication, division as well es exponentiation. Let’s start with addition. One approach would be to define addition as follows.

def add(self,other):
    return GFq(self.n + other.n)

This takes two instances of our class and adds their n variables, again wrapping the number in GFq. This approach meets our requirements, but it does not allow us to write field addition using infix notation. Infix notation is the common way of writing an operator between two operands, e.g. 2 + 5. Instead we would have to write GFq(2).add(GFq(5)). Fortunately, Python allows us to overload its built-in operators. This can be achieved by overwriting them in a class. The names of these methods start and end with double underscores. Therefore, we need to alter our method in the following way:

def __add__(self,other):
    return GFq(self.n + other.n)

Now we can write GFq(2) + GFq(5) and obtain a correct result. The subtraction and multiplication methods follow the same layout.

def __mul__(self, other):
    return GFq(self.n * other.n)
def __sub__(self, other):
    return GFq(self.n - other.n)

Now, since division in prime fields is defined as ab1a\cdot b^{-1} we need a utility function to find the inverse of any element. The right tool for this task is the extended Euclidean algorithm. For now, we can just copy/paste the pseudo-code from Wikipedia and define a new function as follows.

def calc_inverse(a, q):
    """Use the extended Euclidean algorithm to compute the multiplicative inverse"""
    (t, newt) = (0, 1)
    (r, newr) = (q, a)

    while newr != 0:
        quotient = r // newr
        (t, newt) = (newt, t - quotient * newt)
        (r, newr) = (newr, r - quotient * newr)
    if r > 1:
        error("a is not invertible!")
    if t < 0:
        t = t + q
    return t

Note, that this function is not a part of our class. With this we can implement the division operation:

def __truediv__(self, other):
    return GFq(self.n * calc_inverse(other.n, q))
def __div__(self, other):
    return __truediv__(self, other)

We overwrite both truediv and div with the same implementation to be able to use the operators / (truediv) and // (div = floor division) interchangeably. The last operation we need to implement is exponentiation.

2.2 Exponentiation by Squaring

The naive approach would be to implement exponentiation simply by successively multiplying the base by itself. However, this yields an inefficient O(n)\mathcal{O}(n) algorithm and with a little work we can do better. As an example, let us compute 4118(mod1000)4^{118} \pmod{1000}. First, write down the binary expansion of the exponent: 118=21+22+24+25+26118 = 2^1 + 2^2 + 2^4 + 2^5 + 2^6 Now, we can substitute this back to obtain: 4118=421+22+24+25+26=421422424425426 4^{118} = 4^{2^1 + 2^2 + 2^4 + 2^5 + 2^6} = 4^{2^1} \cdot 4^{2^2} \cdot 4^{2^4} \cdot 4^{2^5} \cdot 4^{2^6} The resulting powers 42i4^{2^i} are easily computed as each can be obtained by squaring the preceding power. That is 422=(421)2=42214^{2^2} = (4^{2^1})^2 = 4^{2\cdot2^1}, 423=(422)24^{2^3} = (4^{2^2})^2, etc. We can collect the powers in a table like the following.

ii0123456
42i(mod1000)4^{2^i} \pmod{1000}416256536296616456

Using the table we can easily compute the desired quantity: 411842142242442542616256296616456736(mod1000).4^{118} \equiv 4^{2^1} \cdot 4^{2^2} \cdot 4^{2^4} \cdot 4^{2^5} \cdot 4^{2^6} \equiv 16 \cdot 256 \cdot 296 \cdot 616 \cdot 456 \equiv 736 \pmod{1000}. To create the table we needed 6 multiplications and to get the final result we needed another 4 multiplications, i.e. 10 multiplications in total. If we had followed the naive approach we would have had 117 multiplications. More generally, if we want compute gn=gn020+n121+n222++nr2rg^n = g^{n_0\cdot 2^0 + n_1\cdot 2^1 + n_2 \cdot 2^2 + … + n_r \cdot 2^r} we need rr multiplications to calculate all powers up to g2rg^{2^r}. In order to calculate the final result, we need at most another rr multiplications. Since n2rn \geq 2^r we have log2(n)r\log_2(n) \geq r and so we obtain a time complexity of O(log2(n))\mathcal{O}(\log_2(n)), i.e. the number of bits in the binary representation of nn. This algorithm is known as The Fast Powering Algorithm or Exponentiation by Squaring.

Instead of calculating a table, we implement the algorithm recursively by exploiting the following relationship. xn={x(x2)n12if n is odd(x2)n2if n is evenx^n = \begin{cases} x(x^{2})^{\frac {n-1}{2}} &\text{if nn is odd}\\ (x^{2})^{\frac {n}{2}} &\text{if nn is even} \end{cases}

Using this formula we can recursively remove the least significant digit of the binary representation of nn. The full implementation looks like this:

def __pow__(self, other):
    """Implements exponentiation by squaring"""
    if other == 0:
        if self != GFq(0):
            return GFq(1)
        else:
            error("undefined: 0^0")
    elif other == 1:
        return GFq(self.n)
    elif other % 2 == 0:
        return (self * self) ** (other //2)
    elif other % 2 == 1:
        return self * (self * self) ** ((other - 1) // 2)

2.3 Overloading Equality and Printing

Finally, we want to be able to determine if two elements of our class represent the same element of GF(q)GF(q). For this we follow the same approach as with the other class methods. We overload == by rewriting the __eq__ method.

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, GFq):
        return self.n == other.n
    return NotImplemented

The function isinstance(obj, class) returns true if and only if the supplied object obj is an instance of class. Internally, we reduce the problem to checking equality of two integers. Starting with Python3 this implementation already allows us to use != on objects of GFq as well.

Lastly, to aid our intuition we overwrite __str__ to be able to use print():

def __str__(self):
    return str(self.n)