Sparse table

IntroductionImagine you are watching a movie and your friend assures you there is nothing interesting between 00:10 and 01:01 (hh:mm format). Also, he/she retells you the story from that time span. Now you don’t need to watch it – and you’ve just saved 51 minutes of your precious time! Moreover, you can continue to watch right from the interesting part.Now imagine you have many friends. Each of them has watched some portion of the film, and if combined, they’ve watched the entire movie. Then you don’t have to go to the movie theater, provided that your friends are eloquent enough. You just ask them about scenario, and they retell it to you, piece by piece.After this lyrical digression, let’s move to our competitive programming world. Sparse table is a data structure. It often serves as a substitute for segment tree in case of immutable data.Say you have an arrayArrand you want to perform some queries. Each query should compute functionFover subarray [L, R]:F(ArrL, ArrL + 1, …, ArrR). With sparse table, you can doeach query in O(log(N))(N is the size of Arr), withinitial O(N * log(N)) preprocessing.Sparse table can be applied if and only if:Arr isimmutable(i.e. queries do not change it);Function F isassociative: F(a, b, c) = F(F(a, b), c) = F(a, F(b, c)).Sparse table is easy to code and it is quite fast. Still the problems with completely immutable data are somewhat rare. I find sparse table one of my personal favourite approaches.The methodLet’s go back to the preface about friends and movies. It would be very convenient to have such friends. They would retell you needed information, and you just have to combine it to get the “big picture”.Imagine an array Table, such thatTable[i][j] = F(Arri, Arri+1, …, Arri + 2j- 1).You can view every cell of that array as a “friend”, who has computed function F over small portion of the array. Each portion has a size, which is apower of two. We will show later how it matters, and now let’s take a look at a more or less specific example.Suppose you are asked to compute F(Arri, Arri + 1, …, Arri + 12). Then you can do the following:Ask friend at Table[i][3] about F(Arri, Arri + 1, …, Arri + 7). Call this value x1;Ask friend at Table[i + 8][2] about F(Arri + 8, Arri + 9, …, Arri + 11). Call this value x2;Ask friend at Table[i + 12][0] about F(Arri + 12). Call this value x3;Compute F(x1, x2, x3) to get the answer.Notice that it took only 4 actions to compute F over subarray of size 13. However, in this case you knew which friends to ask in order to minimize overall number of questions.You may already know that every non-negative integer number hasbinary representation(in numeral system base 2). For example:1310= 110121510= 111126810= 1000100219810= 110001102This representation is unique for any given number. In other words, if you have numbers 20, 21, …, 2k(for somek≥ 1, each number is presented once), then you can representanyinteger N up to 20+ 21+ … + 2k= 2k + 1- 1 as the sum of some of those numbers.This can be done with the following pseudocode:for i = k..0:
if N >= 2^i: // 2^i means “two to the power of i”
N -= 2^i
print(‘add 2^i’)The strategy here is to greedily pick highest i such that 2i≤ N and subtract it from N. It works, because if we represent N as a k+1-bit binary number, then we add precisely those i, where i-th bit is set.For example, take N = 113 and k = 7. Then N represented as k+1=8-bit binary number is 01110001. We iterate from i = 7 to 0:i = 7, N = 113 = 01110001. Is 113 ≥ 27= 128? No, then do nothing;i = 6, N = 113 = 01110001. Is 113 ≥ 26= 64? Yes, then we set N = N – 64 = 113 – 64 = 49;i = 5, N = 49 = 00110001. Is 49 ≥ 25= 32? Yes, then we set N = N – 32 = 49 – 32 = 17;i = 4, N = 17 = 00010001. Is 17 ≥ 24= 16? Yes, then we set N = N – 16 = 17 – 16 = 1;i = 3, N = 1 = 00000001. Is 1 ≥ 23= 8? No, then do nothing;i = 2, N = 1 = 00000001. Is 1 ≥ 22= 4? No, then do nothing;i = 1, N = 1 = 00000001. Is 1 ≥ 21= 2? No, then do nothing;i = 0, N = 1 = 00000001. Is 1 ≥ 20= 1? Yes, then we set N = N – 1 = 1 – 1 = 0.The end. We conclude that 113 = 26+ 25+ 24+ 20.You can notice that we do not need to write down binary representation of N to make the algorithm work. It just serves as an illustrative proof.We can use the above reasoning tocompute F(ArrL, ArrL + 1, …, ArrR)for given L and R. We query some of our friends in the following manner:answer = ZERO
L’ = L
for i=k..0:
if L’ + 2^i – 1 <= R:
answer = F(answer, Table[L’][i]) // F is associative, so this operation is meaningful
L’ += 2^iZEROisneutral elementfor calculating function F, such that F(ZERO, x) = F(x, ZERO) = x for any x.In the above code we assume thatR – L + 1 < 2k + 1(we can choose such k, given bounds on N).The reasoning behind condition“L’ + 2i- 1 ≤ R”is such: we want to know, whether we should ask friend at Table[L][i] for help. Heshe accounts for subarray of 2ielements: ArrL, ArrL + 1, …, ArrL + 2i- 1. If that subarray lies in [L, R], then we reach out to mentioned friend.To ensure the algorithm is accurate, notice that on every iteration offorloop we haveanswer = F(ArrL, ArrL + 1, …, ArrL’ – 1)(with the exception that if L’ = L then answer is ZERO).Also note that at every step we add 2ionly in those cases, where R – L + 1 (size of subarray from L to R) hasi-th bit setin binary representation. That means that we’re adding a total of R – L + 1 to L’, which is initially equal to L.This implies that after loop ends, we have L’ = L + (R – L + 1) = R + 1. Thus, in the end
answer = F(ArrL, ArrL + 1, …, ArrR).This algorithm for answering queries with Sparse Table works in O(k), which isO(log(N)), because we choose minimal k such that 2k+1> N.Now we know how to answer queries if we have Table at hand. Buthow do we get it? This is the easiest way:for i=0..N-1: // assuming Arr is indexed from 0
Table[i][0] = F(Arr[i])
for j=1..k: // assuming N < 2^(k+1)
for i=0..N-2^j:
Table[i][j] = F(Table[i][j – 1], Table[i + 2^(j – 1)][j – 1])Here we are building Table[i][j] only for i, j such that i + 2j< N. We use the fact that 2j= 2j – 1+ 2j – 1: if we know Table[i][j] for fixed j and all meaningful i, then we can derive Table[r][j + 1] for every meaningful r. How do we do that?Take some r, such that r + 2j+1< N. Suppose we want to obtain Table[r][j + 1] = F(Arrr, Arrr + 1, …, Arrr + 2j + 1- 1).Split the subarray into two parts:P1= Arrr, Arrr + 1, …, Arrr + 2j- 1of size (r + 2j- 1) – r + 1 = 2jandP2= Arrr + 2j, Arrr + 2j+ 1, …, Arrr + 2j + 1- 1of size (r + 2j + 1- 1) – (r + 2j) + 1 = 2j + 1- 2j= 2j.Note thatF(Arrr, Arrr + 1, …, Arrr + 2j + 1- 1) = F(F(P1), F(P2))by associativity of F.Since sizes of P1and P2are both powers of two, then we can find F(P1) in Table[r][j] and F(P2) in
Table[r + 2j][j]. ThenTable[r][j + 1] = F(Table[r][j], Table[r + 2j][j]), which is expressed in the inner loop above.Outer loop runs in O(k), inner loop runs in O(N). Thus, in total we get O(N * k) =O(N * log(N))time complexity for Sparse Table creation.Example problemsIn this section we’ll describe some well-known problems and solutions to them using Sparse Table.Remember:to solve a problem with Sparse Table, you must ensure thatQueries do not change the data;Function F is associative.If this is the case, the implementation will usually look as follows:Read the data;Build sparse table;Read queries one-by-one and answer them immediately.Range sum queryProvided with an integer array Arr of length N, answer Q queries:given L and R, 0 ≤ L, R < N, findArrL+ ArrL + 1+ … + ArrR.N and Q
are up to 105, |Arri| ≤ 109.Here we haveF(x, y) = x + y, with the neutral element ZERO being actual zero (i.e.0).We set k to 16, because 217is larger than 105(highest possible N), while 216is not.The solution would look something like this (written in C++):const int k = 16;
const int N = 1e5;
const int ZERO = 0; // ZERO + x = x + ZERO = x (for any x)

long long table[N][k + 1]; // k + 1 because we need to access table[r][k]
int Arr[N];

int main()
{
int n, L, R, q;
cin >> n; // array size
for(int i = 0; i < n; i++)
cin >> Arr[i];

// build Sparse Table
for(int i = 0; i < n; i++)
table[i][0] = Arr[i];
for(int j = 1; j <= k; j++) {
for(int i = 0; i <= n – (1 << j); i++)
table[i][j] = table[i][j – 1] + table[i + (1 << (j – 1))][j – 1];
}

cin >> q; // number of queries
for(int i = 0; i < q; i++) {
cin >> L >> R; // boundaries of next query, 0-indexed
long long answer = ZERO;
for(int j = k; j >= 0; j–) {
if(L + (1 << j) – 1 <= R) {
answer = answer + table[L][j];
L += 1 << j; // instead of having L’, we increment L directly
}
}
cout << answer << endl;
}
return 0;
}Notice that in C++ expression (1 << j) means 2j.Range minimum querySame as previous problem, but the query asks formin(ArrL, ArrL + 1,
…, ArrR).Here we haveF(x, y) = min(x, y). The neutral elementZERO is 109+ 1, because the problem tells us that |Arri| ≤ 109, and therefore any possible element of the array is less than the neutral element. This implies that min(x, ZERO) = min(ZERO, x) = x for any x that we can see in Arr.Implementation (again in C++):const int k = 16;
const int N = 1e5;
const int ZERO = 1e9 + 1; // min(ZERO, x) = min(x, ZERO) = x (for any x)

int table[N][k + 1]; // k + 1 because we need to access table[r][k]
int Arr[N];

int main()
{
int n, L, R, q;
cin >> n; // array size
for(int i = 0; i < n; i++)
cin >> Arr[i]; // between -10^9 and 10^9

// build Sparse Table
for(int i = 0; i < n; i++)
table[i][0] = Arr[i];
for(int j = 1; j <= k; j++) {
for(int i = 0; i <= n – (1 << j); i++)
table[i][j] = min(table[i][j – 1], table[i + (1 << (j – 1))][j – 1]);
}

cin >> q; // number of queries
for(int i = 0; i < q; i++) {
cin >> L >> R; // boundaries of next query, 0-indexed
int answer = ZERO;
for(int j = k; j >= 0; j–) {
if(L + (1 << j) – 1 <= R) {
answer = min(answer, table[L][j]);
L += 1 << j; // instead of having L’, we increment L directly
}
}
cout << answer << endl;
}
return 0;
}Range greatest common divisor queryProvided with an integer array Arr of length N, answer Q queries:given L and R, 0 ≤ L, R < N, findgcd(ArrL, ArrL + 1, …, ArrR).N and Q
are up to 105, 1 ≤ Arri≤ 109.Heregcdstands for “greatest common divisor”, i.e. gcd(x1, x2, …, xp) = y if and only if:xiis divisible by y for every 1 ≤ i ≤ p;y is the greatest among numbers, which satisfy the above property;if xi= 0 for every 1 ≤ i ≤ p, then y is undefined.Here we haveF(x, y) = gcd(x, y). There is an algorithm for finding gcd of two numbers (called Euclid’s algorithm), so you are not obligated to know how it is computed. C++ has this algorithm implemented in function__gcd(x, y).What about gcd(x, y, z)? How do we obtain that?It turns out that gcd(x, y, z) = gcd(gcd(x, y), z) = gcd(x, gcd(y, z)). Intuitively, this happens because no matter how you order operands and take gcd-s, the greatest common divisor is still the same. Since now we know our function is associative, we can plug-in Sparse Table to solve the problem.One more thing to notice:ZERO = 0, because 0 is divisible by any other number, therefore gcd(0, x) = gcd(x, 0) = x for x > 0.Implementation (C++):const int k = 16;
const int N = 1e5;
const int ZERO = 0; // gcd(ZERO, x) = gcd(x, ZERO) = x (for any x > 0)

int table[N][k + 1]; // k + 1 because we need to access table[r][k]
int Arr[N];

int main()
{
int n, L, R, q;
cin >> n; // array size
for(int i = 0; i < n; i++)
cin >> Arr[i]; // between 1 and 10^9

// build Sparse Table
for(int i = 0; i < n; i++)
table[i][0] = Arr[i];
for(int j = 1; j <= k; j++) {
for(int i = 0; i <= n – (1 << j); i++)
table[i][j] = __gcd(table[i][j – 1], table[i + (1 << (j – 1))][j – 1]);
}

cin >> q; // number of queries
for(int i = 0; i < q; i++) {
cin >> L >> R; // boundaries of next query, 0-indexed
int answer = ZERO;
for(int j = k; j >= 0; j–) {
if(L + (1 << j) – 1 <= R) {
answer = __gcd(answer, table[L][j]);
L += 1 << j; // instead of having L’, we increment L directly
}
}
cout << answer << endl;
}
return 0;
}Number of contiguous subarrays with gcd equal to 1This problem itself is not about queries, but Sparse Table is still useful here. This is a more tricky application of the method, and might not be suitable for beginners.You have an array of integers Arr of length N. You must count number of pairs of integers (L, R) such that:0 ≤ L ≤ R < N;gcd(ArrL, ArrL + 1, …, ArrR) = 1.N is up to 106.Let’s start with some observations:If we brute-force every pair (L, R) satisfying restriction #1, it will take roughly N * (N – 1) / 2 operations, which is about 5 * 1011. It’s too much to process in reasonable online-judge time;Let g = gcd(x, y), f = gcd(x, y, z). Then f ≤ g, because f = gcd(g, z), and greatest common divisor of two integers can not be greater than any of them;If we fix L and compute values xL= gcd(ArrL), xL + 1= gcd(ArrL, ArrL + 1), …, xN – 1= gcd(ArrL, ArrL + 1, …, ArrN – 1), then we have xL≥ xL + 1≥ … ≥ xN – 1(it follows from the above point).Hence, for each fixed L we can apply binary search to findsmallest R ≥ Lsuch that gcd(ArrL, ArrL + 1, …, ArrR) = 1, because gcd is monotonous (subject to increasing the subarray). After we found such R, we know for sure that further extending subarray to the right provides gcd equal to 1. We count all such pairs (L, R’) with N > R’ ≥ R. There are N – R of them. This will takeO(N * log(N) * log(N))time: one log(N) for binary search, other log(N) for inner sparse table computations.Instead of binary searching we will maintain smallest such R similar to how we count gcd on a segment.Be sure to understand how querying Sparse Table work in general before you continue reading.Firstly, we set R = L;Then for i=k..0 we decide whether we need to move R further by 2iunits;Why would we need to move it at all? Since we are ultimately looking for R to satisfy gcd(ArrL, ArrL + 1, …, ArrR) = 1, we are not interested if that gcd is greater than 1 (notice that it can not be less that 1 in any situation);And if the latter is the case, i.e. we found out that gcd(ArrL, ArrL + 1, …, ArrR, ArrR + 1, …, ArrR + 2i- 1) > 1, then we need to increase R by 2i, because gcd(ArrL, ArrL + 1, …, ArrR) > 1 for sure (remember, we have monotonous function). After that addition we continue to loop further. In the end we will obtain the desired R;This will work in O(log(N)) for fixed L, getting us toO(N * log(N))solution in total.See the implementation for details:const int k = 16;
const int N = 1e5;
const int ZERO = 0; // gcd(ZERO, x) = gcd(x, ZERO) = x (for any x > 0)

int table[N][k + 1]; // k + 1 because we need to access table[r][k]
int Arr[N];

int main()
{
int n;
cin >> n; // array size
for(int i = 0; i < n; i++)
cin >> Arr[i]; // between 1 and 10^9

// build Sparse Table
for(int i = 0; i < n; i++)
table[i][0] = Arr[i];
for(int j = 1; j <= k; j++) {
for(int i = 0; i <= n – (1 << j); i++)
table[i][j] = __gcd(table[i][j – 1], table[i + (1 << (j – 1))][j – 1]);
}

// main part of the solution
long long answer = 0;
for(int i = 0; i < n; i++) {
int R = i; // we will move R forward as long as gcd(Arr_i, Arr_i+1, …, Arr_R) != 1
// or until R reaches n.

int g = ZERO;
for(int j = k; j >= 0; j–) {
if(R + (1 << j) – 1 >= n)
continue; // we do not want to exceed array size

if(__gcd(g, table[R][j]) > 1) {
// Even if we add 2^j more values, gcd is still > 1. Therefore,
// we move R forward and update gcd appropriately.
g = __gcd(g, table[R][j]);
R += 1 << j;
}
}

// In the end, either R = n or gcd(Arr_i, Arr_i+1, …, Arr_R) = 1.
answer += n – R;
}

cout << answer << endl;
return 0;
}AfterwordThe application of Sparse Table is not limited to arrays only. For example, you can use it on trees to findLowest Common Ancestorin O(log(N)) per query. However, this is slightly off-topic (because knowledge of graphs, trees and depth-first search is needed) and I decided not to include it in this article.

News Reporter

Leave a Reply

Your email address will not be published. Required fields are marked *