Disjoint Set Union

Disjoint Set Union (DSU)orUnion-Findis a graph algorithm that is very useful in situations when you have to determine the connected components in a graph.Suppose that we haveNnodes numbered from 1 toNandMedges. The graph can be disconnected and may have multiple connected components. Our job is to find out how many connected components are there in the graph and the number of nodes in each of them.The basic idea behind DSU is the following:Initially, all nodes are isolated i.e. there are no edges in the graph. We add edges to the graph one by one.While adding an edge, check if the two nodes that the edge connects are already present in the same connected component.- if they are, then do nothing.- if they aren’t, then make the smaller of the two connected components a part of the larger one.(So, we are making union of disjoint connected components; therefore the name Disjoint Set Union)To keep track of what connected component a node belongs to, we use an array namedparent[ ].parent[i] tells the ID of the connected component that theith node belongs to. The ID of a connected component is one of the nodes in that connected component. This node is kind of theleaderorparentof all other nodes in that connected component.Initially, as all nodes are isolated, we haveNconnected components; each node being the leader of their connected component. Therefore,parent[i]=ifor all 1<=i<=N.To keep track of the size of a connected component, we use thesize[ ] array.size[i] = the number of nodes in the ith connected component.Initially,size[i] =1, for all 1<=i<=N. This is because, initially, all connected components contain only one node.When we encounter an edge that connects two nodesaandbthat belong to different connected components, we first check which of the two connected components is bigger: the one thatabelongs to or the one thatbbelongs to. The smaller connected component becomes part of the larger one. This is done to reduce the number of nodes whoseparenthas to be changed.Notice that we used the function callfindParent(i) to find the parent of theith node, instead of directly looking atparent[i]. The reason for this is:Theparentof a node is not changed as soon as its affiliation to a connected component is changed. We postpone this to when we actually need to find theparentof the node. Doing this avoids many useless operations. So,parent[i] may not contain the updated value of the connected component thatibelongs to. That’s why it’s important that we usefindParent(i) instead of being lazy and taking the value directly fromparent[i].In the end, we need to consider those nodesithat havefindParent(i)==i. This is because, these are the nodes that still belong to their initial connected component and were not assigned to a different one during execution. These represent thedisjointconnected components we are looking for.So the completecode for DSUis as follows:#include<stdio.h>
#define MOD 1000000007
int findParent(int i,int parent[])
//function to find the connected component that the ith node belongs to
{
if(parent[parent[i]]!=parent[i])
parent[i]=findParent(parent[i],parent);
return parent[i];
}
void unionNodes(int a,int b,int parent[],int size[])
//to merge the connected components of nodes a and b
{
int parent_a=findParent(a,parent),parent_b=findParent(b,parent);
if(parent_a==parent_b)
return;
if(size[parent_a]>=size[parent_b])//the larger connected component eats up the smaller one
{
size[parent_a]+=size[parent_b];
parent[parent_b]=parent_a;
}
else
{
size[parent_b]+=size[parent_a];
parent[parent_a]=parent_b;
}
return;
}

int main()
{

int N,M,i,a,b;
scanf(” %d %d”,&amp;N,&amp;M);
int parent[100001]={0},size[100001]={0};
for(i=1;i<=N;i++)
{
parent[i]=i;
size[i]=1;
}

for(i=1;i<=M;i++)
{
//scan each edge and merge the connected components of the two nodes
scanf(” %d %d”,&amp;a,&amp;b);
unionNodes(a,b,parent,size);
}

for(i=1;i<=N;i++)
printf(“Node %d belongs to connected component %dn”,i,findParent(i,parent));
long long ways=1;
int nos=0;
for(i=1;i<=N;i++)
{
if(findParent(i,parent)==i)//this condition is true only for disjoint connected components
{
printf(“%d leader %d sizen”,i,size[i]);
nos++;
}

}
printf(“Total connected components : %d”,nos);

return 0;
}Comparison between DFS and DSU:The task that DSU achieves inthiscode can be done using DFS as well. You should to code the same using DFS too.Though, keep in mind that DFS is not a replacement for DSU. DFS works well in cases when all edges are present in the graph from the beginning. But, in problems where edges are added during the execution and we need to run connectivity queries in between such additions, DSU is the better option. An example of this type of situation is the Kruskal’s algorithm to find the Minimum Spanning Tree (MST).In Kruskal’s algorithm, before adding an edge to the MST we need to check if the addition of the edge creates a cycle or not. We can use DSU here. If theparents of the two nodes that the edge connects are same, then we know that addition of the edge will create a cycle.Try implementing Kruskal’s algorithm for MST using DSU by yourself. It’s quite simple once you know DSU.

News Reporter

Leave a Reply

Your email address will not be published. Required fields are marked *