Data Structure & Algorithms in Java – Graphs – Number of Connected Components in an Undirected Graph

Problem:

Given a list of edges of a undirected graph and the number of nodes in it , find out the number of connected components in the graph.

For example:

Given the number of nodes 5

And the edges:

[[0,1],[1,2],[3,4]]

The number of components is 2.

You can understand this if you visualize the final graph:

source: leetcode.com

As you see if you connect the nodes with the edges, you get two distinct components.

Try out the solution here:

https://leetcode.com/problems/number-of-connected-components-in-an-undirected-graph/

Solution:

Hint:

  • Use DFS
  • Use Disjoint Set Union algorithm (DSU)

Approach 1: DFS

You can use Depth First Search traversal to solve this problem.

The idea goes like this.

Start from a node and do DFS.

When there are no more edges the DFS stops and you can increment the count of components to 1 from 0.

Then pick another node which is not yet visited in previous DFS and do DFS again.

Increment the count.

Keep doing this until there are no more visited nodes.

This is the broad idea.

Data Structures needed:

To do DFS you need an adjacency list representing each node and its successors/connecting nodes.

Also we need a bitmap array of visited nodes so that we don’t pick the same node again.

Here are the data structures in code:

       boolean[] visited = new boolean[n];
        
        
        
        
        List<Integer>[] adjList = new ArrayList[n];
        
        for(int i = 0;i<n;i++){
            
            
            adjList[i]=new ArrayList<Integer>();
        }
        
        
        
        for(int[] edge:edges){
            
            
            adjList[edge[0]].add(edge[1]);
            adjList[edge[1]].add(edge[0]);
        }
        

Once we prepare these data structures , we can iterate through all unvisited nodes and do DFS incrementing the count each time:

Below is the core logic:

   
        int components = 0;
        
        for(int i=0;i<n;i++){
            
            
            if(!visited[i]){
                components++;
            
            dfs(visited,adjList,i);
            }
        }
        
        
        return components;

Here is the DFS algorithm in code:

   public void dfs(boolean[] visited,List<Integer>[] adjList,int node){
       
       
       visited[node] = true;
       
       
       List<Integer> adjacentNodes = adjList[node];
       
       
       for(int n:adjacentNodes){
           
           
           if(!visited[n]){
               
               
               dfs(visited,adjList,n);
           }
       }
       
       
   } 

Here is the entire code:

class Solution {
    
    
    public int countComponents(int n, int[][] edges) {
    
        boolean[] visited = new boolean[n];
        
        
        
        
        List<Integer>[] adjList = new ArrayList[n];
        
        for(int i = 0;i<n;i++){
            
            
            adjList[i]=new ArrayList<Integer>();
        }
        
        
        
        for(int[] edge:edges){
            
            
            adjList[edge[0]].add(edge[1]);
            adjList[edge[1]].add(edge[0]);
        }
        
        
        
        int components = 0;
        
        for(int i=0;i<n;i++){
            
            
            if(!visited[i]){
                components++;
            
            dfs(visited,adjList,i);
            }
        }
        
        
        return components;
    }
    
   public void dfs(boolean[] visited,List<Integer>[] adjList,int node){
       
       
       visited[node] = true;
       
       
       List<Integer> adjacentNodes = adjList[node];
       
       
       for(int n:adjacentNodes){
           
           
           if(!visited[n]){
               
               
               dfs(visited,adjList,n);
           }
       }
       
       
   } 
    
}

Algorithm :

  1. Prepare the data structures – adjacency list and bitmap of visited nodes
  2. For each unvisited node do DFS and increment a counter
  3. Return the counter

Time complexity:

For preparing the adjacency list we iterate through each edge.

So the time complexity is O(E) where E is the number of edges

We do DFS for each node in the graph

So time complexity is O(N) where N is the number of nodes.

Total time complexity is O( E + N )

Space complexity:

We use an adjacency list of size N, a bit map of size N and the DFS makes recursive calls N times in the worst case.

So space complexity is O(N) where N is the number of nodes.

Approach 2 – Disjoint Set Union

In this approach you follow the disjoint set union algorithm.

Initially you consider each node as a separate set containing one element.

They are all disconnected and so there are n components where n is the number of nodes.

Then as you traverse through the edges you keep connecting the nodes and find a root node for the connected nodes.

While doing so if the connected nodes have different roots you decrement the component size (as you have combined two components into one component now)

At the end you return the component size.

To find the root node:

You maintain a data structure to represent the root nodes : an array which is initialized to the root node values itself.

That is root[0] = 0 , root[1]= 1 and so on,

since initially all the nodes are disconnected.

Then as you connect two nodes, you pick the second node as the root node and assign it as the root of the first node:

root[0] = 1,

if the edge is [0,1], that is 0 and 1 nodes are connected.

To find the root of a node , you keep traversing the root of roots until you reach the top root element.

For example,

If the root of 0 is 1 and the root of 1 is 2 and the root of 2 is 3,

then the root of 0 is 3.

root[0] = 1 ,

root[1] = 2

root[2] = 3

root[0] = root[1] = root[2] = 3

Here is the core logic in code:

   int count = n;

  for(int[] edge:edges){
            
        
            int root1 = findRoot(edge[0],roots);
            
            int root2= findRoot(edge[1],roots);
            
            
            if(root1 != root2){
                
                
                roots[root1] = root2;
                
                count--;
            }
            
            
            
        }
        

You iterate through the edges , find the root nodes of each node in the edge.

If they are not equal then it means a new component has been discovered, so decrement the total components size by 1 (it is initialized to the total number of nodes initially).

Keep doing this for all the edges.

Here is the logic to find the root node:

    public int findRoot(int node,int[] roots){
        
        
        
        while(node != roots[node]){
           
            
            node = roots[node];
        }
        
        return node;
        
    }

As earlier explained , you keep traversing the root of roots until you reach the topmost root.

You can speed up this process by adding the below line in the while loop:

            roots[node] = roots[roots[node]];

Here is the updated code to find root:

    public int findRoot(int node,int[] roots){
        
        
        
        while(node != roots[node]){
            
            roots[node] = roots[roots[node]];
            
            node = roots[node];
        }
        
        return node;
        
    }

Here is the entire code:

class Solution {
    
    
    public int countComponents(int n, int[][] edges) {
    
      
        int[] roots = new int[n];
        
        for(int i=0;i<n;i++){
            
            
            roots[i] = i;
        }
        
       int count = n;
        
        
        for(int[] edge:edges){
            
        
            int root1 = findRoot(edge[0],roots);
            
            int root2= findRoot(edge[1],roots);
            
            
            if(root1 != root2){
                
                
                roots[root1] = root2;
                
                count--;
            }
            
            
            
        }
        
        
        return count;
        
    }
    
    
    public int findRoot(int node,int[] roots){
        
        
        
        while(node != roots[node]){
            
            roots[node] = roots[roots[node]];
            
            node = roots[node];
        }
        
        return node;
        
    }
    
}

Algorithm:

  1. Initialize an array of root nodes to the node value itself
  2. Initialize total number of components to the total number of nodes (vertices).
  3. For each edge , check the roots of each of the two nodes, if they are not equal , we can combine them to form a new component (two components become one component) so decrement the total number of components by 1
  4. Also assign the second node in the edge as the root node of the first node and they both represent a single component now
  5. Return the count after going through all the edges

Time complexity:

We iterate through the nodes to initialize the root node array . So time complexity here is O(N) where N is the number of nodes.

For every edge we find their roots.

Traversing through the edges takes O(E) time complexity, where E is the number of edges .

Finding the root node takes α(n) time where n is the number of nodes.

α(n) is called inverse ackermann function common to DSU algorithms.

Basically it grows very slowly with the input. That is as input size increases it doesn’t grow in proportion to the input at the same size. It grows very slowly.

So total time complexity is O(n + e * α(n) )

Space complexity:

To store the root nodes you need O(n) space

So space complexity is O(n).

That’s it!


Posted

in

by

Tags:

Comments

Leave a Reply

Discover more from The Full Stack Developer

Subscribe now to keep reading and get access to the full archive.

Continue reading