# Data Structures & Algorithms in Java – Graphs – Valid Tree

## Problem:

Given the number of nodes in a graph and the list of edges in the graph , find if it is a valid tree.

For example,

Given n = 5 and the edges [[0,1] ,[0,2],[0,3],[3,4]

It is a valid tree because there are no cycles in it and every node is connected with an edge.

For the below graph:

n = 5

`edges = [[0,1],[1,2],[2,3],[1,3],[1,4]]`

The tree is invalid because there is a cycle in the graph:

Try out the solution here:

https://leetcode.com/problems/graph-valid-tree/

## Solution:

Hint:

Use DSU (Disjoint Set Union) or DFS (Depth First Search) Algorithms

Approach 1 – Union Find / Disjoint Set Union

To detect if a graph is a cycle,

we need to make sure two conditions are met:

1. All the edges are connected
2. There is no cycle

Also if both these conditions match , then the number of edges equals the number of nodes!

So if we check any of the above conditions AND if the number of edges equals the number of nodes then we can detect if a tree is valid or not.

The other condition will automatically match.

That is ,

If the number of edges equals the number of nodes

AND

If all the edges are connected

THEN

A cycle cannot exist.

Also,

If the number of edges equals the number of nodes

AND

If there is no cycle

THEN

All the edges should be connected!

Let’s use the second set of conditions in this approach:

We will check if the number of edges equals the number of nodes

AND

If there is no cycle.

We will use Union Find / Disjoint Set Union algorithm for the same.

Checking if the number of edges equals the number of nodes is easy and you do that in the first statement of the code:

```        if(edges.length != n-1) return false;
```

One that conditions is checked, check if there is any cycle.

To do that,

We will first consider each node as separate islands.

We will find the root node of each node.

Initially this is set to itself.

If there is an edge between nodes then we will make one of them as root node of both.

To check if there is a cycle we see if the roots of two nodes in an edge are the same.

If so there is a cycle.

While comparing the root nodes we find out the topmost root node of each node.

We find this by finding the root of a node ,

then the root of the root node ,

then the root of the root of the root node and so on ,

until the node itself is not its root

Here is the code to find the root:

```  public int find(int node,int[] roots){

while(roots[node] != node){

roots[node] = roots[roots[node]];

node = roots[node];
}

return node;
}
```

The line:

```            roots[node] = roots[roots[node]];
```

just accelerates the root finding process.

If the roots of two edges match then it means there is a cycle between them.

So it cannot be a valid tree and you return false.

Here is the entire Code:

```class Solution {
public boolean validTree(int n, int[][] edges) {

if(edges.length != n-1) return false;

int[] roots = new int[n];

for(int i=0;i<n;i++) roots[i] = i;

for(int[] e:edges){

int root1 = find(e,roots);

int root2 = find(e,roots);

if(root1 == root2){

return false;
}

roots[root1] = root2;

}
return true;
}

public int find(int node,int[] roots){

while(roots[node] != node){

roots[node] = roots[roots[node]];

node = roots[node];
}

return node;
}
}
```

This solution was faster than other solutions on leetcode when tested:

Algorithm:

1. If number of edges does not equal to the number of nodes return false
2. Create an array of root nodes and initialize to the value of the node itself for each position.
3. For each node in an edge , find the root nodes,
4. If they match then there is a cycle , so return false
5. If they don’t match then make the last node of the edge as the root node of the first one. This way they get connected.
6. Return true .

Time complexity:

Initializing the root nodes take O(n) time where n is the number of nodes

Finding the top most root of each node takes α(n) time which is called inverse ackermann function. Basically it is a function which grows very slowly in proportion to the input size.

Since finding the root nodes doesn’t grow in complexity in the same proportion as the input we denote it by this function.

For every edge we find the root node , so time complexity is O(e * α(n)) where e is the number of edges.

Total time complexity is O( n + e* α(n))

Space complexity:

The root node array takes O(n) space.

There is no other extra space complexity.

So space complexity is O(n)

Approach 2 – DFS

In this approach ,

you start from the first node and do DFS on the graph.

If after doing DFS any of the nodes are left unvisited then it means all the nodes are not connected.

Also check if the number of nodes is equal to the number of edges.

If both conditions return false, it is not a valid tree.

So we check for these two conditions:

• Number of edges equals number of nodes
• All nodes are connected

These two conditions automatically imply that there is no cycle in the graph which is a property of a valid tree.

To do DFS , first you need a node and its successors.

So we will use the data structure – Adjacency list

Here is the code to prepare the adjacency list:

```

for(int i=0;i< n;i++){

}

for(int[] e:edges){

}

```

Here is the DFS code which marks all visited nodes as true:

``` public void dfs(List<Integer>[] a,boolean[] v,int n){

v[n] = true;

List<Integer> aNodes = a[n];

for(int node: aNodes){

if(!v[node]){

dfs(a,v,node);
}

}

}
```

Here is the entire code:

```class Solution {
public boolean validTree(int n, int[][] edges) {

if(edges.length != n-1) return false;

for(int i=0;i< n;i++){

}

for(int[] e:edges){

}

boolean[] visited = new boolean[n];

for(boolean b:visited){

if(!b) return false;
}

return true;

}

public void dfs(List<Integer>[] a,boolean[] v,int n){

v[n] = true;

List<Integer> aNodes = a[n];

for(int node: aNodes){

if(!v[node]){

dfs(a,v,node);
}

}

}

}
```

Algorithm:

1. If number of edges does not equal number of nodes return false
2. Prepare adjacency list using the given edges
3. Create a bitmap to keep track of visited nodes.
4. Do DFS starting from the first node and mark all visited nodes as true in the process
5. Check if any of the nodes is left unvisited , then return false
6. Return true.

Time complexity:

It takes O(n) time to prepare the adjacency list

O(n) time to initialize the bitmap to keep track of visited nodes

O(n) time for the DFS call since it visits every node exactly once.

In total O(3n) which can be considered as O(n) time

Space complexity:

To hold the adjacency list you need O(n) space

For the bitmap you need O(n) space

And for the the recursive call stack memory you need O(n) space

So total space complexity is O(n)

That’s it!

Posted

in

by

Tags: