Kruskal’s Algorithm

I planned to write a serious of posts on working and analysis of some of the basic algorithms. I also planned to give link to a JAVA implementation of the same. I will start off this series by first describing the Kruskal’s Algorithm for finding the cost of the Minimum Spanning Tree in a graph.

Minimum Spanning Tree: A minimum cost spanning tree is a subgraph induced on the original graph such that the induced sub-graph is a tree on all the vertices on the original graph and the sum of weights of the edges on this graph is the minimum among all the possible spanning trees of the graph.

The basic intuition behind the Kruskal’s algorithm is that the least weighted edge of the original graph will appear in the minimum spanning tree. Hence, sorting the edges based on their weight might be a first step in building the spanning tree. Does that mean, the first n-1 least weight edges are sufficient to build the spanning tree (If n is the number of vertices in the graph) ? We also need to ensure that the edges picked for the spanning tree do not form a cycle. Hence, that is the second key observation. The two observations is formalized below as algorithm:

```Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list:
If adding this edge to the MST does not create a cycle:
Add this edge to the MST
Else
Drop this edge and continue to the next edge in the list

```

The algorithm above can be easily implemented. The only difficult part of this is in line 4. To check if adding an edge will create a cycle. This can be done a subroutine call to a standard search algorithm in the graph, which can detect a cycle.

Let us now analyse the complexity of the above algorithm.

Line 2: O(|E| log( |E| )

Line 4: O( |E| + |V| )

Since Line 4 is repeated O(|E|) times, the overall complexity will be

O( |E|^2 + |E|*|V| + |E| log( |E| ) ) . And in the worst case,

|E| = O( |V|^2 ),therefore this implementation has a worst case complexity of O( |V|^4  ).

We can modify the above algorithm by using a disjoint-set data structure. The working of this data structure can be found here.

Here is the modified algorithm:

```Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list of edge E as (u,v):
Let pu be the parent of vertex u in the disjoint set
Let pv be the parent of vertex v in the disjoint set
if pu !=pv:
Add the edge to the MST
Make parent of u = parent of v in the disjoint set data structure

```

Let us now analyse the complexity of this new algorithm.
Line 2: Sorting takes O( |E| log( |E| ) ). This is same as before

The entire loop takes an amortised complexity of O( |E| log( |V| ) ). This is because Line 4 and Line 5, uses the disjoint set find operation.

And this has an amortised complexity of

O( log( |V| ) ). Hence, the overall complexity of the loop is

O( |E| log( |V| ) ). And as before, if we consider the worst case complexity of

|E| = O( |V|^2 ), the algorithm takes O( |V|^2 log( |V| ) ) time.

Here is a link to the JAVA implementation of Kruskal’s Algorithm –