Euler Totient Function

In this post I will explain yet another simple number theoretic function, called the Euler Totient Function. This is a simple function which is used indirectly in many problems in programming contests. The function  is denoted as phi(n).  phi(n) basically gives the number of integers greater than 0 and lesser than n+1 which are co-prime to the integer n. In other words, it is the cardinality of the set

S = { a: 1<a<n , gcd(a,n) = 1 }. To calculate this function, we will use a straight forward brute force approach. With a simple analysis, we can show

that the running time of this algorithm is O(sqrt(n)).

- Start with an estimate of the totient function as n
- Iterate for i from 2 to sqrt(n)
- If i | n , we know that all factors of i less than equal
  to n will have gcd with n as atleast i. Hence, they can
 never be in the set S. Hence remove them from the result.
- While i|n, divide n by i.
- After the iteration of i from 2 to sqrt(n),
   if n is a number greater than 1, remove all
   its multiples less than n also from the result

The analysis of this code is similar to  the analysis for prime factorization. Essentially at the core of the algorithm, we are finding prime factors of n and removing all its multiples. Hence the numbers left are the ones that are co-prime to n.

Hence, both correctness and the running time analysis follows from the prime factorization algorithm.

Link to a java implementation.


Number Theoretic Algorithms – Factorization

In the following series of posts, I will write about some of the number theoretic algorithms that one encounters in most of competitive programming. These are fundamental algorithms which most computer scientist should know.

In this post I will post an algorithm and a corresponding JAVA code to perform prime factorization of a give number.

The algorithm to perform prime factorization is a very trivial algorithm and mostly encountered by everyone atleast once in their high school.

- Iterate for i from 2 to sqrt(n)
- While i|n, divide n by i. Report i as a prime factor of n
- After the iteration of i from 2 to sqrt(n),
   if n is a number greater than 1, report that also as a prime factor

The key observation is that, this algorithm runs in O(square root (n) ).

We are able to iterate only upto sqrt(n) since, we can claim one of the following:

1) n is a prime number


2) If n has a prime factor p greater than sqrt(n), then there exists a corresponding prime factor p’ lesser than or equal to sqrt(n), such that

p*p’ = n

If (1) doesn’t hold, it implies that n is a composite number. And Let us assume (2) doesnt hold.

This implies for a prime p such that p|n, there is no p’ <= sqrt(n) such that p’|n.

In particular n/p > sqrt(n) . And since p>sqrt(n)

=> 1/p < 1/sqrt(n)

=> n/p <n/sqrt(n) = sqrt(n).

This provides the contradiction. Hence the claim above proves the correctness of the algorithm.

Link to a JAVA implementation of the above factorization.

Floyd-Warshall ‘s All Pair Shortest Path Algorithm

In the previous two posts, I explained two different algorithms to find the shortest path in the graph. In this post I will define another very simple yet important problem in Graph Theory. Its called the “All Pair Shortest Path” problem.

Given a graph G, as a set of vertices V and directed weighted edges E, the aim of the problem is to find the cost of shortest path between every pair of vertices. Cost of a shortest path is defined as the sum of the edge weights in the shortest paths.

At first sight this problem might look really hard to approach. But the following theorem will make this one of the easiest to approach and especially code it.


If A represents the adjacency matrix of the graph, then for an integer k<=|V| , the (i,j) entry in the matrix A^k , gives the shortest path of length atmost k, between i and j in the original graph.

This above theorem can be easily proved using induction on k. I will leave that as a simple exercise.

From the theorem, all that is left to find is the matrix A^n.

A trivial way to compute this would be to do ‘n’ matrix multiplications. This would take a complexity of O(|V|^4).

Instead, Floyd and Warshall gave a Dynamic Programming based approach.

for k from 1 to n:
    for i from 1 to n:
        for j from 1 to n:
            A[i][j] = min(A[i][j],A[i][k]+A[k][j])

The above algorithm has a very simple intuition. The shortest path from i to j can be either the current value, or sum of shortest path from (i,k) and shortest path from (k,j). Instead of calculating this value repeatedly, we will use a dynamic programming approach and calculate once and store it.

The above algorithm has 3 loop each running for O(|V|) time. Hence the overall complexity is O(|V|^3).

A java implementation of the above algorithm can be found here.

Prim’s Algorithm

In the last post, I defined a spanning tree and gave an algorithm called the ‘Kruskal’s Algorithm’. In this post, I am going to describe another algorithm that also helps in computing the minimum spanning tree of a graph. This algorithm is called the ‘Prim’s Algorithm’ . The basic intuition behind the algorithm goes as follows: Firstly, every vertex should be reachable from every other vertex for it to be a tree. So we will try to build the tree by adding one vertex after another into the connected component. Since we want a “minimum” such tree, we will use the edge between the new vertex and the old component, that is of the minimum weight. This intuition is formalized below as an algorithm:

Set ConnectedSet = Pick a random vertex v of vertex set
Set ToBeAddedSet = set of vertices except vertex v
Set ListOfBridgeEdges = Set of Edges

while ToBeAddedSet not empty:
    - Select the minimum edge e from the ListOfBridgeEdges
       such that it has exactly one end in ConnectedSet

    - Add the other end of e to the ConnectedSet and remove
      it from the ToBeAddedSet

Let us now analyze the complexity of the above algorithm:

1) The outer while loop runs for O(|V|) times.

2) Inside each loop iteration, it takes O(|E|) to select the minimum weighted edge such that it has exactly one end in ConnectedSet.

Therefore the overall running time of this algorithm is O(|V| * |E|). And in the worst case |E| is O(|V|^2). Hence, the worst case running time will be O(|V|^3). We can now improve the above algorithm by using a different data structure to store the bridge edges. We will store the edges in a min priority queue(a heap structure).

- Initialise an empty min priority queue Q.
- Store a <key,value> pair in the queue,where key is the comparator for the queue.

- for all the vertices v in V:
    Initialise the key as infinity
    Initialise the value as NULL

- Choose a random vertex v from V.
  Initialise the key as 0.

- Push v into Q
- while Q is not NULL:
     Assign u as the extract minimum from priority queue Q
     for all v neighbour of u:
         if v is present in Q and edgeweight(u,w)< key value of v:
             Assign v->value as u
             Assign v->key as edgeweight(u,w)

Now using the predecessors information present in the v->value, we can build the MST.

Let us now analyse the complexity of this modified algorithm

1) The outermost while loop runs for O(|V|) iterations.

2) For each iteration, the extract minimum from the priority queue takes O(log |V|). Hence the complexity is O(|V| log |V| ) for L13.

3) To analyse the inner loop is slightly tricky. Notice that the total number of times the outer loop + the inner loop executes is exactly 2*|E|. Because each edge is counted twice. Once for each of its end vertices.

4) Each time inside the inner loop, the value of the key is potentially changed. And to insert this back in the priority queue it takes a complexity of O(log |V|). Hence the inner loop overall takes O(|E| log |V|).

Hence, with this modified algorithm, the overall complexity is O(|V| log |V| + |E| log |V|).  And for |E| = O(|V|^2), the algorithm runs in O(|V|^2 log |V|), which is a improvement over O(|V|^3).

A JAVA implementation of the Prim’s Algorithm can be found here.

Kruskal’s Algorithm

I planned to write a serious of posts on working and analysis of some of the basic algorithms. I also planned to give link to a JAVA implementation of the same. I will start off this series by first describing the Kruskal’s Algorithm for finding the cost of the Minimum Spanning Tree in a graph.

Minimum Spanning Tree: A minimum cost spanning tree is a subgraph induced on the original graph such that the induced sub-graph is a tree on all the vertices on the original graph and the sum of weights of the edges on this graph is the minimum among all the possible spanning trees of the graph.

The basic intuition behind the Kruskal’s algorithm is that the least weighted edge of the original graph will appear in the minimum spanning tree. Hence, sorting the edges based on their weight might be a first step in building the spanning tree. Does that mean, the first n-1 least weight edges are sufficient to build the spanning tree (If n is the number of vertices in the graph) ? We also need to ensure that the edges picked for the spanning tree do not form a cycle. Hence, that is the second key observation. The two observations is formalized below as algorithm:

Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list:
    If adding this edge to the MST does not create a cycle:
        Add this edge to the MST
       Drop this edge and continue to the next edge in the list

The algorithm above can be easily implemented. The only difficult part of this is in line 4. To check if adding an edge will create a cycle. This can be done a subroutine call to a standard search algorithm in the graph, which can detect a cycle.

Let us now analyse the complexity of the above algorithm.

Line 2: O(|E| log( |E| )

Line 4: O( |E| + |V| )

Since Line 4 is repeated O(|E|) times, the overall complexity will be

O( |E|^2 + |E|*|V| + |E| log( |E| ) ) . And in the worst case,

|E| = O( |V|^2 ),therefore this implementation has a worst case complexity of O( |V|^4  ).

We can modify the above algorithm by using a disjoint-set data structure. The working of this data structure can be found here.

Here is the modified algorithm:

Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list of edge E as (u,v):
    Let pu be the parent of vertex u in the disjoint set
    Let pv be the parent of vertex v in the disjoint set
    if pu !=pv:
        Add the edge to the MST
        Make parent of u = parent of v in the disjoint set data structure

Let us now analyse the complexity of this new algorithm.
Line 2: Sorting takes O( |E| log( |E| ) ). This is same as before

The entire loop takes an amortised complexity of O( |E| log( |V| ) ). This is because Line 4 and Line 5, uses the disjoint set find operation.

And this has an amortised complexity of

O( log( |V| ) ). Hence, the overall complexity of the loop is

O( |E| log( |V| ) ). And as before, if we consider the worst case complexity of

|E| = O( |V|^2 ), the algorithm takes O( |V|^2 log( |V| ) ) time.

Here is a link to the JAVA implementation of Kruskal’s Algorithm –