Euler Totient Function

In this post I will explain yet another simple number theoretic function, called the Euler Totient Function. This is a simple function which is used indirectly in many problems in programming contests. The function  is denoted as phi(n).  phi(n) basically gives the number of integers greater than 0 and lesser than n+1 which are co-prime to the integer n. In other words, it is the cardinality of the set

S = { a: 1<a<n , gcd(a,n) = 1 }. To calculate this function, we will use a straight forward brute force approach. With a simple analysis, we can show

that the running time of this algorithm is O(sqrt(n)).


- Start with an estimate of the totient function as n
- Iterate for i from 2 to sqrt(n)
- If i | n , we know that all factors of i less than equal
  to n will have gcd with n as atleast i. Hence, they can
 never be in the set S. Hence remove them from the result.
- While i|n, divide n by i.
- After the iteration of i from 2 to sqrt(n),
   if n is a number greater than 1, remove all
   its multiples less than n also from the result

The analysis of this code is similar to  the analysis for prime factorization. Essentially at the core of the algorithm, we are finding prime factors of n and removing all its multiples. Hence the numbers left are the ones that are co-prime to n.

Hence, both correctness and the running time analysis follows from the prime factorization algorithm.

Link to a java implementation.

Number Theoretic Algorithms – Factorization

In the following series of posts, I will write about some of the number theoretic algorithms that one encounters in most of competitive programming. These are fundamental algorithms which most computer scientist should know.

In this post I will post an algorithm and a corresponding JAVA code to perform prime factorization of a give number.

The algorithm to perform prime factorization is a very trivial algorithm and mostly encountered by everyone atleast once in their high school.


- Iterate for i from 2 to sqrt(n)
- While i|n, divide n by i. Report i as a prime factor of n
- After the iteration of i from 2 to sqrt(n),
   if n is a number greater than 1, report that also as a prime factor

The key observation is that, this algorithm runs in O(square root (n) ).

We are able to iterate only upto sqrt(n) since, we can claim one of the following:

1) n is a prime number

or

2) If n has a prime factor p greater than sqrt(n), then there exists a corresponding prime factor p’ lesser than or equal to sqrt(n), such that

p*p’ = n

If (1) doesn’t hold, it implies that n is a composite number. And Let us assume (2) doesnt hold.

This implies for a prime p such that p|n, there is no p’ <= sqrt(n) such that p’|n.

In particular n/p > sqrt(n) . And since p>sqrt(n)

=> 1/p < 1/sqrt(n)

=> n/p <n/sqrt(n) = sqrt(n).

This provides the contradiction. Hence the claim above proves the correctness of the algorithm.

Link to a JAVA implementation of the above factorization.

Floyd-Warshall ‘s All Pair Shortest Path Algorithm

In the previous two posts, I explained two different algorithms to find the shortest path in the graph. In this post I will define another very simple yet important problem in Graph Theory. Its called the “All Pair Shortest Path” problem.

Given a graph G, as a set of vertices V and directed weighted edges E, the aim of the problem is to find the cost of shortest path between every pair of vertices. Cost of a shortest path is defined as the sum of the edge weights in the shortest paths.

At first sight this problem might look really hard to approach. But the following theorem will make this one of the easiest to approach and especially code it.

Theroem:

If A represents the adjacency matrix of the graph, then for an integer k<=|V| , the (i,j) entry in the matrix A^k , gives the shortest path of length atmost k, between i and j in the original graph.

This above theorem can be easily proved using induction on k. I will leave that as a simple exercise.

From the theorem, all that is left to find is the matrix A^n.

A trivial way to compute this would be to do ‘n’ matrix multiplications. This would take a complexity of O(|V|^4).

Instead, Floyd and Warshall gave a Dynamic Programming based approach.

for k from 1 to n:
    for i from 1 to n:
        for j from 1 to n:
            A[i][j] = min(A[i][j],A[i][k]+A[k][j])

The above algorithm has a very simple intuition. The shortest path from i to j can be either the current value, or sum of shortest path from (i,k) and shortest path from (k,j). Instead of calculating this value repeatedly, we will use a dynamic programming approach and calculate once and store it.

The above algorithm has 3 loop each running for O(|V|) time. Hence the overall complexity is O(|V|^3).

A java implementation of the above algorithm can be found here.

Prim’s Algorithm

In the last post, I defined a spanning tree and gave an algorithm called the ‘Kruskal’s Algorithm’. In this post, I am going to describe another algorithm that also helps in computing the minimum spanning tree of a graph. This algorithm is called the ‘Prim’s Algorithm’ . The basic intuition behind the algorithm goes as follows: Firstly, every vertex should be reachable from every other vertex for it to be a tree. So we will try to build the tree by adding one vertex after another into the connected component. Since we want a “minimum” such tree, we will use the edge between the new vertex and the old component, that is of the minimum weight. This intuition is formalized below as an algorithm:

Set ConnectedSet = Pick a random vertex v of vertex set
Set ToBeAddedSet = set of vertices except vertex v
Set ListOfBridgeEdges = Set of Edges

while ToBeAddedSet not empty:
    - Select the minimum edge e from the ListOfBridgeEdges
       such that it has exactly one end in ConnectedSet

    - Add the other end of e to the ConnectedSet and remove
      it from the ToBeAddedSet

Let us now analyze the complexity of the above algorithm:

1) The outer while loop runs for O(|V|) times.

2) Inside each loop iteration, it takes O(|E|) to select the minimum weighted edge such that it has exactly one end in ConnectedSet.

Therefore the overall running time of this algorithm is O(|V| * |E|). And in the worst case |E| is O(|V|^2). Hence, the worst case running time will be O(|V|^3). We can now improve the above algorithm by using a different data structure to store the bridge edges. We will store the edges in a min priority queue(a heap structure).

- Initialise an empty min priority queue Q.
- Store a <key,value> pair in the queue,where key is the comparator for the queue.

- for all the vertices v in V:
    Initialise the key as infinity
    Initialise the value as NULL

- Choose a random vertex v from V.
  Initialise the key as 0.

- Push v into Q
- while Q is not NULL:
     Assign u as the extract minimum from priority queue Q
     for all v neighbour of u:
         if v is present in Q and edgeweight(u,w)< key value of v:
             Assign v->value as u
             Assign v->key as edgeweight(u,w)

Now using the predecessors information present in the v->value, we can build the MST.

Let us now analyse the complexity of this modified algorithm

1) The outermost while loop runs for O(|V|) iterations.

2) For each iteration, the extract minimum from the priority queue takes O(log |V|). Hence the complexity is O(|V| log |V| ) for L13.

3) To analyse the inner loop is slightly tricky. Notice that the total number of times the outer loop + the inner loop executes is exactly 2*|E|. Because each edge is counted twice. Once for each of its end vertices.

4) Each time inside the inner loop, the value of the key is potentially changed. And to insert this back in the priority queue it takes a complexity of O(log |V|). Hence the inner loop overall takes O(|E| log |V|).

Hence, with this modified algorithm, the overall complexity is O(|V| log |V| + |E| log |V|).  And for |E| = O(|V|^2), the algorithm runs in O(|V|^2 log |V|), which is a improvement over O(|V|^3).

A JAVA implementation of the Prim’s Algorithm can be found here.

Kruskal’s Algorithm

I planned to write a serious of posts on working and analysis of some of the basic algorithms. I also planned to give link to a JAVA implementation of the same. I will start off this series by first describing the Kruskal’s Algorithm for finding the cost of the Minimum Spanning Tree in a graph.

Minimum Spanning Tree: A minimum cost spanning tree is a subgraph induced on the original graph such that the induced sub-graph is a tree on all the vertices on the original graph and the sum of weights of the edges on this graph is the minimum among all the possible spanning trees of the graph.

The basic intuition behind the Kruskal’s algorithm is that the least weighted edge of the original graph will appear in the minimum spanning tree. Hence, sorting the edges based on their weight might be a first step in building the spanning tree. Does that mean, the first n-1 least weight edges are sufficient to build the spanning tree (If n is the number of vertices in the graph) ? We also need to ensure that the edges picked for the spanning tree do not form a cycle. Hence, that is the second key observation. The two observations is formalized below as algorithm:

Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list:
    If adding this edge to the MST does not create a cycle:
        Add this edge to the MST
    Else
       Drop this edge and continue to the next edge in the list

The algorithm above can be easily implemented. The only difficult part of this is in line 4. To check if adding an edge will create a cycle. This can be done a subroutine call to a standard search algorithm in the graph, which can detect a cycle.

Let us now analyse the complexity of the above algorithm.

Line 2: O(|E| log( |E| )

Line 4: O( |E| + |V| )

Since Line 4 is repeated O(|E|) times, the overall complexity will be

O( |E|^2 + |E|*|V| + |E| log( |E| ) ) . And in the worst case,

|E| = O( |V|^2 ),therefore this implementation has a worst case complexity of O( |V|^4  ).

We can modify the above algorithm by using a disjoint-set data structure. The working of this data structure can be found here.

Here is the modified algorithm:

Vector Edges = list of edges in the input graph
Sort the edge list based on their edge weight
Iterate through the sorted list of edge E as (u,v):
    Let pu be the parent of vertex u in the disjoint set
    Let pv be the parent of vertex v in the disjoint set
    if pu !=pv:
        Add the edge to the MST
        Make parent of u = parent of v in the disjoint set data structure

Let us now analyse the complexity of this new algorithm.
Line 2: Sorting takes O( |E| log( |E| ) ). This is same as before

The entire loop takes an amortised complexity of O( |E| log( |V| ) ). This is because Line 4 and Line 5, uses the disjoint set find operation.

And this has an amortised complexity of

O( log( |V| ) ). Hence, the overall complexity of the loop is

O( |E| log( |V| ) ). And as before, if we consider the worst case complexity of

|E| = O( |V|^2 ), the algorithm takes O( |V|^2 log( |V| ) ) time.

Here is a link to the JAVA implementation of Kruskal’s Algorithm -

__name__ for Python class and Python function

This is my first post in the series of small random learnings.

The attribute __name__ is by default present in every python function. But suprisingly an object of a class doesnt have this attribute unless specifically specified.


class example():
    def classFunction():
        return "test"

obj = example()
print obj.__name__  #This is an error

On the other hand ,

def functionA():
    return "A"
def functionB(params):
    return params.__name__

print functionB(functionA) # prints functionA

Neo 4J Graph Database

In this post, I will give a brief introduction to Neo 4J database, installation and basic usage with JAVA .

Firstly, Neo4J is a NO-SQL database. It is used to store data in the form of a graph data structure , i.e. a structure having nodes and edges(called relationships) among the various nodes. This form of database finds its use case in many applications such as storing a social network in systems like facebook , twitter ,etc . More information on what Neo 4J can be found here .

Now, coming to the setup of neo 4J graph database. You can download the installation package from the official site here. Save the tar in a local folder and extract the contents. For this post I am going to run a sample code in neo 4j using eclipse(Details about eclipse can be found here).

Open the eclipse IDE and create a new project , say “firstNeo4jApp”. Add a class named graphDb.java in the default package. Now here comes the important step, to add the neo4j libraries to the build path of eclipse.(Improperly configured build path can lead to errors while compiling the code). To add the neo4j libraries , right click on the project name > Build Path . In the window that appears, select libraries tab and click on external JAR. Now, browse to the downloaded neo4j package folder and go to the lib sub folder. Select all the JAR files in that folder and click on OK. Now, refresh the project and the build path will be configured. You will be able to use the neo4j libraries in you code now.

Let us now look at a simple neo4j example.In this example we will see how to create a node ,set relationship between two nodes and how to get the relationship between two nodes.

Creating a Node in neo 4j


public Node createNode() {
	Node n = null;
	Transaction tx = GraphDb.getDb().handle().beginTx();

	try {
		n = this.gdb.createNode();
		n.setProperty("property1", "property1 key");
		tx.success();
	} finally {
		tx.finish();
	}
	return n;
}

Every write operation to the graph in neo4j must be written within a transaction. For this purpose , we will use the interface org.neo4j.graphdb.Transaction. To start a transaction, we will use the beginTx() function of the org.neo4j.graphdb.GraphDatabaseService interface. this interface also provides the function createNode() which creates a node in the graph. We can set numerous properties to the node based on a simple key-value schema. Also, it is important to finish every transaction. The transaction will not release the locks or memory it has acquired until it has been finished.

Set relationship between two nodes

public static void set(Node A, Node B) {
	Transaction tx = GraphDb.getDb().handle().beginTx();
	Relationship relship = null;

	try {
	        RelationshipIndex rolesIdx = GraphDb.getDb().relationshipIndex();
                
                //Creates a relationship between node A and node B  
	        relship = A.createRelationshipTo(B, RelationshipEnum.RELATED);
                
                //Adds special properties to relation; a key-value pair
		relship.setProperty("key 1", "value 1");
                
                //Adds the relationship object to the graph 
		rolesIdx.add(relship, "name", "related");
		rolesIdx.add(relship, "id", relship.getId());
		tx.success();
	} finally {
		tx.finish();
	}
}

public enum RelationshipEnum implements RelationshipType {
		RELATED
}

Again, since a write operation is performed on the graph, we have to enclose it within a transaction. We can also set properties to the edge(or relationship) like in this case, the property is “key 1″ and the value is “value 1″. Finally , the Relationship object is added to the graph. We are intentionally storing it as “name”:”related pair and “id”:id pair,so that we can query based on the name property also.

Get relationship between two nodes

public static List get(Node A, Node B) {
	RelationshipIndex rolesIdx = GraphDb.getDb().relationshipIndex();
        
        // Get the edge(or relationship) between Node A and Node B having the key value pair as "name":"related" 
	IndexHits roles = rolesIdx.get("name", "related", A, B);

	List rships = new ArrayList();

	//Convert the IndexHits to a List 
        try {
             
	    for (Relationship role : roles) {
                rships.add(role);
	    }
	} finally {
		roles.close();
        }

	return rships;

}

As for the querying the graph, since no write operation is performed on the graph, hence , there is no need to enclose it within a transaction.

The above pieces of code gave a overall idea of how to create a node , create a relationship between two nodes and get a particular kind of relationship between two nodes. The entire code for all three can be downloaded here .

Follow

Get every new post delivered to your Inbox.