Thursday, June 28, 2012

Uniqueness of MST

Question : If no two edges of an undirected weighted graph have equal weights, prove that the Minimum Spanning Tree is unique.

Proof [1]:
First, let us state and prove the Cycle Property.

Statement [2]: "For any cycle C in the graph, if the weight of an edge 'e' of C is larger than the weights of other edges of C, then this edge cannot belong to an MST."

Proof :  Suppose e is a part of an MST, T. Removing e from T splits it into 2 connected components. There must be another edge e' in cycle C with one end in each connected components of T. Also, according to the hypothesis, the weight of e' is less than that of e. So, replacing e' with e in T results in a spanning tree of the graph with a smaller total weight than T, a contradiction. Q.E.D.

Now, suppose the given graph (G) has two different MST's, T1 and T2. Then, there exists and edge e1 in G which is a part of T1, but not of T2. Add e1 to T2. We get a cycle, C.  As all the edges of C, except e1, belong to T2, an MST of G, and since all edge weights are distinct, it follows from the cycle property that e1 is the unique maximum weight edge in C, which in turn implies that it cannot be a part of any MST of G. This is a contradiction, because e1 belongs to T1, which is assumed to be an MST of G. Q.E.D.

Notes :
[1] - This proof was the result of a discussion between Salman and me.
[2] - http://en.wikipedia.org/wiki/Minimum_spanning_tree#Cycle_property

Interviewstreet Challenges : Hints

Interviewstreet Challenges have some excellent programming problems, and solving them has been a great learning experience for me so far. Even for the few problems I have solved, the solutions often involved a non-trivial component, like a data structure, or a paradigm, that I was not already aware of, or had no clue about. And I have no shame in accepting the fact that on many occasions, I Googled for hints, sometimes at length, (after trying to solve it myself for long enough, of course) to figure out the right approach. However, I made sure that I wrote all the code myself, and trust me when I say that this part is not necessarily as easy as it seems. I'm still stuck on many problems despite having figured out the right approach.


In the next few posts (including this one), I will post hints for the problems I have solved so far. I will add to this list as I solve more problems. Ideally, you should look at these hints only after having spent enough time on the questions. If you haven't thought about the question for long enough, then you probably won't understand the hints anyway.


Billboards :
1. Dynamic programming
2. It is easier if you consider which boards are removed (and try to minimize the loss), rather than considering which boards are not removed.
  Did you get the recurrence for the above case? A naive implementation of the recurrence, in which you calculate the minimum in each iteration in O(K), will result in O(N * K) running time, which will time out. 
3. The minimum can be found in O(log K), but it will require some data structure. (does it ring a bell?).
4. Better still, the minimum can be found in O(1), with very little extra work (think about how often you need to actually calculate the minimum, instead of blindly doing it in each iteration).


Equations :
1. Rewrite as A * B = C
2. Think primes.

Permutation Game :
1. N is very small. You can examine all possibilities. See this.
2. Bit-masking will help you avoid a lot of unnecessary work.

Monday, June 25, 2012

Range Minimum Query (RMQ)

Problem Statement : 
Given an array A[0, N-1] find the position of  the element with the minimum value between two given indices.

TopCoder Tutorial : 
Range Minimum Query and Lowest Common Ancestor


Video Tutorial : 












Thursday, June 21, 2012

Next Permutation


Prime Factorization

Prime factorization is a relatively simple problem, and occurs as a part of many algorithmic problems in one form or another. However, we often perform a lot of unnecessary computation when trying to solve it, like generating primes, checking for primality, etc.

Following is an elegant solution to the problem [1]: The above code prints the prime factors of a number. Each prime is printed as many times as it occurs in the prime factorization of the number. The code can be easily modified to count the no. of prime factors, determine exponent of each prime etc.

Do you see why this works?

[1] - http://www.cs.sunysb.edu/~skiena/392/programs/primes.c

Backtracking

Introduction to Backtracking : http://en.wikipedia.org/wiki/Backtracking

Implementation [1]:

Simple Exercise : Use the above template to count the number of permutations of 1, 2, 3... n (n > 5) in which 4 and 5 are not neighbours.

[1] - http://www.cs.sunysb.edu/~skiena/392/programs/

Tuesday, June 19, 2012

Monday, June 18, 2012

1's and 0's

QuestionGiven an array containing only 0's and 1's, find the maximum contiguous subarray containing equal no of 0's and 1's.

Complexity : O(n).

Source : Asked to me by Rajesh Kamineni

My Solution :

Call the original array 'A'. Let it's size be 'n'.
Create 2 arrays 'B' and 'C'of size '2n + 1' each. Initialize each to -1 everywhere.

Then, do the following [2]:
Here's what the above procedure does :
Let us denote, (No. of 1s in A before index i) - (No. of 0s in A before index i) by #(i).

Note that when we enter the i-th iteration, diff = (No. of 1s in A before index i) - (No. of 0s in A before index i) = #i
Also note that the value of diff can range from '-n' to 'n'[1].

As a result, we have
B[j] = the smallest index 'i' in the array A such that, #(i) = j - n,
C[j] = the largest  index 'i' in the array A such that, #(i) = j - n,

i.e. #(B[j]) = #(C[j])

We need the largest sub-array of A with equal number of 1s and 0s.
Suppose the required sub-array is from index 'a' to index 'b' of A (with elements at index 'a' and index 'b' included).
Then, clearly,
#(b+1) = #(a)            (Do you see why?)
And our goal is to maximize (b+1) - (a).

Therefore, we simply have to find the index 'j' such that C[j] - B[j] is maximum. This can done in O(n), in a single pass.
Once the 'j' is found, the required sub-array is from index 'B[j]' to index 'C[j] - 1' of A.

Notes :

[1] - One may further note that diff can take at most n+1 different values, and we don't explicitly require the actual value of diff. So the arrays B and C can be of size 'n + 1' each and the B[diff] can be replaced by B[diff%n].             
(Do you see why?)


[2] - Optimization suggested by Vinod :
Note that 'i' increases from '0' to 'n' as the loop is executed. As a result, the value of 'B[diff + n]' needs to be set only once, for the lowest 'i'. Similarly, subsequent values of 'i' are always greater than 'C[diff + n]', so C[diff + n] needs to be set every time.

The check 'i < n' can also be avoided by increasing the size of A to 'n + 1' elements. It does matter what the value of A[n] is, because the final value of diff is never used.

Here is the optimized code :
Feel free to post alternate solutions, extensions, comments, clarifications, optimizations, and mistakes (if any) as comments.

Thursday, June 14, 2012

Swap Using Bitwise XOR


Power of 2 - Bit Operations

A smart way to determine if a number is a power of 2 using bit operations[1] :

If x is a power of 2, then the binary representation of x looks like :

10000000

Consequently,

x - 1 = 10000000 - 00000001 = 01111111

If we take the bitwise AND of x and x-1, we get 0.

x & (x - 1) = 10000000 & 01111111 = 00000000

On the other hand, if x is not a power of 2, the change in the bits caused by subtracting 1 will only propagate as far as the least significant bit which is set to 1 i.e. the first 1 from the right. Bits to the left to this bit will not be affected. As a result, the most significant bit which is set to 1 i.e the first 1 from the left remains unchanged in x - 1. So, x & (x-1) will not be equal to zero because they have at least one common bit set to 1.

 Here's an example:

x = 10000100

x-1 = 10000100 - 00000001 = 10000011

x & (x - 1) = 10000100 & 10000011 = 10000000

Note that 0 is incorrectly detected as a power of 2 by this algorithm. As a result, one must first check whether x is non-zero.

The following code snippet implements the algorithm :


Wednesday, June 13, 2012

Binomial Coefficients - Part 3

This post exploits the recurrence :
C(n,k) = C(n-1, k) + C(n-1, k-1)

If the limits to n are known, then the above recurrence can be used for dynamic programming.

Implementation:

Binomial Coefficients - Part 2

In many problems, it is required to find the answer modulo a large prime, such as 1000000007 (hereby referred to as MAX or m). If the problem involves binomial coefficients, then they have to be calculated modulo MAX too.

It is easy to calculate factorials modulo MAX, because
(a*b) mod m = (a mod m) * (b mod m)
but the same does not hold for division, which makes it tricky to calculate binomial coefficients.

However, the following relation holds :
(a/b) mod m = (a mod m) * (c mod m)
where 'c' is the modular multiplicative inverse[1] of 'b'.

Thus, the relation
C(n, k) = fact(n)/(fact(n-k)*fact(k))
is reduced to
C(n,k) = fact(n)*invfact(n-k)*invfact(k)
where C(a,b) and fact(a) are expressed modulo m, and invfact(a) denotes the multiplicative inverse of fact(a) modulo m. Multiplication is also performed modulo m.

Implementation :

The above implementation employs efficient routines for modular arithmetric.

[1] - http://en.wikipedia.org/wiki/Modular_multiplicative_inverse

Binomial Coefficients - Part 1

Implementation 1:
int factorial(int n){
 if (n <= 0) return 1;
    int f = 1;
    do { f *= n;} while (--n);
    return f;
}

int C(int n, int k){
    if (n < k) return 0;
    return fact(n)/(fact(k)*fact(n-k));
}
It works for n <= 12.

Implementation 2:

int C(int n, int k){
    if (n < k) return 0;
    if (k > n/2) k = n - k;
    int c = 1;
    for (int i = 1; i <= k; i++) c = (c*(n-i+1))/i; 
    return c;
}
It works for n <= 29.

The important thing to note here is that, because of the way the loop is written, the division never results in a non-integer. (Why? Think about it!)

Greatest Common Divisor

template <class T>
T gcd (T a, T b){
   return b != 0 ? gcd(b, a%b) : a;
}