Archive for November, 2009

We’re mad people

Friday, November 27th, 2009

Two friends and I came up with an idea to come up with a wiki for our internal group usage to help with community building. Know what? It’s an insane effort! Within two weeks, we have a domain name purchased, a wiki set up, several articles at almost complete stage, and 3 mad administrators. Soon there will be more mad people helping us with the wiki.

The best part is: more than 2000 pageviews in the past 9 days! And that’s without much publicity yet! Insane!

P.S. Btw, yes, the post title has ambiguous grammar (well not really, if you are speaking formally, but in informal speech, it is kind of ambiguous).

Google Wave account first preview

Thursday, November 12th, 2009

I finally had a chance to play around with my Google Wave account, now that I’m slightly freer. Well, to be really honest, at first, I was a little lost. I didn’t have many contacts with Wave accounts yet (at least not in my highly contacted group of friends). Plus, I was clueless on what I wanted to do. Well, the most natural step to take would be to invite more friends to join Wave. That was exactly what I did. I invited a couple of friends to Wave. (Yes, I was pretty lucky to have a Wave account from a friend who is a Googler, hence the extra invites.)

Didn’t take long before a few of them got the invites soon afterwards. I think it takes the Wave team at least a day or two to send an invite once you nominate your friends. Anyway, now we have more interesting stuffs to do.

The first wave: I started a wave on the recipe of the chicken katsu my girlfriend and I cooked last weekend. I might not look like it, but I do cook often enough, and the katsu was really good! Even my girlfriend’s sister, who has a high standard for food, liked it. Well, back to the point, it was kind of fun to write the recipe in Wave. But wait a sec, I can do that just using Google Docs, can’t I? Or even this blog. Well, let’s put that thought on hold for a moment…

The second wave: my girlfriend went online and she started playing with her Wave account. She was initially confused. So we both started a wave (oh, and I shared the katsu recipe wave with her) and started experimenting. This is where Wave really shines. We were able to edit any of the sub-waves together and the changes appear live on the other person’s Wave account. I can see exactly where she’s editing and can even reply to her post before she finished writing. We also played (well, I did) with the playback function. Really cool!

The third wave: well, I am supposed to write an article for this small blog-based publication that my friend is running. Since she has a Wave account too (and, like me, has no idea what to do with it), I suggested that we might want to try writing the article as a wave. My idea would be to write the article (outline and the actual article), brainstorms, and have her editing the article as necessary. Thought that is a neat idea, but we’ll only be able to test that next week since I’m still kind of busy this week. Seems fun though! Imagine live collaboration on an article, bouncing ideas through a wave or gtalk, while writing the article at the same time. Plus, I’ll be able to see her edits appearing live on my wave.

Well, anyway, back to the thought that I put on hold some moments ago: Google Wave equals Google Docs? Is that true? Well, yes and no! Wave can indeed work like Google Docs (well, right now minus the spreadsheet/presentation stuffs, but I bet there will be a gadget to take care of that soon). And that is indeed the point! Wave is Google Docs + GTalk + GMail + wiki! It’s like this all-in-one, all encompassing collaborative solutions. Cool? Hell yeah. Will it work? Only time will tell.

Well, I had fun waving today. But I shall stop here. Still have a presentation to write for Monday and I need to get a draft of it by tomorrow. ):

Largest value sub-array [part 1]

Wednesday, November 11th, 2009

I was teaching a bunch of freshman the other day. One of the questions that came up was to determine the largest value sub-array given an array of integers. Obviously, the value is the array can be negative. There are two variants of the questions, one to find the actual maximum value when each of the members of the sub-array are added together, and the second variant is to actually find the start and ending location of the sub-array.

We will first consider the first variant to arrive at optimum answer.

A simple answer would be to perform brute force search in the array, considering each position as possible starting and ending point. Time complexity: O(n^2). Not very good eh? Can we improve on this? Yes, we can if we use dynamic programming. For many of us, the algorithm came up as obvious, but I realize that it is really not that simple for students new to programming to come up with a correct algorithm. To answer this, I tried to explain the concept behind the algorithm: for each element of the array, we find the maximum value sub-array that ends at that element. This value can either be the value of the element itself (meaning starting at the element and ending at the particular element) or the maximum value sub-array ending at the previous element added to the value of the element. (The latter only occurs if the maximum value sub-array at previous location is not negative.) There you go, we have a recurrence relation: given an array A, let v(k) be the maximum value of any sub-array ending at position k, where k ranging from 0 inclusive to N exclusive, N is the length of the array. Hence we have:

v(0) = A[0]
v(k) = max{v(k-1) + A[k], A[k]} if k > 0

The maximum value will be given by max{v_i} for i from 0 to N-1. To implement this in Java, we make several optimization. Firstly, we can opt for recursive method or dynamic programming. Both will take O(n) time but since we’re using Java, we would rather use dynamic programming since there is no tail call optimization in Java. Then, rather than finding the maximum value using max{v_i} at the end, we keep a running maximum throughout the running of the algorithm and return it. We also note that v(k-1) + A[k] will only be greater than A[k] iff v(k-1) is greater than 0. Here is how the code goes (this is a rather faithful implementation, it uses the same variable names as the recurrence relation above to make it easier to see):

public static int calculateMaxValue(int[] A) {
  int[] v = new int[A.length];
  int max = v[0] = A[0];
  for (int i = 1; i < A.length; ++i) {
    v[i] = A[i];
    if (v[i-1] > 0) {
      v[i] += v[i-1];
    }

    if (v[i] > max) {
      max = v[i];
    }
  }
  return max;
}

This algorithm has O(n) time complexity.

What else can we do to optimize this code? Well, we don’t really need to keep an array of maximum values. We notice that at any point of time, to calculate the value of v(k), we only need the value of A[k] and v(k-1). We just turn an O(n) space requirement to O(1).

public static int calculateMaxValue(int[] A) {
  int max = A[0], current = A[0];
  for (int i = 1; i < A.length; ++i) {
    if (current < 0) current = 0;
    current += A[i];

    if (current > max) max = current;
  }
  return max;
}

Voila! We shall continue with the second variant next time. (:

Random sample from a long linked list

Thursday, November 5th, 2009

Sorry for the long no-update period. I had quite a few posts that I wrote half way in my posting queues but haven’t got the time to finish them since I was very busy with work and research. Plus I’m currently learning basic German and that really sucks out a lot of my free time. (Though German is really, really fun; I really hope that one day, I can actually work—maybe temporarily, maybe permanently— in Germany.)

Anyway, a friend of mine is currently applying for jobs in several “big” companies (you know, the Google-Microsoft-Apple kind?). So he was preparing for his interview by hunting quite a few online questions. One of the questions he asked reminded me of a question I used to use to exemplify the kind of questions these companies used in their technical interview:

Given a head pointer to a (possibly very long) linked list of unknown length, devise a method to sample m members at random. Ensure that the method yields equal probability for each element in the linked list to be picked.

Obviously we can traverse the list once to find the length, generate m random numbers from 1 to length of the list and traverse the list another time to pick up these items. A common question would be to prevent you from going through the list more than once. Say the list is very long, the disk I/O you need to get the items from harddisk (even after making sure the items are arranged well that it minimizes cache misses) would make traversal a very expensive operation and hence should be minimized.

The solution I have is inspired by the fact that if the linked list contained m or less elements, I have no choice but to pick all m of them. Hence, the main idea of the solution would be to pick the first m elements and keeping track of k—the number of elements we have seen so far. When we encounter a new element, the new element has an m/k chance of being picked. If it is picked, we dropped 1 of the previously selected m elements at random. We can easily prove that this indeed yields the correct probability via mathematical induction.

Now of course it won’t be fun to implement this in Java or C++, so I decided to use Scheme to code this, just for the fun of it. The solution is surprisingly short and pretty. Here goes:

(define (sample-m lst m)
  (define (helper curr k m-vec)
    (if (empty? curr)
        (if (< k m)
            lst
            (vector->list m-vec))
        (let ((pos (random k)))
          (cond ((<= k m)
                 (vector-set! m-vec
                              (- k 1)
                              (car curr)))
                ((< pos m)
                 (vector-set! m-vec
                              pos
                              (car curr))))
          (helper (cdr curr) (+ k 1) m-vec))))
  (helper lst 1 (make-vector m)))

;;; Let's test this!
(sample-m (build-list 1000 values) 7)

EDIT: I decided to try writing a version for Java as well. Here is the Java code:

public class LinkedListSampler {
  private static Random rand = new Random();
  public static <T> List<T> sample(
      LinkedList<T> list, int m) {
    ArrayList<T> samples = new ArrayList<T>(m);
    int k = 0;
    for (T element : list) {
      int pos = k++;
      if (pos < m) {
        samples.add(element);
      } else if ((pos = rand.nextInt(k)) < m) {
        samples.set(pos, element);
      }
    }
    return samples;
  }
}

EDIT 2: Btw, seriously, the Java code is just for fun. Just imagine that LinkedList does not have a size method, will ya? ;)