Analyzing Problems *Properly*

| Sunday, November 2, 2008
I'm writing this as a summary of lessons learnt (through the hard way) since I started studying this year. Some of you might have figured out by now, I'm currently taking CS1102 (Data Structures & Algorithms). It's a pretty good module, and I really enjoy it. However, when it really mattered, I always seem to screw up. I had my Practical Examination yesterday, which consisted of 2 parts, one worth 30%, and the other worth 70%.

For the 30% part, we had the following (something similar, can't remember the exact thing):

Character Sequence:
AXXAIOUAAXXUOIAXAXIOUAAXXAIOU

Test Cases:

  1. AXXAIOU

  2. AIOUAAX

  3. AAXXUOI

  4. AXAIOUA

  5. AAAXXXX

I can't remember the exact scenario, but basically the input consisted of the above character sequence, and the test cases. The test cases will always be in groups of 7 chracters. You are supposed to output the position of the sequence if found, which in the above case, you should output:

0, 22
3
7
14
Not found

Moreover, the question wants you to be able to query in less than O(n) time.

Explanation: AXXAIOU is found at position 0 and 22. AIOUAAX is found at position 3. AAXXUOI is found at position 7. AXAIOUA is found at position 14. Finally, AAAXXXX is not found in the sequence.

A X X A I O U A A X X U O I A X A X I O U A A X X A I O U
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
Analysis: Based on the given example, I deduced that overlapping cases can occur, and that each sequence can occur more than once. So what did I do wrongly? I did not read the question thoroughly enough to know that only the query (test cases) needs to be less than O(n) time. I thought the entire program should not take O(n) time. So I proceeded to implement some methods using String, and still ended up with O(n) time. The truth is, since the character sequence portion does not have to be less than O(n) time, I can simply store it into a hash table as follows:

AXXAIOU
XXAIOUA
XAIOUAA

And so on. That is, I can simply substring(i, 7) as I iterate through the character sequence, and put it into a Hashtable, all the way till I hit length of sequence - 7, which would give me the last substring of:

AXXAIOU

Once everything is inserted into the Hashtable, querying/retrieving it is O(1) time. I have no idea why did I not read the question properly. Sigh. I'll probably still pass, but not get the full marks.

Next, we have the 70% part, which I came SO CLOSE to finishing it, I just UNDER-analyzed it. Basically, suppose you have the following:


7
/ \
5 4
/ / \
6 1 3

You are then supposed to calculate the largest number based on the following:

n1 + (n2 - (n3 + (n4 - (n5 + n6)))...

Meaning, using the above example, there are 3 paths:

7 + (5 - 6) = 6
7 + (4 - 1) = 10
7 + (4 - 3) = 8

Thus, the longest path is 10 in this case. The input you are given is:

7 2 3
5 1 0
6 0 0
4 1 1
1 0 0
3 0 0

Which means, to the left of 7, there are 2 nodes, and to the right of 7, there are 3 nodes. To the left of 5, there is one node, and none to the right. 6 has no child nodes. 4 has one child each on the left and right. Both 1 and 3 have no child nodes. Basically, the first digit of each line is the actual node value, given in pre-order. The second digit means the number of left child nodes, and the third digit means the number of right child nodes. Thus, you have to build the tree out, and then find the longest path.

Analysis: Basically, this problem can be broken up into 2 parts, building the tree, and then finding the longest path. I was able to build the tree relatively quickly. What I did was to have something like the following:
void buildTree(ArrayList<TempNode> nodes) {
    root = buildTree(nodes, 0);
}
Node buildTree(ArrayList<TempNode> nodes, int curr) {
    Node temp = null;

    if (curr < nodes.size()) {

        temp = new Node(nodes.get(curr));
        if (nodes.get(curr).left != 0)
            temp.left = buildTree(nodes, curr+1);
        if (nodes.get(curr).right != 0)
            temp.right = buildTree(nodes, curr+1+nodes.get(curr).right);
        return node;
    }
}
class TempNode {
    protected int value, left, right;
    TempNode(int value, int left, int right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

What I did was to store each line of input into a TempNode object, and all of that is then stored into an ArrayList. The tree is then built recursively, and if you study the input (since it's pre-order), you will realise that the node to the right of the current node is always the position of the current pointer + the number of left nodes of the current node + 1. Take note that the above code might be missing a things because I don't exactly remember everything, but the general concept is that. Thus, I was able to build out the tree. Now comes the longest path part. I made use of a concept similar to finding the height of the tree, which is as follows:
int findHeightOfTree(Node node) {
    if (node == null)
        return 0;
    else {
        return 1 + Math.max(findHeightOfTree(node.left), findHeightOfTree(node.right));
}
What the code does is that, starting at the node you specify (the root node in this case), it will traverse left and right and return the maximum value at that point in time. Thus, if I were to change the code to the following:
int findMaxValue(Node node) {
    if (node == null)
        return 0;
    else {
        return node.value + Math.max(findMaxValue(node.left), findMaxValue(node.right));
}
I would be able to find the maximum value, PROVIDED all the values are added together. Meaning that the above tree would give:

7 + (5 + 6) = 18
7 + (4 + 1) = 12
7 + (4 + 3) = 14

Which is very similar to what the question wants. So I thought, now I just need to add in a boolean to alternate + and -. And the code became:
int findMaxValue(Node node, boolean plus) {
    if (node == null)
        return 0;
    else {
        if (plus)
            return node.value + Math.max(findMaxValue(node.left, !plus), findMaxValue(node.right, !plus));
        else
            return node.value - Math.max(findMaxValue(node.left, !plus), findMaxValue(node.right, !plus));
}
Sadly, it did not work as expected, and I had 30 minutes before the end of the practical. I finally figured out at the 29th minute what were the mistakes, but before I could magically try to code it in 1 min, I had to press some key wrongly in Vim that resulted in it being unable to respond to me (I'm quite new to Vim, and I still don't know what I pressed). Thus, I was REALLY close, but not close enough. Here is what I realised, but could not put into code:

Firstly, I had to return the maximum value at the node itself, which means I should have the following instead:

int findMaxValue(Node node, boolean plus) {
    if (node == null)
        return 0;
    else {
        if (plus)
            return Math.max(node.value + findMaxValue(node.left, !plus), node.value + findMaxValue(node.right, !plus));
        else
            return Math.max(node.value - findMaxValue(node.left, !plus), node.value - findMaxValue(node.right, !plus));
}
I managed to do this part, thankfully. Of course, the output was still wrong. So what did I missed out (which I realised LITERALLY at the last minute)? Look at the following:

7
/ \
5 4
/ / \
6 1 3

Look at Node 5. I was doing a comparison between 5-6, and 5-0 (since 5 has no left child, the code will return 0). Of course, the code will return 5-0 as the maximum value, which is 5. That is what I missed out. I neglected the MINUS case, whereby if a node has either no right or left child, there should be no comparison taking place. For the case of plus, it is ok, because 5+6 is still more than 5+0. Alas, it was too late, and I could not complete what I SHOULD HAVE completed. Sadly =(. I haven't tried actually compiling or writing the whole thing out, but the final correct code should have been something like:
int findMaxValue(Node node, boolean plus) {
    if (node == null)
        return 0;
    else {
        if (plus)
            return Math.max(node.value + findMaxValue(node.left, !plus), node.value + findMaxValue(node.right, !plus));
        else
            if (node.left == null && node.right == null) {
                if (node.left == null) {
                    return node.value - findMaxValue(node.right);
                }
                else {
                    return node.value - findMaxValue(node.left);
                }
            }
            else {
                return Math.max(node.value - findMaxValue(node.left, !plus), node.value - findMaxValue(node.right, !plus));
            }
}
There you have it. I under-analyzed the problem, and neglected one case, and could not get the correct output. Sigh. To be honest, I'm still not quite over it, but I've learnt my lesson, and I think that is important in itself. I just don't understand, why, for 29 minutes, I thought of every possible scenario I could have gone wrong, but this. This simple case, of not taking care of a possible empty child node. Why =(?

Lesson Learnt: Analyze your problems properly. Use a highlighter if necessary. Sometimes luck REALLY isn't on your side. Sigh.

No comments: