Saturday 28 December 2013

Optimal Binary Search Trees

A Sample Set of Word Probabilities

Key Probability
and .150
cabbage .025
come .050
has .025
king .050
of .125
pig .025
ring .075
said .075
talk .050
the .150
thing .075
time .050
walrus .025
wing .050
Given the search probabilities listed above, if these words were organized in a balanced binary search tree, the average weighted search cost would be 3.35.

The Optimal BST Algorithm

Definitions
A(low, high, r)
The minimum weighted cost for subproblem (low, high) when key Kr is chosen as the root of its binary search tree.
A(low, high)
The minimum weighted cost for subproblem (low, high) over all choices of root key.
p(low, high)
The probability that the key being searched for is some key in the range Klow through Khigh; the weight of the subproblem (low, high).

Formulas
p(low, high) = plow + ... + phigh
A(low, high, r) = pr + p(low, r - 1) + A(low, r - 1) + p(r + 1, high) + A(r + 1, high)
      = p(low, high) + A(low, r - 1) + A(r + 1, high)
A(low, high,) = min {A(low, high, r) | low <= r <= high}
Code
Code for the optimal BST algorithm in Java: highlighted     text
Output: Cost Matrix
    0      1      2      3      4      5      6      7      8      9      10     11     12     13     14     15
  0.000  0.150  0.200  0.325  0.400  0.575  0.925  1.000  1.225  1.450  1.625  2.125  2.375  2.575  2.700  2.925   1
         0.000  0.025  0.100  0.150  0.275  0.550  0.600  0.775  1.000  1.175  1.675  1.900  2.100  2.225  2.400   2
                0.000  0.050  0.100  0.225  0.475  0.525  0.700  0.925  1.100  1.575  1.800  2.000  2.125  2.300   3
                       0.000  0.025  0.100  0.300  0.350  0.525  0.750  0.925  1.350  1.575  1.775  1.900  2.075   4
                              0.000  0.050  0.225  0.275  0.450  0.675  0.850  1.250  1.475  1.675  1.800  1.975   5
                                     0.000  0.125  0.175  0.350  0.550  0.700  1.100  1.325  1.500  1.600  1.775   6
                                            0.000  0.025  0.125  0.275  0.400  0.750  0.925  1.075  1.175  1.350   7
                                                   0.000  0.075  0.225  0.325  0.675  0.825  0.975  1.075  1.250   8
                                                          0.000  0.075  0.175  0.450  0.600  0.750  0.850  1.025   9
                                                                 0.000  0.050  0.250  0.400  0.550  0.650  0.825  10
                                                                        0.000  0.150  0.300  0.450  0.550  0.725  11
                                                                               0.000  0.075  0.175  0.250  0.375  12
                                                                                      0.000  0.050  0.100  0.225  13
                                                                                             0.000  0.025  0.100  14
                                                                                                    0.000  0.050  15
                                                                                                           0.000  16

Output: Root Matrix
Entries highlighted in red indicate the roots of the subtrees of the optimized BST.
      0      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15
     -1      1      1      1      1      1      3      3      6      6      6      6      6      6      6      6   1
            -1      2      3      3      3      5      6      6      6      6      6      9      9      9     11   2
                   -1      3      3      4      5      6      6      6      6      9      9      9      9     11   3
                          -1      4      5      6      6      6      6      6      9      9      9      9     11   4
                                 -1      5      6      6      6      6      6      9      9      9      9     11   5
                                        -1      6      6      6      8      8      9      9     11     11     11   6
                                               -1      7      8      8      9      9     11     11     11     11   7
                                                      -1      8      8      9      9     11     11     11     11   8
                                                             -1      9      9     11     11     11     11     11   9
                                                                    -1     10     11     11     11     11     11  10
                                                                           -1     11     11     11     11     11  11
                                                                                  -1     12     12     13     13  12
                                                                                         -1     13     13     14  13
                                                                                                -1     14     15  14
                                                                                                       -1     15  15
                                                                                                              -1  16

The Optimal BST
of
/ \
and the
\ / \
come said time
/ \ / / \
cabbage king ring thing wing
/ / /
has pig walrus





No comments:

Post a Comment