Case Study of a Learning Algorithm for the Longest Common Subsequence Problem Eric A. Breimer Mark Goldberg Based on the behavior of a supervisor-algorithm, we present a learning algorithm which designs an improved algorithm for solving the Longest Common Subsequence problem (LCS). The supervisor is the standard dynamic programming algorithm (DP) for LCS. The learning algorithm applies DP to inputs generated randomly according to a probability distribution D. The optimal solutions generated by DP are used to build a search area. The search area is then used to develop a new algorithm tailored for solving the LCS problem for inputs generated according to D. We present experiments showing the learning curve of the algorithm and the performance parameters of the new LCS-algorithm after a prescribed approximation ratio has being achieved. In particular, our experiments with two random 0,1-sequences of length n suggest that applying O(n0:630) training samples guarantees an LCS-algorithm whose approximation ratio is 0.95 and the running time is O(n1:654) (vs the quadratic time of DP). Our experiments also indicate a relationship between the distribution of the input symbols and the structure of the search area. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY cs-98-03
Case Study of a Learning Algorithm for the Longest Common Subsequence Problem
Eric A. Breimer
Mark Goldberg
Based on the behavior of a supervisor-algorithm, we present a learning algorithm which designs an improved algorithm for solving the Longest Common Subsequence problem (LCS). The supervisor is the standard dynamic programming algorithm (DP) for LCS. The learning algorithm applies DP to inputs generated randomly according to a probability distribution D. The optimal solutions generated by DP are used to build a search area. The search area is then used to develop a new algorithm tailored for solving the LCS problem for inputs generated according to D. We present experiments showing the learning curve of the algorithm and the performance parameters of the new LCS-algorithm after a prescribed approximation ratio has being achieved. In particular, our experiments with two random 0,1-sequences of length n suggest that applying O(n0:630) training samples guarantees an LCS-algorithm whose approximation ratio is 0.95 and the running time is O(n1:654) (vs the quadratic time of DP). Our experiments also indicate a relationship between the distribution of the input symbols and the structure of the search area.
Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY
cs-98-03