When a successful software system is maintained and evolved over an extended period of time, its original design documents become obsolete and its design rationale is lost, so reverse engineering activities that reconstruct this information become critical to the software's continued viability.

Pattern matching provides a solid framework for identifying higher-level abstractions that may be instances of predefined plans (commonly used algorithms and cliches), programming concepts, or abstract data types and operations. This thesis discusses two types of pattern-matching techniques developed for plan recognition in program understanding.

The first type is based on software metrics and dynamic programming techniques that allow statement-level comparison of the feature vectors that characterize source code statements. This type of pattern matching is used to identify similar code fragments and code cloning, thus facilitating code modularization, code restructuring, and efficient localization of similar programming errors.

The second type addresses the problem of establishing correspondences between the parse tree of a description written in a custom-developed abstract description language (ACL) and the parse tree of the code. Matching abstract representations against source code representations involves an alignment that is again performed with a dynamic programming algorithm, which compares the feature vectors of the abstract descriptions and of the source code. The use of a statistical formalism allows a score (a probability) to be assigned to every match that is attempted. Incomplete or imperfect matches are also possible, which leaves the final decision on the candidates proposed by the matcher to the software engineer.

The system has been implemented to analyze software systems written in PL/AS and C.
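
The statement-level comparison described for the first technique can be illustrated with a minimal sketch. It is not the thesis's implementation: the Euclidean distance, the fixed `gap` penalty, the function names `distance` and `align`, and the three-component metric vectors are all assumptions chosen for illustration. Each code fragment is represented as a list of per-statement feature vectors, and dynamic programming finds the cheapest alignment between two fragments, allowing statements to be skipped so that imperfect matches are still reported.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two statement feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def align(
    frag_a: List[Vector], frag_b: List[Vector], gap: float = 1.0
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two fragments statement by statement with dynamic programming.

    Returns the total alignment cost and the list of matched statement
    index pairs (index in frag_a, index in frag_b). `gap` is a hypothetical
    penalty for leaving a statement unmatched.
    """
    n, m = len(frag_a), len(frag_b)
    # cost[i][j]: cheapest alignment of the first i statements of frag_a
    # with the first j statements of frag_b.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], move[i][0] = i * gap, "skip_a"
    for j in range(1, m + 1):
        cost[0][j], move[0][j] = j * gap, "skip_b"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + distance(frag_a[i - 1], frag_b[j - 1]), "match"),
                (cost[i - 1][j] + gap, "skip_a"),
                (cost[i][j - 1] + gap, "skip_b"),
            ]
            cost[i][j], move[i][j] = min(options)

    # Trace back from the bottom-right cell to recover the matched pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "skip_a":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


if __name__ == "__main__":
    # Hypothetical metric vectors, one per statement.
    fragment_a = [(1.0, 0.0, 2.0), (3.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    fragment_b = [(1.0, 0.0, 2.0), (0.0, 0.0, 1.0)]
    total, matched = align(fragment_a, fragment_b)
    print(total, matched)
```

A low total cost suggests that the two fragments are candidate clones. In this sketch the cost is a plain distance rather than the probability score the thesis assigns to each match, so judging whether a candidate is a true match would still fall to the engineer.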