Abstract [eng] |
This paper investigates a problem of source code comparison with a goal of finding functional code duplicates. The goal of this paper is to create and present a universal source code comparison method, which would be resistant to trivial changes. Two algorithms were chosen to achieve this goal: exact tree edit distance and approximate pq-gram, which has better performance. Both algorithms were practically applied for Java code comparison: an AST parser and a program, which compares XML documents or Java classes, were created. This paper also evaluates AST transformations as well as presents experimental results of p and q values best suited for code comparison. Args4j library is used to evaluate the quality of source code comparison results. |