ANALYSIS OF DECOMPILED PROGRAM CODE USING ABSTRACT SYNTAX TREES
N. A. Gribkov, T. D. Ovasapyan, D. A. Moskvin
Peter the Great St. Petersburg Polytechnic University (SPbPU)
Abstract: The paper proposes a method for preprocessing binary code fragments so that their similarity can be detected with machine learning algorithms. The method analyzes the pseudocode produced by decompilation and preprocesses it using attributed abstract syntax trees. Evaluation indicates that the method is effective for binary code similarity detection, owing to the semantic vectors used to modify the abstract syntax trees.
Keywords: code clones, syntactic similarity, semantic similarity, binary code similarity, abstract syntax tree, pseudocode.