CCFinder: a multilinguistic token-based code clone detection system for large scale source code

A code clone is a portion of code in source files that is identical or similar to another portion. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of a transformation of the input source text followed by a token-by-token comparison. To implement it, together with several useful optimization techniques, we have developed a tool named CCFinder (Code Clone Finder), which extracts code clones from C, C++, Java, COBOL, and other source files. In addition, metrics for the code clones have been developed. To evaluate the usefulness of CCFinder and the metrics, we conducted several case studies in which we applied the new tool to the source code of JDK, FreeBSD, NetBSD, Linux, and many other systems. As a result, CCFinder found clones effectively, and the metrics identified the characteristics of the systems. In addition, we have compared the proposed technique with other clone detection techniques.
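
The two steps named in the abstract, a transformation of the source text and a token-by-token comparison, can be illustrated with a small Python sketch. This is not CCFinder's implementation: the lexer, the keyword subset, the placeholder tokens `$id` and `$num`, and the function names `normalize` and `find_clone_pairs` are all assumptions made for illustration, and the quadratic comparison shown here would not scale to the large systems the paper targets.

```python
import re

# Hypothetical, simplified lexer: identifiers, integers, or any other single character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"if", "else", "for", "while", "return", "int"}  # illustrative subset only


def normalize(source):
    """Transformation step (assumed form): replace identifiers and numbers by placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            tokens.append("$id")
        elif tok.isdigit():
            tokens.append("$num")
        else:
            tokens.append(tok)
    return tokens


def find_clone_pairs(a, b, min_len=5):
    """Comparison step: return (start_a, start_b, length) for every maximal
    run of equal tokens shared by sequences a and b with length >= min_len."""
    pairs = []
    prev = [0] * (len(b) + 1)  # prev[j]: length of the common run ending at a[i-2], b[j-1]
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # The run is maximal if it cannot be extended by one more token.
                ends_here = i == len(a) or j == len(b) or a[i] != b[j]
                if ends_here and cur[j] >= min_len:
                    pairs.append((i - cur[j], j - cur[j], cur[j]))
        prev = cur
    return pairs


if __name__ == "__main__":
    left = normalize("int total = 0; for (i = 0; i < n; i++) total += x[i];")
    right = normalize("int sum = 0; for (k = 0; k < m; k++) sum += y[k];")
    for start_a, start_b, length in find_clone_pairs(left, right, min_len=10):
        print(start_a, start_b, length)
```

Because identifiers and literals are replaced by placeholders before the comparison, the two loops in the usage example are reported as clones even though every variable name differs.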

References in zbMATH (referenced in 12 articles)

  1. Qu, Wei; Jia, Yuanyuan; Jiang, Michael: Pattern mining of cloned codes in software systems (2014)
  2. Bettenburg, Nicolas; Shang, Weiyi; Ibrahim, Walid M.; Adams, Bram; Zou, Ying; Hassan, Ahmed E.: An empirical study on inconsistent changes to code clones at the release level (2012)
  3. Arbuckle, Tom: Studying software evolution using artefacts’ shared information content (2011)
  4. Bakota, Tibor: Tracking the evolution of code clones (2011)
  5. Tiarks, Rebecca; Koschke, Rainer; Falke, Raimar: An extended assessment of type-3 clones as detected by state-of-the-art tools (2011)
  6. Ferrari, Remo; Miller, James A.; Madhavji, Nazim H.: A controlled experiment to assess the impact of system architectures on new system requirements (2010)
  7. Thummalapenta, Suresh; Cerulo, Luigi; Aversano, Lerina; Di Penta, Massimiliano: An empirical study on the maintenance of source code clones (2010)
  8. Roy, Chanchal K.; Cordy, James R.; Koschke, Rainer: Comparison and evaluation of code clone detection techniques and tools: A qualitative approach (2009)
  9. Kapser, Cory J.; Godfrey, Michael W.: “Cloning considered harmful” considered harmful: Patterns of cloning in software (2008)
  10. Reformat, Marek; Chai, Xinwei; Miller, James: On the possibilities of (pseudo-) software cloning from external interactions (2008)
  11. Sim, Susan Elliott; Di Penta, Massimiliano: Guest editors’ introduction: Special issue from the 13th working conference on reverse engineering (WCRE 2006) (2008)
  12. Tairas, Robert; Gray, Jeff: An information retrieval process to aid in the analysis of code clones (2008)