CodeQuest: scalable source code queries with datalog. Source code querying tools allow programmers to explore relations between different parts of the code base. This paper describes such a tool, named codeQuest. It combines two previous proposals, namely the use of logic programming and database systems. As the query language we use safe Datalog, which was originally introduced in the theory of databases. That provides just the right level of expressiveness; in particular recursion is indispensable for source code queries. Safe Datalog is like Prolog, but all queries are guaranteed to terminate, and there is no need for extra-logical annotations. Our implementation of Datalog maps queries to a relational database system. We are thus able to capitalise on the query optimiser provided by such a system. For recursive queries we implement our own optimisations in the translation from Datalog to SQL. Experiments confirm that this strategy yields an efficient, scalable code querying system.

This software is also peer reviewed by journal TOMS.