FAUST

FAUST: An algorithm for extracting functionally relevant templates from protein structures. FAUST (Functional Annotations Using Structural Templates) is an algorithm for extracting functionally relevant templates from protein structures and for using such templates to annotate novel structures. Proteins and structural templates are represented as colored, undirected graphs with atoms as nodes and interatomic distances as edge weights. Node colors are based on the chemical identities of atoms. Edge labels are considered equivalent if the interatomic distances for the corresponding nodes (atoms) differ by less than a threshold value. We define a FAUST structural template as a common subgraph of a set of graphs corresponding to two or more functionally related proteins. Pairs of functionally related protein structures are searched for sets of chemically equivalent atoms whose interatomic distances are conserved in both structures. Structural templates resulting from such pairwise searches are then combined to maximize classification performance on a training set of non-redundant protein structures. The resulting structural template provides a new language for describing structure-function relationships in proteins. These templates are used to identify active and binding sites in protein structures. We demonstrate structural template extraction results for the highly divergent family of serine proteases. We compare FAUST templates to the standard description of active site pattern conservation in serine proteases and demonstrate the depth of information captured by such a description. We also present preliminary results of high-throughput protein structure database annotation with a comprehensive library of FAUST templates.
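
The pairwise search described above can be illustrated with a small sketch. It is not the FAUST implementation: the abstract does not say how the common subgraph is found, so this example assumes a standard technique, a maximum clique search in a correspondence graph. The names `correspondence_graph` and `max_clique`, the toy coordinates, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist

# Each atom is (color, (x, y, z)); "color" stands in for chemical identity.

def correspondence_graph(a, b, tol):
    """Nodes are index pairs (i, j) of same-colored atoms from a and b.
    Two nodes are adjacent when the corresponding interatomic distances
    in the two structures differ by less than tol."""
    nodes = [(i, j) for i, (ca, _) in enumerate(a)
                    for j, (cb, _) in enumerate(b) if ca == cb]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an atom may be matched at most once
        if abs(dist(a[i][1], a[k][1]) - dist(b[j][1], b[l][1])) < tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj

def max_clique(adj):
    """Exhaustive Bron-Kerbosch search without pivoting (toy inputs only)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)

# Two made-up structures sharing an O-N-O triangle with sides 3, 4 and 5.
site_a = [("O", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0)),
          ("O", (0.0, 4.0, 0.0)), ("C", (10.0, 10.0, 10.0))]
site_b = [("C", (20.0, 0.0, 0.0)), ("O", (5.0, 5.0, 5.0)),
          ("N", (5.0, 8.0, 5.0)), ("O", (5.0, 5.0, 9.0))]

template = max_clique(correspondence_graph(site_a, site_b, tol=0.5))
print(template)  # [(0, 1), (1, 2), (2, 3)]
```

Each pair in the result maps an atom index in the first structure to an atom index in the second. The carbon atoms are left out because their distances to the triangle are not conserved within the threshold. Exhaustive clique search scales poorly, so a sketch like this is only practical for small atom sets.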