Flexible sequence similarity searching with the FASTA3 program package

Since the publication of the first rapid method for comparing biological sequences 15 years ago (1), DNA and protein sequence comparisons have become routine steps in biochemical characterization, from newly cloned proteins to entire genomes. As the DNA and protein sequence databases become more complete, a sequence similarity search is more likely to reveal a database sequence with statistically significant similarity, and thus inferred homology, to a query sequence. Indeed, even in the archaebacterium Methanococcus jannaschii, more than 40% of the open reading frames could be assigned a function based on significant sequence similarity to a protein of known function (2).