SeqWare Query Engine: storing and searching sequence data in the cloud.

Results: In this work, we present the SeqWare Query Engine, which was built with modern cloud computing technologies and designed to store information from thousands of genomes in a database. Our backend implementation uses HBase, the highly scalable NoSQL database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and an interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc.) with rich annotations, including coverage and functional consequences. As a proof of concept, we loaded several whole genome datasets, including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair both to profile performance and to provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net).
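
The abstract describes loading annotated variants into a sorted key-value backend and querying them. As a rough illustration of that idea only, the sketch below keeps variants in an in-memory store ordered by row key and answers a position-range query with a coverage filter. It is not the SeqWare API or its HBase schema: `Variant`, `VariantStore`, `row_key`, the key layout, and the sample records are all hypothetical, and a plain Python list stands in for the database. The only property borrowed from HBase is that rows are kept sorted by key, so a range of positions can be read as one contiguous scan.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout; field names are assumptions, not SeqWare's.
    chrom: str
    pos: int
    ref: str
    alt: str
    kind: str      # e.g. "SNV" or "indel"
    coverage: int  # read depth at the variant position


def row_key(chrom: str, pos: int) -> str:
    # Zero-pad the position so lexicographic key order matches numeric order
    # within one chromosome (assumed key layout, for illustration only).
    return f"{chrom}:{pos:012d}"


class VariantStore:
    """In-memory stand-in for a store that keeps rows sorted by key."""

    def __init__(self) -> None:
        self._keys: list[str] = []           # row keys, kept sorted
        self._rows: dict[str, Variant] = {}  # row key -> variant

    def put(self, variant: Variant) -> None:
        # This sketch keeps one variant per position; a later put overwrites.
        key = row_key(variant.chrom, variant.pos)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = variant

    def scan(self, chrom: str, start: int, end: int,
             min_coverage: int = 0) -> list[Variant]:
        # Range scan over [start, end] (inclusive) on one chromosome,
        # then filter on the coverage annotation.
        lo = bisect.bisect_left(self._keys, row_key(chrom, start))
        hi = bisect.bisect_right(self._keys, row_key(chrom, end))
        return [self._rows[k] for k in self._keys[lo:hi]
                if self._rows[k].coverage >= min_coverage]


# Made-up example records, not data from the paper.
store = VariantStore()
store.put(Variant("chr1", 150, "A", "G", "SNV", 30))
store.put(Variant("chr1", 180, "AT", "A", "indel", 5))
store.put(Variant("chr1", 500, "C", "T", "SNV", 12))
store.put(Variant("chr10", 150, "C", "T", "SNV", 40))

hits = store.scan("chr1", 100, 200, min_coverage=10)
```

Putting the chromosome first in the key keeps each chromosome's variants contiguous, so a region query touches only the rows it needs; annotation filters such as coverage are then applied to that small slice.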
References in zbMATH (referenced in 1 article)
- Daugelaite, Jurate; O’Driscoll, Aisling; Sleator, Roy D.: An overview of multiple sequence alignments and cloud computing in bioinformatics (2013)