CloudAligner

CloudAligner: a fast and full-featured MapReduce based tool for sequence mapping. Results: To urge the trend of using Cloud technologies in genomics and prepare for advances in second- and third-generation DNA sequencing, we have built a Hadoop MapReduce-based application, CloudAligner, which achieves higher performance, covers most primary features, is more accurate, and has a user-friendly interface. It was also designed to be able to deal with long sequences. The performance gain of CloudAligner over Cloud-based counterparts (35 to 80%) mainly comes from the omission of the reduce phase. In comparison to local-based approaches, the performance gain of CloudAligner is from the partition and parallel processing of the huge reference genome as well as the reads. The source code of CloudAligner is available at http://cloudaligner.sourceforge.net/ and its web version is at http://mine.cs.wayne.edu:8080/CloudAligner/.