HTSeq – A Python framework to work with high-throughput sequencing data.

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed.

Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.

Availability: HTSeq is released as open-source software under the GNU General Public Licence and is available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq.
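
To make the idea of "counting the overlap of reads with genes" concrete, the following is a minimal pure-Python sketch. It does not use HTSeq's API and is not the htseq-count implementation: the gene table, the function names, the half-open coordinate convention, and the rule that a read is credited to a gene only when it overlaps exactly one gene are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical gene models: name -> (chromosome, start, end).
# Coordinates are assumed to be half-open intervals [start, end).
GENES = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 150, 300),
    "geneC": ("chr2", 0, 50),
}


def overlapping_genes(chrom, start, end, genes=GENES):
    """Return the set of gene names whose interval overlaps [start, end) on chrom."""
    return {
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and start < g_end and g_start < end
    }


def count_reads(reads, genes=GENES):
    """Tally reads per gene; reads hitting no gene or several genes are tallied separately."""
    counts = Counter()
    for chrom, start, end in reads:
        hits = overlapping_genes(chrom, start, end, genes)
        if len(hits) == 1:
            counts[next(iter(hits))] += 1   # unambiguous: credit the single gene
        elif not hits:
            counts["__no_feature"] += 1     # illustrative label for unassigned reads
        else:
            counts["__ambiguous"] += 1      # illustrative label for multi-gene reads
    return counts


# Hypothetical aligned reads as (chromosome, start, end) tuples.
reads = [
    ("chr1", 110, 140),  # overlaps geneA only
    ("chr1", 160, 190),  # overlaps geneA and geneB
    ("chr1", 250, 280),  # overlaps geneB only
    ("chr2", 60, 90),    # overlaps no gene
]
counts = count_reads(reads)
print(dict(counts))
```

This sketch scans every gene for every read, which is only reasonable for toy inputs. The abstract's point about data structures that allow querying via genomic coordinates is precisely that such lookups can be done without a linear scan.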
References in zbMATH (referenced in 8 articles)
- Dadaneh, Siamak Zamani; Qian, Xiaoning; Zhou, Mingyuan: BNP-seq: Bayesian nonparametric differential expression analysis of sequencing count data (2018)
- Wélliton de Souza; Benilton de Sá Carvalho; Iscia Lopes-Cendes: Rqc: A Bioconductor Package for Quality Control of High-Throughput Sequencing Data (2018) not zbMATH
- Wolff, Alexander: Analysis of expression profile and gene variation via development of methods for next generation sequencing data (2018)
- Faisal, Shahla; Tutz, Gerhard: Missing value imputation for gene expression data by tailored nearest neighbors (2017)
- Papastamoulis, Panagiotis; Rattray, Magnus: Bayesian estimation of differential transcript usage from RNA-seq data (2017)
- Carugo, Oliviero (ed.); Eisenhaber, Frank (ed.): Data mining techniques for the life sciences (2016)
- Kruppa, Jochen; Kramer, Frank; Beißbarth, Tim; Jung, Klaus: A simulation framework for correlated count data of features subsets in high-throughput sequencing or proteomics experiments (2016)
- Mathé, Ewy (ed.); Davis, Sean (ed.): Statistical genomics. Methods and protocols (2016)