SequenceAnalysis

R package SequenceAnalysis. Provides:

1) Protein sequence: given a UniProtKB ID, returns the protein sequence from the UniProt database.
2) Nucleotide sequence: given a UniProtKB ID, returns the nucleotide sequence from the EBI database.
3) Amino acid composition, calculated by four different methods:
   a) Twenty-two independent categories, one amino acid per category.
   b) Five categories (Nonpolar Aliphatic, Nonpolar Aromatic, Polar Uncharged, Polar Positively Charged, Polar Negatively Charged), assigned according to the standard chemical structures of the amino acids.
   c) Six categories (Nonpolar Aliphatic, Nonpolar Aromatic, Polar Uncharged, Polar Positively Charged, Polar Negatively Charged, Special cases), in which Cysteine, Selenocysteine, Glycine and Proline are placed in the Special cases group.
   d) Eight categories, obtained by k-means clustering on the physicochemical indices of the amino acids.
4) GC content: percentage of the nucleotides G and C in the sequence.
5) Codon usage: frequency of occurrence of synonymous codons.
6) Stacking energy: the nearest-neighbor (NN) model for nucleic acids assumes that the stability of a given base pair depends on the identity and orientation of the neighboring base pairs.
   Stacking Energy = DeltaG(total) = Sigma(n(i) * DeltaG(i)) + DeltaG(init) + DeltaG(end) + DeltaG(sym)
   Here n(i) is the number of occurrences of nearest-neighbor pair i, and DeltaG(i), DeltaG(init) and DeltaG(end) are taken from the unified NN free energy parameters. DeltaG(sym) accounts for the symmetry of self-complementary duplexes: it equals +0.43 kcal/mol if the duplex is self-complementary and zero otherwise.
7) Complement of a given nucleotide sequence.
8) Reverse of a given nucleotide sequence.
9) Reverse complement of a given nucleotide sequence.
10) Protein, gene and organism for a given UniProt ID.
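
The nucleotide-level calculations in items 4 to 9 can be illustrated with a short, language-independent sketch. The Python below is not the package's API: every function name is hypothetical, and the stacking energy function deliberately takes the nearest-neighbor table and the init/end terms as arguments, since the parameter values themselves are not listed here. It assumes plain DNA strings over A, C, G, T and counts codons in frame from the first base.

```python
from collections import Counter

# Hypothetical helper names, for illustration only (not the package's API).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc_content(seq):
    """Percentage of G and C bases in the sequence (item 4)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def codon_counts(seq):
    """Count in-frame codons from the first base (item 5).

    Trailing bases that do not fill a whole codon are ignored.
    """
    s = seq.upper()
    usable = len(s) - len(s) % 3
    return Counter(s[i:i + 3] for i in range(0, usable, 3))


def complement(seq):
    """Base-wise complement, same orientation (item 7)."""
    return seq.translate(_COMPLEMENT)


def reverse(seq):
    """Sequence read backwards (item 8)."""
    return seq[::-1]


def reverse_complement(seq):
    """Complement read backwards (item 9)."""
    return complement(seq)[::-1]


def is_self_complementary(seq):
    """True if the sequence equals its own reverse complement."""
    s = seq.upper()
    return s == reverse_complement(s)


def stacking_energy(seq, nn_dg, dg_init, dg_end, dg_sym=0.43):
    """Sum of the nearest-neighbor terms plus init, end and symmetry (item 6).

    nn_dg is a caller-supplied dict mapping each dinucleotide (read 5'->3'
    on one strand) to its DeltaG in kcal/mol; dg_init and dg_end are the
    caller-supplied initiation and end terms. Only the +0.43 kcal/mol
    symmetry value comes from the description above.
    """
    s = seq.upper()
    total = sum(nn_dg[s[i:i + 2]] for i in range(len(s) - 1))
    total += dg_init + dg_end
    if is_self_complementary(s):
        total += dg_sym
    return total


if __name__ == "__main__":
    print(gc_content("ATGC"))           # 50.0
    print(reverse_complement("ATGC"))   # GCAT
    print(is_self_complementary("GAATTC"))  # True
```

Summing over adjacent dinucleotides is equivalent to the Sigma(n(i) * DeltaG(i)) form, because each nearest-neighbor pair contributes its DeltaG once per occurrence.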