News
31st October 2008
NEW! SeqVis Version 1.4 is released. A number of bugs in previous versions were fixed, and some of the functionalities were renamed.
4th June 2007
NEW! SeqVis Version 1.3 is released. The problem with reading FASTA files was fixed.
30th May 2007
WARNING! A bug was detected in the FASTA file reading module in SeqVis. In some cases, SeqVis fails to signal a warning when special characters, such as the newline character '\n', are present. This may result in incorrect calculation of nucleotide frequencies. Therefore, users are advised to AVOID reading FASTA sequence files until a further announcement is made. The SeqVis team will attempt to rectify the problem as soon as possible.
NEW! SeqVis Version 1.2 is released. This version of SeqVis enables any data with four attributes that sum to one to be visualized. Details are available in Feature.
SeqVis is published in Bioinformatics! Please cite our paper:
Ho JWK, Adams CE, Lew JB, Matthews TJ, Ng CC, Shahabi-Sirjani A, Tan LH, Zhao Y, Easteal S, Wilson SR, Jermiin LS (2006)
SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides, Bioinformatics 22, 2162-2163
A detailed description of the program's features and how to use it is available from:
Jermiin LS, Ho JWK, Lau KW, Jayaswal V (2009). SeqVis: A tool for detecting compositional heterogeneity among aligned nucleotide sequences. Pp ???-???. In Bioinformatics for DNA sequence analysis (Ed. Posada D), Humana Press, Totowa, NJ. [Preprints are available from LSJ]
Introduction
SeqVis, a Java standalone application, is an interactive three-dimensional visualization tool
to explore compositional heterogeneity in large alignments of nucleotide sequences.
Existing methods for assessing compositional heterogeneity among nucleotide sequences are either
not reliable or computationally expensive for large alignments. SeqVis visualizes the nucleotide composition
in a tetrahedron model (an extension of the de Finetti plot). The user-friendly
features provided by SeqVis allow compositionally heterogeneous sequences to be
identified visually. The use of SeqVis is illustrated by two real phylogenetic examples.
The tool is freely downloadable here.
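
The tetrahedron model can be illustrated with a minimal sketch. It assumes that each sequence is reduced to its four nucleotide frequencies, which sum to one, and that these are used as barycentric coordinates with respect to the four vertices of a regular tetrahedron. The vertex coordinates, the assignment of nucleotides to vertices, and the function names below are illustrative assumptions, not taken from the SeqVis source code.

```python
# Vertices of a regular tetrahedron centred on the origin.
# Which nucleotide sits at which vertex is an arbitrary choice for this sketch.
VERTICES = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "T": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the relative frequencies of A, C, G and T, ignoring other symbols."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def tetrahedron_point(freqs):
    """Map four frequencies that sum to one to a 3D point (barycentric weighting)."""
    return tuple(
        sum(freqs[base] * VERTICES[base][axis] for base in "ACGT")
        for axis in range(3)
    )


if __name__ == "__main__":
    for name, seq in [("uniform", "ACGTACGT"), ("AT-rich", "AATTATAC")]:
        print(name, tetrahedron_point(composition(seq)))
```

Under this mapping, a sequence with equal nucleotide frequencies lands at the centre of the tetrahedron, and a sequence made of a single nucleotide lands on the corresponding vertex. Sequences with similar composition therefore plot close together.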

Fig. 1. Snapshot of SeqVis
Background
Compositional heterogeneity
Most phylogenetic methods assume that the sequences
evolved under a single time-reversible Markov process (homogeneous, stationary, and reversible conditions).
Compositional heterogeneity (i.e., significant differences among sequences in the frequencies of the nucleotides A, T, G and C)
in sequence data suggests that the sequences did not evolve under these conditions and that the phylogeny
may therefore not be inferred accurately.
Existing methods for assessing compositional heterogeneity
Currently, methods for detecting compositional heterogeneity in alignments of
nucleotides fall into four categories (Jermiin et al., 2004).
The first category uses graphs or tables to visualize the compositional heterogeneity.
The other categories evaluate test statistics against their expected distributions.
Methods in the first category are of limited use beyond a few species, whereas those in the other categories are
usually either statistically invalid or not adopted by the scientific community.
Matched-pairs tests of homogeneity for analyzing aligned nucleotides were developed in response to the latter problem.
These tests provide useful information about the underlying Markov process. However, for alignments containing
many sequences, the results may be impractical to examine.
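
As a rough illustration of a matched-pairs test, the sketch below computes the statistic for Bowker's test of symmetry on one pair of aligned sequences. Bowker's test is chosen here as an example; the text above does not say which matched-pairs tests are meant. The function names are hypothetical. The last helper shows how quickly the number of pairwise tests grows with the number of sequences.

```python
from itertools import combinations

BASES = "ACGT"


def divergence_matrix(seq_a, seq_b):
    """Count how often base i in seq_a is aligned with base j in seq_b."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {(i, j): 0 for i in BASES for j in BASES}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in BASES and b in BASES:  # skip gaps and ambiguity codes
            counts[(a, b)] += 1
    return counts


def bowker_statistic(seq_a, seq_b):
    """Return (statistic, degrees of freedom) for Bowker's test of symmetry."""
    n = divergence_matrix(seq_a, seq_b)
    stat, df = 0.0, 0
    for i, j in combinations(BASES, 2):
        pair_total = n[(i, j)] + n[(j, i)]
        if pair_total > 0:  # pairs with no observations carry no information
            stat += (n[(i, j)] - n[(j, i)]) ** 2 / pair_total
            df += 1
    return stat, df


def number_of_pairwise_tests(num_sequences):
    """Number of sequence pairs in an alignment of num_sequences sequences."""
    return num_sequences * (num_sequences - 1) // 2


if __name__ == "__main__":
    print(bowker_statistic("AAAACC", "CCCCCA"))
    print(number_of_pairwise_tests(100))
```

Under the null hypothesis of symmetry, the statistic is compared with a chi-square distribution with the returned degrees of freedom. An alignment of 100 sequences already requires 4950 pairwise tests, which is one reason why a table of such results becomes hard to examine.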
Jermiin, L.S. et al. (2004). The biasing effect of compositional heterogeneity on
phylogenetic estimates may be underestimated. Syst. Biol., 53, 638-644.