Home

Current Projects

These annotation projects are currently hosted by GenSAS:

Community members involved in curation of these projects should use the contact form to request access to GenSAS.

The Genome Sequence Annotation Server (GenSAS) is an online annotation tool that provides a customizable automated pipeline for whole-genome structural annotation. Users can upload genomic sequences and select from a variety of tools to predict gene models and other structural features.

To use GenSAS v2.1:

Available Tools

Tool Type                  Tool Name         Description
-------------------------  ----------------  -----------
Intrinsic Gene Prediction  Genscan           Predicts the exon-intron locations within genes
Intrinsic Gene Prediction  FGENESH           HMM-based gene structure prediction for eukaryotes (GenSAS cannot run this analysis, but accepts uploaded FGENESH output files)
Intrinsic Gene Prediction  Augustus          Predicts eukaryotic genes, including 5’ and 3’ UTR sequences and alternative transcripts
Intrinsic Gene Prediction  Glimmer           Predicts coding regions for prokaryotes and viruses
Intrinsic Gene Prediction  GlimmerM          Glimmer-based gene finder specific to eukaryotes and their exon structure
Intrinsic Gene Prediction  SNAP              Gene finder for prokaryotes and eukaryotes
Extrinsic Gene Prediction  Transcript BLAST  BLAST against transcript databases from NCBI. The following databases are available:
Extrinsic Gene Prediction  BLAT              BLAT uses an index derived from assemblies of entire genomes to predict genes. The following databases are available:
Extrinsic Gene Prediction  Protein BLAST     BLAST against protein databases from NCBI. The following databases are available:
Other Genetic Features     getorf            Finds ORFs based on start and stop codons
Other Genetic Features     SSR Server        Finds simple sequence repeats
Other Genetic Features     tRNAscan          Identifies tRNA sequences
Misc. Tools                RepeatMasker      Masks interspersed repeats and low-complexity DNA sequences
Misc. Tools                RepeatModeler     De novo repeat identification and modeling tool
Misc. Tools                GFF3 Importer     Adds previously generated data as a track in the graphical results interface
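
To give a feel for what "finds ORFs based on start and stop codons" means in practice, here is a minimal Python sketch. It is not getorf's actual algorithm or option set: the function name find_orfs, the min_codons parameter, and the restriction to the three forward-strand frames are all assumptions made for illustration.

```python
# Illustrative ORF scan; NOT the getorf implementation.
# Assumptions: forward strand only, ORF = ATG ... first in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs (0-based, end-exclusive) for ORFs
    found in the three forward reading frames. The stop codon is included
    in the reported span."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it spans enough codons (stop included).
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


example = "CCATGAAATAGGG"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

A real ORF finder would also scan the reverse complement and support alternative genetic codes; this sketch leaves both out to keep the core idea visible.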

Future Direction

In the future, GenSAS will migrate approved gene models to a Chado database, where they can be visualized with other GMOD tools, including GBrowse and Tripal. GenSAS will also encapsulate more complex workflows, such as pre-processing a training set for some gene prediction tools and running a consensus program that combines and prioritizes gene models from various tools.
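
One way to picture combining and prioritizing gene models from several tools is a simple greedy pass over the predictions. The Python sketch below is purely hypothetical and does not describe the planned GenSAS consensus program: the GeneModel type, the pick_consensus function, and the rule "keep the highest-priority model and drop anything overlapping it" are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class GeneModel(NamedTuple):
    """Hypothetical minimal gene model: one span per prediction."""
    tool: str
    start: int  # 1-based, inclusive (as in GFF3)
    end: int    # 1-based, inclusive


def overlaps(a, b):
    """True if the two inclusive intervals share at least one base."""
    return a.start <= b.end and b.start <= a.end


def pick_consensus(models, priority):
    """Greedy selection: visit models in tool-priority order and keep a
    model only if it overlaps none of the models already kept."""
    rank = {tool: i for i, tool in enumerate(priority)}
    kept = []
    for model in sorted(models, key=lambda m: (rank[m.tool], m.start)):
        if not any(overlaps(model, k) for k in kept):
            kept.append(model)
    return sorted(kept, key=lambda m: m.start)


predictions = [
    GeneModel("Genscan", 100, 500),
    GeneModel("Augustus", 120, 480),
    GeneModel("Genscan", 900, 1200),
]
for model in pick_consensus(predictions, priority=["Augustus", "Genscan"]):
    print(model.tool, model.start, model.end)
```

Here the Augustus model wins over the overlapping Genscan model, while the non-overlapping Genscan model is kept. A production consensus tool would weigh evidence at the exon level and take strand into account, which this sketch ignores.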