Bioinformatics and Functional Genomics



Chapter 16: Completed genomes: Eukaryotes


Web resources from Chapter 16
Website URL
The Animal Genome Size Database of T.R. Gregory http://www.genomesize.com
The Database of Genome Sizes http://www.cbs.dtu.dk/databases/DOGS/
An on-line database of plant C values http://www.rbgkew.org.uk/cval/homepage.html
Pseudogenes from Mark Gerstein’s lab http://www.pseudogene.org/
The Tandem Repeats Finder http://tandem.biomath.mssm.edu/trf/trf.html
The Case Western Reserve University (CWRU) Duplication Browser http://humanparalogy.gene.cwru.edu/SDD/
Telomere database (TelDB) http://www.genlink.wustl.edu/teldb/index.html
RepeatMasker http://www.geospiza.com/products/tools/repeatmasker.htm
The Censor Server of the Genetic Information Research Institute (GIRI) http://www.girinst.org/
The tRNAscan-SE search server http://www.genetics.wustl.edu/eddy/tRNAscan-SE/
The genomic tRNA database (GtRDB) http://rna.wustl.edu/GtRDB/
Rogic et al. data http://www.cs.ubc.ca/~rogic/evaluation/
Grail Experimental Gene Discovery Suite http://compbio.ornl.gov/grailexp/
Database of Introns  
Codon usage database http://www.kazusa.or.jp/codon/
The National Institutes of Health Intramural Sequencing Center (NISC) http://www.nisc.nih.gov
PipMaker and MultiPipMaker http://bio.cse.psu.edu/pipmaker/
VISTA (Visualization Tools for Alignments) http://www-gsd.lbl.gov/vista/
The Berkeley Genome Pipeline http://pipeline.lbl.gov/
Giardia information from The U.S. Food and Drug Administration http://vm.cfsan.fda.gov/~mow/chap22.html
The Giardia genome project http://www.mbl.edu/Giardia
African trypanosomiasis http://www.who.int/tdr/diseases/tryp/default.htm
Leishmaniasis fact sheet from WHO http://www.who.int/inf-fs/en/fact116.html
Leishmaniasis information from WHO http://www.who.int/tdr/diseases/leish/default.htm
Apicomplexa information from the University of California, Berkeley http://www.ucmp.berkeley.edu/protista/apicomplexa.html
Apicomplexa information from Tulane University http://www.tulane.edu/~wiser/protozoology/notes/api.html
Malaria from The Wellcome Trust http://www.wellcome.ac.uk/en/malaria/
Malaria from WHO http://www.who.int/tdr/diseases/malaria/default.htm
Malaria genetics and genomics http://www.ncbi.nih.gov/projects/Malaria/
The Prediction of Apicoplast Targeted Sequences (PATS) database http://gecco.org.chemie.uni-frankfurt.de/pats/pats-index.php
The Paramecium genome project http://paramecium.cgm.cnrs-gif.fr/
The Tetrahymena genome project http://lifesci.ucsb.edu/~genome/Tetrahymena/
The Toxoplasma gondii database ToxoDB http://ToxoDB.org
Eukaryotic genome projects from TIGR http://www.tigr.org/tdb/euk/
The Angiosperm Phylogeny Web Site http://www.mobot.org/MOBOT/Research/APweb/welcome.html
The NCBI plant resources http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/PlantList.html
MtDB-Medicago truncatula database http://www.medicago.org/
ZmDB-Zea mays database http://zmdb.iastate.edu/
GénoPlante-Info (GPI) http://genoplante-info.infobiogen.fr/
Sputnik http://mips.gsf.de/proj/sputnik/
GrainGenes http://www.graingenes.org
The TIGR Arabidopsis database http://www.tigr.org/tdb/e2k1/ath1/
Arabidopsis genome duplications http://www.tigr.org/tdb/e2k1/ath1/arabGenomeDups.html
The Complete Arabidopsis Transcriptome Micro Array (CATMA) http://www.catma.org/
DAS http://www.biodas.org/
The rice glossary http://www.riceweb.org/glossary/Terms.htm
Academia Sinica Plant Genome Center (ASPGC) http://genome.sinica.edu.tw/
The International Rice Genome Sequencing Project (IRGSP) http://rgp.dna.affrc.go.jp/IRGSP/index.html
Oryza sativa japonica genome-Monsanto http://www.rice-research.org
Oryza sativa japonica genome-Syngenta http://www.tmri.org
Oryza sativa indica genome http://btn.genomics.org.cn/rice
Rice Genome Research Program http://rgp.dna.affrc.go.jp/
The TIGR Rice Genome Project database http://www.tigr.org/tdb/e2k1/osa1/
The MIPS (Oryza sativa) database (MOsDB) http://mips.gsf.de/gams/rice/index.jsp
Dicty genome sequencing project http://dictybase.org/genomeseq.htm
CnidBase http://cnidbase.bu.edu
Wormbase http://www.wormbase.org
Arthropoda taxonomy http://www.ncbi.nlm.nih.gov/Taxonomy/
The Berkeley Drosophila Genome Project (BDGP) http://www.fruitfly.org
FlyBase http://flybase.bio.indiana.edu/
The Ensembl genome browser for the mosquito http://www.ensembl.org/Anopheles_gambiae/
The Ciona intestinalis genome http://genome.jgi-psf.org/ciona4/ciona4.home.html
Ciona gene index from TIGR http://www.tigr.org/tdb/tgi/cingi/
Ciona EST project http://ghost.zool.kyoto-u.ac.jp/indexr1.html
Fugu resources from the U.S. Department of Energy Joint Genome Institute http://genome.jgi-psf.org/fugu6/fugu6.info.html
The Fugu Browser http://www.ensembl.org/Fugu_rubripes/
The Zebrafish Information Network http://zfin.org
The zebrafish genome at NCBI http://www.ncbi.nlm.nih.gov/genome/guide/zebrafish/index.html
The Medaka Genome Initiative http://medaka.dsp.jst.go.jp/MEPD
Fishbase http://ichtyonb1.mnhn.fr/search.html
A comparison of the mouse genomes http://143.48.7.130/cgi-bin/gbrowse?source=cse
Five mouse strains from Celera http://www.celera.com
The human/mouse homology map http://www.ncbi.nlm.nih.gov/Homology/index.html
The Mouse Genome Sequencing Consortium http://www.ensembl.org/Mus_musculus/
The Mouse Genome Informatics (MGI) database http://www.informatics.jax.org
The Mouse Tumor Biology (MTB) database http://tumor.informatics.jax.org/FMPro?-db=TumorInstance&-format=mtdp.html&-view
The Oak Ridge National Laboratory (ORNL) Genome Analysis Pipeline http://compbio.ornl.gov/tools/pipeline/

 

Tables

Table 16-7. Web servers that provide access to software for identifying repetitive elements in genomic DNA.

Program Description
RepeatFinder A computational system for analysis of the repetitive structure of genomic sequences
RepeatMasker University of Washington Genome Center
RepeatMasker Server at EMBL
RepeatMasker Server in Barcelona
RepeatMasker For zebrafish; at the Wellcome Trust Sanger Institute
Censor Server Genetic Information Research Institute
 

Table 16-8. Web-based databases of noncoding RNA.

Database Description
Large ribosomal subunit db A database on the structure of lsu ribosomal subunit RNA
Noncoding RNAs database Various RNA categories
Rfam RNA family database at the Sanger Institute
Rfam RNA family database at Washington University
Ribosomal Database Project (RDP) Provides ribosome-related data analysis, rRNA-derived phylogenetic trees, and aligned and annotated rRNA sequences.
RNABase RNA structures
RNA Editing Web Site On all the various types of RNA editing
Small ribosomal subunit db A database on the structure of ssu ribosomal subunit RNA
Small RNA database Small RNAs are broadly defined as the RNAs not directly involved in protein synthesis.
tRNA sequences Compilation of 550 sequences of tRNAs and 3704 sequences of tRNA genes (through 1998).

 

Table 16-9. Algorithms for finding genes in eukaryotic DNA. Abbreviation: HMM, hidden Markov model. A summary of sites is given in http://linkage.rockefeller.edu/wli/gene/.

Program Description
AAT Analysis and Automation Tool. Web based server
FgeneH Predicts exons using linear discriminant functions
FgeneSH ab initio gene finder
Gene Finder For human, mouse, Arabidopsis, and fission yeast
GeneParser 2 Identification of protein coding regions in genomic DNA
Genie Based on HMMs
GenLang syntactic pattern recognition system; uses computational linguistics to find genes
Genscan Based on HMMs; rule-based rather than homology-based
GenTerpret From RabbitHutch Biotechnology Corporation
GlimmerM From TIGR
GlimmerM web server For Arabidopsis thaliana, Oryza sativa (rice), Plasmodium falciparum (malaria)
GRAIL One of the most widely used algorithms
MORGAN A decision tree system for finding genes in vertebrate DNA
PROCRUSTES Gene Recognition via Spliced Alignment
VEIL HMM for finding genes in vertebrate DNA
Xpound A probabilistic model for detecting coding regions

 

Table 16-10. Software for identifying features of promoter regions in genomic DNA.

Program Description
Ancient conserved untranslated DNA sequences (ACUTS) Analyzes genes from metazoan species (essentially vertebrates, insects, and nematodes)
AliBaba2 Predicts transcription factor binding sites in an unknown DNA sequence
Eukaryotic Promoter Database (EPD) Annotated non-redundant collection of eukaryotic POL II promoters, for which the transcription start site has been determined experimentally
FastM develops models of transcriptional regulatory DNA units (e.g. promoters)
Gene Regulation Requires log-in
PlantProm Plant promoter database
rSNP_Guide Transcription factor binding site database
TRANSFAC Database of transcription factors, their genomic binding sites and DNA-binding profiles

 

Table 16-11. Web-based databases of chromosomes.

Resource Comment
Ensembl genome browser Ideograms for human (Chapter 17), mouse, rat, zebrafish, fugu, and mosquito
The Unified Database for Human Genome Mapping From the Weizmann Institute
Human Chromosome-specific databases From the UK HGMP Resource Centre
Ideogram Album Human, mouse, and horse ideograms from the University of Washington
Human Chromosome Launchpad From the Oak Ridge National Laboratory

 

Table 16-12. Web resources for Trypanosome genomics.

Resource Comment
The Trypanosoma brucei Genome Network Sponsored by the Wellcome Trust
T.brucei omni Blast Server Sanger Institute
Trypanosoma cruzi Genome Initiative Information Server Trypanosoma cruzi Genome Initiative Information Server

 

Table 16-13. Web resources for Leishmania genomics.

Resource Comment
The Leishmania major Friedlin Genome Project At the Wellcome Trust Sanger Institute
SBRI Seattle Biomedical Research Institute
The European Leishmania major Friedlin Genome Sequencing Consortium A listing of participating laboratories

 

Table 16-14. Genomics resources for Plasmodium falciparum and malaria.

Resource Comment
PlasmoDB Main web resource for P. falciparum
Links page At NCBI
P. falciparum Genome Project At the Sanger Institute

 

Table 16-17. Genomics resources for Arabidopsis thaliana.

Resource Comment
TAIR The Arabidopsis Information Resource
Arabidopsis thaliana Database At TIGR
Arabidopsis thaliana Project At MIPS
Arabidopsis genome analysis At Cold Spring Harbor
SeedGenes Essential genes


Table 16-18. A variety of databases employ the model from the Generic Model Organism Project (GMOD) (http://www.gmod.org/).

Database Comment
EcoCyc Encyclopedia of Escherichia coli Genes and Metabolism
FlyBase Drosophila site
Mouse Genome Informatics Main mouse resource
Rat Genome Database (RGD) Rat resource
SGD See Chapter 15
TAIR The Arabidopsis Information Resource
Wormbase  

 

Table 16-19. The Genomics Unified Schema (GUS) platform (http://www.gusdb.org/) is used for some organism databases.

Database Comment
AllGenes Human and mouse gene index
EPConDB Endocrine Pancreas Consortium
GeneDB curated database for Schizosaccharomyces pombe, Leishmania major and Trypanosoma brucei
PlasmoDB genomic database for Plasmodium falciparum
RAD RNA abundance database

 

Table 16-20. Genomics resources for Dictyostelium discoideum.

Resource Comment
Dictybase a centralized source for information about Dictyostelium and related organisms
The Dictyostelium Genome Sequencing Project Baylor College of Medicine; Institute of Biochemistry I, Cologne; Dept. Genome Analysis, IMB Jena; Sanger Institute
Dictyostelium links NCBI

 

Table 16-26. Genomics resources for the mouse, Mus musculus.

Resource Comment
Mouse genome project Baylor College of Medicine
Mouse cytogenetic maps Mammalian Genetics Unit, Harwell UK
TBASE The Transgenic/Targeted Mutation Database
Database of Gene Knockouts At Bioscience.org
Mouse Genetics An online book by Lee Silver
Mouse mapping Genome Sciences Centre (Vancouver)
Trans-NIH Mouse Initiative Comprehensive NIH mouse genomics site
TIGR Mouse Gene Index At TIGR

 

Table 16-27. Genomics resources for the rat, Rattus norvegicus.

Resource Comment
Rat Genome Database Key rat genomics site
Ratmap Rat Genome Database
NIH Rat Genomics and Genetics Main NIH rat site
Rat Genome Resources Central NCBI rat site
TIGR Rat Gene Index At TIGR

 

Table 16-28. Genomics resources for non-human primates.

Resource Comment
Chimpanzee Genome Project Pan troglodytes at Baylor College of Medicine
Project Silver National Institute of Genetics (Japan)
Primate Cytogenetics Network Diploid numbers, karyotypes, ideograms
 
