the Gene Ontology

Current Annotations

  • Annotation Details and Downloads
  • Filtered files
  • Unfiltered files
  • gp2protein files

Annotation Details and Downloads

The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
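
As a minimal sketch of working with one of these downloads, the Python below reads a gzip-compressed gene association file line by line. It assumes the tab-delimited layout of the GO annotation file format, with header and comment lines starting with `!`; the function name `read_gaf` and the file name in the usage note are illustrative, not part of GO.

```python
import gzip
from typing import Iterator, List


def read_gaf(path: str) -> Iterator[List[str]]:
    """Yield each annotation line of a gzipped association file as a list of columns."""
    # "rt" decompresses and decodes to text in one step.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and "!" comment/header lines (assumed convention).
            if not line or line.startswith("!"):
                continue
            yield line.split("\t")
```

For example, `for row in read_gaf("gene_association.example.gz"): print(row[4])` would print the GO ID column of each annotation, assuming a local file of that (hypothetical) name.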

Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.

These files can also be downloaded via FTP; we recommend this method for the larger files, such as the UniProt dataset, as the web-based download may not work correctly.

Filtered Files

These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that particular taxon IDs can only be included within the association files provided by specific projects; please see the list of the authoritative groups for the major model organisms.
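
The taxon rule can be illustrated roughly as follows. This is not the QC checks script itself, only a sketch of the idea: annotations for a taxon that has an authoritative group are kept only when the file comes from that group. The `AUTHORITATIVE_SOURCE` table and the function names are hypothetical, and the code assumes the taxon is in column 13 of each row in the form `taxon:NNNN`.

```python
from typing import Dict, Iterable, List

# Hypothetical authority table: NCBI taxon ID -> the only project allowed to supply it.
# The real list of authoritative groups is maintained by GO and is not reproduced here.
AUTHORITATIVE_SOURCE: Dict[str, str] = {
    "10090": "MGI",  # Mus musculus
}


def primary_taxon(row: List[str]) -> str:
    """Return the first taxon ID from column 13, e.g. 'taxon:10090|taxon:1280' -> '10090'."""
    first = row[12].split("|")[0]
    return first.partition(":")[2]


def filter_by_authority(
    rows: Iterable[List[str]],
    source: str,
    authority: Dict[str, str] = AUTHORITATIVE_SOURCE,
) -> List[List[str]]:
    """Keep a row unless its taxon belongs to a different authoritative project."""
    kept = []
    for row in rows:
        owner = authority.get(primary_taxon(row))
        if owner is None or owner == source:
            kept.append(row)
    return kept
```

With this table, mouse annotations would survive in a file submitted by MGI but be stripped from a file submitted by any other group.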

Statistics as of December 2, 2008

| Species | Database | Gene products annotated | Annotations (non-IEA) | Submission date (MM/DD/YYYY) | Download filtered files |
|---|---|---|---|---|---|
| Anaplasma phagocytophilum HZ | TIGR | 1290 | 3480 (3480 non-IEA) | 8/16/2008 | annotations [38.2 KB], README |
| Agrobacterium tumefaciens str. C58 | PAMGO | 83 | 250 (250 non-IEA) | 9/12/2008 | annotations [3.4 KB], README |
| Arabidopsis thaliana | TAIR/TIGR | 43268 | 112430 (92227 non-IEA) | 12/2/2008 | annotations [2.6 MB], README |
| Bacillus anthracis Ames | TIGR | 5282 | 13120 (13120 non-IEA) | 8/16/2008 | annotations [146.2 KB], README |
| Bos taurus | GO Annotations @ EBI | 22740 | 95298 (3715 non-IEA) | 11/15/2008 | annotations [1.1 MB], README |
| Carboxydothermus hydrogenoformans Z-2901 | TIGR | 2611 | 6402 (6402 non-IEA) | 8/16/2008 | annotations [78.3 KB], README |
| Caenorhabditis elegans | WormBase | 17814 | 92197 (45448 non-IEA) | 10/16/2008 | annotations [790.0 KB], README |
| Campylobacter jejuni RM1221 | TIGR | 1830 | 4658 (4658 non-IEA) | 8/16/2008 | annotations [58.8 KB], README |
| Candida albicans | CGD | 3901 | 18278 (6132 non-IEA) | 12/2/2008 | annotations [303.7 KB], README |
| Clostridium perfringens ATCC13124 | TIGR | 2892 | 7465 (7465 non-IEA) | 8/16/2008 | annotations [90.3 KB], README |
| Colwellia psychrerythraea 34H | TIGR | 4752 | 12126 (12126 non-IEA) | 11/15/2008 | annotations [144.7 KB], README |
| Coxiella burnetii RSA 493 | TIGR | 2033 | 5175 (5175 non-IEA) | 8/16/2008 | annotations [57.1 KB], README |
| Danio rerio | ZFIN | 14874 | 95358 (22194 non-IEA) | 11/24/2008 | annotations [1.5 MB], README |
| Dehalococcoides ethenogenes 195 | TIGR | 1584 | 3958 (3958 non-IEA) | 8/16/2008 | annotations [46.6 KB], README |
| Dictyostelium discoideum | dictyBase | 7139 | 29836 (18907 non-IEA) | 11/30/2008 | annotations [406.4 KB], README |
| Drosophila melanogaster | FlyBase | 12472 | 70652 (54522 non-IEA) | 11/21/2008 | annotations [1.1 MB], README |
| Escherichia coli | EcoCyc & EcoliHub | 1568 | 6918 (6857 non-IEA) | 10/25/2008 | annotations [109.9 KB], README |
| Ehrlichia chaffeensis Arkansas | TIGR | 1092 | 2868 (2868 non-IEA) | 8/16/2008 | annotations [34.4 KB], README |
| Gallus gallus | GO Annotations @ EBI | 16376 | 63257 (2011 non-IEA) | 11/15/2008 | annotations [710.3 KB], README |
| Geobacter sulfurreducens PCA | TIGR | 3410 | 8857 (8857 non-IEA) | 8/16/2008 | annotations [98.8 KB], README |
| Homo sapiens | GO Annotations @ EBI | 36726 | 203788 (61458 non-IEA) | 11/15/2008 | annotations [2.9 MB], README |
| Hyphomonas neptunium ATCC 15444 | TIGR | 3109 | 7829 (7829 non-IEA) | 11/15/2008 | annotations [109.0 KB], README |
| Leishmania major | Sanger GeneDB | 3573 | 11441 (28 non-IEA) | 10/11/2008 | annotations [142.1 KB], README |
| Listeria monocytogenes 4b F2365 | TIGR | 2819 | 7028 (7028 non-IEA) | 8/23/2008 | annotations [84.3 KB], README |
| Magnaporthe grisea | PAMGO | 12876 | 51542 (29272 non-IEA) | 10/11/2008 | annotations [586.7 KB], README |
| Methylococcus capsulatus Bath | TIGR | 2920 | 7045 (7045 non-IEA) | 8/16/2008 | annotations [90.0 KB], README |
| Mus musculus | MGI | 18057 | 157755 (57717 non-IEA) | 11/14/2008 | annotations [1.8 MB], README |
| Neorickettsia sennetsu Miyayama | TIGR | 929 | 2439 (2439 non-IEA) | 8/16/2008 | annotations [29.7 KB], README |
| Oomycetes | PAMGO | 30 | 126 (126 non-IEA) | 2/13/2008 | annotations [2.3 KB], README |
| Oryza sativa | Gramene | 52082 | 64070 (64070 non-IEA) | 8/30/2008 | annotations [687.6 KB], README |
| Protein Data Bank [multispecies] | GO Annotations @ EBI | 34261 | 190801 (0 non-IEA) | 9/18/2008 | annotations [991.2 KB], README |
| Plasmodium falciparum | Sanger GeneDB | 2208 | 4654 (4654 non-IEA) | 9/27/2008 | annotations [76.5 KB], README |
| Pseudomonas aeruginosa PAO1 | PseudoCAP | 1519 | 7350 (7350 non-IEA) | 8/16/2008 | annotations [129.0 KB], README |
| Pseudomonas fluorescens Pf-5 | TIGR | 3691 | 9711 (9711 non-IEA) | 11/15/2008 | annotations [105.2 KB], README |
| Pseudomonas syringae DC3000 | TIGR | 4008 | 10272 (10264 non-IEA) | 8/22/2008 | annotations [113.4 KB], README |
| Pseudomonas syringae pv. phaseolicola 1448A | TIGR | 3506 | 9036 (9036 non-IEA) | 8/23/2008 | annotations [111.4 KB], README |
| Rattus norvegicus | RGD | 20236 | 169127 (87760 non-IEA) | 11/15/2008 | annotations [3.4 MB], README |
| Reactome [multispecies] | CSHL & EBI | 258 | 6461 (6461 non-IEA) | 8/16/2008 | annotations [33.8 KB], README |
| Saccharomyces cerevisiae | SGD | 6346 | 84883 (44044 non-IEA) | 12/2/2008 | annotations [1.2 MB], README |
| Schizosaccharomyces pombe | Sanger GeneDB | 5268 | 34365 (29519 non-IEA) | 11/10/2008 | annotations [590.1 KB], README |
| Shewanella oneidensis MR-1 | TIGR | 4843 | 13602 (13602 non-IEA) | 8/16/2008 | annotations [137.4 KB], README |
| Silicibacter pomeroyi DSS-3 | TIGR | 4253 | 10869 (10869 non-IEA) | 8/23/2008 | annotations [133.3 KB], README |
| Solanaceae | SGN | 38 | 68 (68 non-IEA) | 4/26/2008 | annotations [2.4 KB], README |
| Trypanosoma brucei | Sanger GeneDB | 2978 | 10520 (10520 non-IEA) | 11/29/2008 | annotations [179.5 KB], README |
| UniProt [multispecies] | GO Annotations @ EBI | 4202941 | 31004970 (22040 non-IEA) | 10/17/2008 | annotations [208.9 MB], README |
| Vibrio cholerae | TIGR | 3858 | 9430 (9430 non-IEA) | 8/16/2008 | annotations [96.1 KB], README |

Unfiltered Files

These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out of the file; they may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.

Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
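
One way to guard against such duplicates when combining files is sketched below. What counts as "the same annotation" is an assumption here: two rows are treated as duplicates when they agree on database, object ID, qualifier, GO ID, reference, evidence code and with/from (columns 1, 2, 4, 5, 6, 7 and 8 of the annotation file format). The function name and the default key are illustrative and can be adjusted.

```python
from typing import Iterable, List, Sequence, Tuple


def merge_without_duplicates(
    *row_sets: Iterable[List[str]],
    key_columns: Sequence[int] = (0, 1, 3, 4, 5, 6, 7),
) -> List[List[str]]:
    """Concatenate several sets of annotation rows, keeping the first copy of each."""
    seen: set = set()
    merged: List[List[str]] = []
    for rows in row_sets:
        for row in rows:
            # Zero-based indices into the tab-split row form the identity key.
            key: Tuple[str, ...] = tuple(row[i] for i in key_columns)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

Rows from the first argument take precedence, so passing the filtered rows first keeps the filtered copy whenever both files contain the same annotation.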

Statistics as of December 2, 2008

| Species | Database | Gene products annotated | Annotations (non-IEA) | Submission date (MM/DD/YYYY) | Download unfiltered files |
|---|---|---|---|---|---|
| Arabidopsis thaliana | GO Annotations @ EBI | 21496 | 86427 (7647 non-IEA) | 9/18/2008 | annotations [1.2 MB], README |
| Mus musculus | GO Annotations @ EBI | 35546 | 207659 (71281 non-IEA) | 9/18/2008 | annotations [2.7 MB], README |
| Rattus norvegicus | GO Annotations @ EBI | 28113 | 126408 (13676 non-IEA) | 9/18/2008 | annotations [1.4 MB], README |
| Danio rerio | GO Annotations @ EBI | 30097 | 104200 (4831 non-IEA) | 9/18/2008 | annotations [1.1 MB], README |
| Protein Data Bank [multispecies] | GO Annotations @ EBI | 54447 | 304473 (0 non-IEA) | 9/18/2008 | annotations [1.6 MB], README |
| Reactome [multispecies] | CSHL & EBI | 2653 | 18500 (18500 non-IEA) | 6/19/2008 | annotations [153.8 KB], README |
| UniProt [multispecies] | GO Annotations @ EBI | 4587038 | 34169827 (472339 non-IEA) | 10/16/2008 | annotations [236.3 MB], README |

In the tables above, annotation counts are given for all evidence codes and, separately, for all evidence codes except IEA (Inferred from Electronic Annotation). The IEA code means that the association was assigned without human involvement; see the GO evidence code documentation for more details.
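
Counts of this kind can be approximated from a parsed annotation file as sketched below. The code assumes each row is a list of tab-split columns with the database in column 1, the object ID in column 2 and the evidence code in column 7. It may not reproduce the exact figures above, since the counting rules used to build the tables are not stated on this page; the function name is illustrative.

```python
from typing import Iterable, List, Tuple


def summarize(rows: Iterable[List[str]]) -> Tuple[int, int, int]:
    """Return (distinct gene products, total annotations, non-IEA annotations)."""
    gene_products = set()
    total = 0
    non_iea = 0
    for row in rows:
        total += 1
        # A gene product is identified here by (database, object ID).
        gene_products.add((row[0], row[1]))
        if row[6] != "IEA":
            non_iea += 1
    return len(gene_products), total, non_iea
```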

gp2protein files

The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
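
A mapping file of this kind could be loaded into a dictionary as sketched below. The layout is an assumption, not something this page specifies: one mapping per line, the database object ID and the accession list separated by a tab, multiple accessions separated by `;`, and comment lines starting with `!`. Check a real file and adjust the separators if they differ.

```python
from typing import Dict, Iterable, List


def parse_gp2protein(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each database object ID to its list of protein accessions."""
    mapping: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        # Skip blanks and "!" comment lines (assumed convention).
        if not line or line.startswith("!"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # no accession column on this line
        object_id, accessions = parts[0], parts[1]
        mapping[object_id] = [acc for acc in accessions.split(";") if acc]
    return mapping
```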

Last modified Friday, 05-Sep-2008 15:10:14 PDT
Copyright © 1999-2008 the Gene Ontology