the Gene Ontology


Editorial Style Guide

The GO Style Guide introduces new users to (and reminds old users of) both the philosophy and the practicalities behind developing and maintaining GO. Its main purpose is to serve as a user manual for GO curators. You will find it more useful if you first read An Introduction to GO for more general background information about the GO project and how the ontology works. Information on annotating genes and gene products to GO can be found in the GO Annotation Guide and information on the structure and syntax of the GO files can be found in the GO File Format Guide.

Stylistic points and curator protocol can now be found on the GO wiki.

  • Understanding relationships in GO
  • The is a relationship
  • The part of relationship
  • The regulates relationship
  • Subsumption paths in GO
  • True path rule

Understanding relationships in GO

The GO ontologies are structured as a directed acyclic graph (DAG), which means that a child (more specialized) term can have multiple parents (less specialized terms). This makes GO a powerful system to describe biology, but can also create some pitfalls for curators. Keeping the following guidelines in mind should help you to avoid these problems.

A child term can have one of two different relationships to its parent(s): is a or part of. The same term can have different relationships to different parents; for example, the child 'GO term 3' may be an is a of its parent 'GO term 1' and a part of its other parent, 'GO term 2':

[Figure: parentage diagram]

The is a relationship

In GO, an is a relationship means that the term is a subclass of its parent. For example, mitotic cell cycle is a cell cycle. A subclass should not be confused with an 'instance', which is a specific example. For example, clogs are a subclass of shoes (clogs is a shoes), while the shoes I have on my feet now are an instance of shoes. GO, like most ontologies, does not use instances. The is a relationship is transitive, which means that if 'GO term A' is a 'GO term B', and 'GO term B' is a 'GO term C', then 'GO term A' is a 'GO term C':

[Figure: is a transitivity]

For example:
Terminal N-glycosylation is a subclass of terminal glycosylation.
Terminal glycosylation is a subclass of protein glycosylation.
Terminal N-glycosylation is a subclass of protein glycosylation.
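
The transitivity of is a can be illustrated with a small sketch. This is only a toy model, not how GO or OBO-Edit store the ontology: the `IS_A` dictionary and the `ancestors` function are hypothetical names introduced here, and the three terms are the ones from the example above.

```python
from collections import deque

# Hypothetical toy data: each term maps to its direct is a parents.
IS_A = {
    "terminal N-glycosylation": ["terminal glycosylation"],
    "terminal glycosylation": ["protein glycosylation"],
    "protein glycosylation": [],
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            # Transitivity: the parents of a parent are ancestors too.
            queue.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("terminal N-glycosylation", IS_A)))
# ['protein glycosylation', 'terminal glycosylation']
```

Only the two direct links are stored; the third statement in the example (terminal N-glycosylation is a protein glycosylation) falls out of the traversal.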

The part of relationship

The use of part of in GO is more complex. There are four basic levels of restriction for a part of relationship:

[Figure: different types of part of]

The first type has no restrictions. That is, no inferences can be made from the relationship between parent and child other than that the parent may or may not have the child as a part, and the child may or may not be a part of the parent.

The second type, necessarily is_part, means that wherever the child exists, it is as part of the parent. To give a biological example, replication fork is part of chromosome, so whenever replication fork occurs, it is a part of a chromosome, but chromosome does not necessarily have part replication fork.

Type three, necessarily has_part, is the exact inverse of type two; wherever the parent exists, it has the child as a part, but the child is not necessarily part of the parent. For example, nucleus always has chromosome as a part, but chromosome isn't necessarily part of the nucleus.

The final type is a combination of both two and three, has part and is part. An example of this is nuclear membrane is part of nucleus. So nucleus always has the part nuclear membrane, and nuclear membrane is always a part of the nucleus.

The part of relationship used in GO is usually type two, necessarily is_part. Note that part of types one and three are not used in GO, as they would violate the true path rule. Like is a, part of is transitive, so that if 'GO term A' is part of 'GO term B', and 'GO term B' is part of 'GO term C', then 'GO term A' is part of 'GO term C':

[Figure: part of transitivity]

For example:
Laminin-1 is part of basal lamina.
Basal lamina is part of basement membrane.
Laminin-1 is part of basement membrane.
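
The four restriction levels described earlier in this section can be summarised as two yes/no questions: is the child always part of the parent, and does the parent always have the child as a part? The sketch below encodes that reading. The names `PartOfType`, `classify` and `usable_in_go` are hypothetical and exist only for this illustration; they are not part of GO or OBO-Edit.

```python
from enum import Enum


class PartOfType(Enum):
    UNRESTRICTED = 1          # type one: no inference can be made
    NECESSARILY_IS_PART = 2   # type two: the child is always part of the parent
    NECESSARILY_HAS_PART = 3  # type three: the parent always has the child
    HAS_PART_AND_IS_PART = 4  # type four: both of the above hold


def classify(child_always_part_of_parent, parent_always_has_child):
    """Map the two yes/no questions onto one of the four types."""
    if child_always_part_of_parent and parent_always_has_child:
        return PartOfType.HAS_PART_AND_IS_PART
    if child_always_part_of_parent:
        return PartOfType.NECESSARILY_IS_PART
    if parent_always_has_child:
        return PartOfType.NECESSARILY_HAS_PART
    return PartOfType.UNRESTRICTED


def usable_in_go(kind):
    """Types one and three are not used, as they would break the true path rule."""
    return kind in (PartOfType.NECESSARILY_IS_PART,
                    PartOfType.HAS_PART_AND_IS_PART)


# Examples from the text, written as (child, parent):
print(classify(True, False))   # replication fork, chromosome
print(classify(False, True))   # chromosome, nucleus
print(classify(True, True))    # nuclear membrane, nucleus
```

The common thread in the two usable types is that the child must always be part of its parent, which is the condition the true path rule relies on.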

The regulates relationship

In GO, a regulates relationship means that the term is a process that modulates its parent process. For example, regulation of transcription regulates transcription. The regulation of a process is not a part of the process itself. For example, regulation of transcription describes the processes that affect the transcriptional machinery to modulate its activity.

The ontology editing tool OBO-Edit allows you to specify the necessity of relationships. The part of relationship used in GO, necessarily is_part, would correspond to part of, [inverse] necessarily true. For more information, see the OBO-Edit user guide.

For information on how these relationships are represented in the GO flat files, see the GO File Format Guide.

For technical information on the relationships used in GO and OBO, see the OBO relationships ontology.

Subsumption paths in GO

Not all of the GO ontologies currently have complete subsumption paths, that is, where every term has at least one path of is a relationships back to the top node. There are several reasons why completing the subsumption hierarchy is a vital aim for GO.

Ontologically correct

Logically, everything that exists is a kind of something else; this applies to all entities in GO.

More accurate queries and reasoning

Without full subsumption paths we cannot get complete answers to queries such as "show me all the different kinds of membrane" or "show me all the different kinds of protein complex". Put another way, without these extra paths, GO is not complete.

Better compatibility with ontology tools

The majority of ontology tools, with the exception of OBO-Edit, assume that the subsumption hierarchy for an ontology is complete, so completing the subsumption hierarchy for GO will make it more compatible with existing tools, such as Protege-Frames, Protege-OWL and SWOOP.

Improved visualisation

Having complete is a and part of paths in GO will allow the design of tools to display alternative is a and part of views. This is a more intuitive way to view GO, as it disentangles the complicated mixed-relation view.

To this end, we have recently completed the subsumption hierarchy for the cellular component ontology. This was achieved by creating a set of new high-level terms ending in part. So for example, the term membrane was formerly only a part of child of cell; it had no is a parent:

cellular component
[i] cell
---[p] membrane

In the new structure, membrane is now part of cell, and is a cell part:

cellular component
[i] cell
---[p] cell part
------[i] membrane

[Note that every cell part is implicitly part of cell: cell part is part of cell, and any term that is a cell part inherits that part of relationship through the is a link.]

So the is a path would be:

cellular component
[i] cell part
---[i] membrane

And the part of path:

cell
[p] membrane
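
A check for complete subsumption paths can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the GO editors: the `(child, relation, parent)` tuples and the function names are hypothetical, and the two edge lists encode only the old and new cellular component fragments shown above.

```python
# Hypothetical encoding: (child, relation, parent) for each link in the diagrams.
OLD = [
    ("cell", "is_a", "cellular component"),
    ("membrane", "part_of", "cell"),
]
NEW = [
    ("cell", "is_a", "cellular component"),
    ("cell part", "is_a", "cellular component"),
    ("cell part", "part_of", "cell"),
    ("membrane", "is_a", "cell part"),
]


def is_a_parents(edges):
    """Build a term -> set of is a parents map, ignoring part of links."""
    parents = {}
    for child, relation, parent in edges:
        parents.setdefault(child, set())
        parents.setdefault(parent, set())
        if relation == "is_a":
            parents[child].add(parent)
    return parents


def reaches_root(term, parents, root):
    """True if `term` has a path made only of is a links up to `root`."""
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current == root:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def incomplete_terms(edges, root):
    """List the terms that lack an is a path back to the top node."""
    parents = is_a_parents(edges)
    return sorted(t for t in parents if not reaches_root(t, parents, root))


print(incomplete_terms(OLD, "cellular component"))  # ['membrane']
print(incomplete_terms(NEW, "cellular component"))  # []
```

In the old fragment membrane is flagged because its only parent link is part of; adding cell part gives it an is a path to the top node.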

We are working on completing the process ontology subsumption hierarchy, which we hope will be done in 2007.

For more information on the is a relationship, see the OBO Relations Ontology.

True path rule

The true path rule states that "the pathway from a child term all the way up to its top-level parent(s) must always be true". One of the implications of this is that the type of part of relationship used in GO, outlined more fully in the part of relationship documentation above, is restricted to those types where a child term must always be part of its parent.

Often, annotating a new gene product reveals relationships in an ontology that break the true path rule, or species specificity becomes a problem. In such cases, the ontology must be restructured by adding more nodes and connecting terms such that any path upwards is true. When a term is added to the ontology, the curator needs to add all of the parents and children of the new term.

This becomes clear with an example: consider how chitin metabolism is represented in the process ontology. Chitin metabolism is a part of cuticle synthesis in the fly and is also part of cell wall organization in yeast. This was once represented in the process ontology as follows:

cuticle synthesis
[i] chitin metabolism
cell wall biosynthesis
[i] chitin metabolism
---[i] chitin biosynthesis
---[i] chitin catabolism

The problem with this organization becomes apparent when one tries to annotate a specific gene product from one species. A fly chitin synthase could be annotated to chitin biosynthesis, and appear in a query for genes annotated to cell wall biosynthesis (and its children), which makes no sense because flies don't have cell walls.

This is the revised ontology structure which ensures that the true path rule is not broken:

chitin metabolism
[i] chitin biosynthesis
[i] chitin catabolism
[i] cuticle chitin metabolism
---[i] cuticle chitin biosynthesis
---[i] cuticle chitin catabolism
[i] cell wall chitin metabolism
---[i] cell wall chitin biosynthesis
---[i] cell wall chitin catabolism

The parent chitin metabolism now has the child terms cuticle chitin metabolism and cell wall chitin metabolism, with the appropriate catabolism and synthesis terms beneath them. With this structure, all the daughter terms can be followed up to chitin metabolism, but cuticle chitin metabolism terms do not trace back to cell wall terms, so all the paths are true. In addition, gene products such as chitin synthase can be annotated to nodes of appropriate granularity in both yeast and flies, and queries will yield the expected results.
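
The effect of the restructuring on queries can be sketched in a few lines. This is a toy model that encodes only the links shown in the two diagrams above; `OLD_PARENTS`, `NEW_PARENTS`, `implied_terms` and `cell_wall_terms` are hypothetical names used for illustration, and matching on the words "cell wall" is a shortcut for this example only.

```python
# Hypothetical toy data: each term maps to its direct parents.
OLD_PARENTS = {
    "chitin biosynthesis": ["chitin metabolism"],
    "chitin catabolism": ["chitin metabolism"],
    "chitin metabolism": ["cuticle synthesis", "cell wall biosynthesis"],
}
NEW_PARENTS = {
    "chitin biosynthesis": ["chitin metabolism"],
    "chitin catabolism": ["chitin metabolism"],
    "cuticle chitin metabolism": ["chitin metabolism"],
    "cuticle chitin biosynthesis": ["cuticle chitin metabolism"],
    "cuticle chitin catabolism": ["cuticle chitin metabolism"],
    "cell wall chitin metabolism": ["chitin metabolism"],
    "cell wall chitin biosynthesis": ["cell wall chitin metabolism"],
    "cell wall chitin catabolism": ["cell wall chitin metabolism"],
}


def implied_terms(term, parents):
    """Return the annotated term plus every ancestor it implies."""
    implied = {term}
    for parent in parents.get(term, []):
        implied |= implied_terms(parent, parents)
    return implied


def cell_wall_terms(terms):
    """Pick out the implied terms that mention the cell wall."""
    return sorted(t for t in terms if "cell wall" in t)


# A fly chitin synthase annotated under the old and the revised structure.
old = implied_terms("chitin biosynthesis", OLD_PARENTS)
new = implied_terms("cuticle chitin biosynthesis", NEW_PARENTS)

print(cell_wall_terms(old))  # ['cell wall biosynthesis']
print(cell_wall_terms(new))  # []
```

Under the old structure the fly annotation implies a cell wall term, which is the false path described above; under the revised structure every implied term is true for the fly gene product.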



Last modified Monday, 15-Jun-2009 18:10:18 PDT
Copyright © 1999-2009 the Gene Ontology