Sequence Reference

Note

This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the GKS Maturity Model.

Computational Definition

A sequence of nucleic or amino acid character codes.

GA4GH Digest

Prefix	Inherent
None	['refgetAccession', 'type']
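
The table above lists refgetAccession and type as the only inherent attributes, so only those two fields would contribute to a digest computed over an object that contains a SequenceReference. The following is a minimal sketch of that filtering step, assuming a compact JSON serialization with sorted keys. The function name `inherent_form`, the constant `INHERENT_KEYS`, and the example values are hypothetical and not taken from any VRS library.

```python
import json

# Inherent attributes, as listed in the GA4GH Digest table above.
INHERENT_KEYS = ("refgetAccession", "type")


def inherent_form(obj: dict) -> str:
    """Return a compact, key-sorted JSON string holding only the inherent attributes.

    Assumption: sorted keys and no insignificant whitespace; the exact
    canonicalization rules are defined by the specification, not by this sketch.
    """
    subset = {key: obj[key] for key in INHERENT_KEYS if key in obj}
    return json.dumps(subset, sort_keys=True, separators=(",", ":"))


# Illustrative object; the id, name, and accession values are placeholders.
example = {
    "id": "local-id-1",
    "name": "example sequence",
    "type": "SequenceReference",
    "refgetAccession": "SQ.aKF498dAxcJAqme6QYQ7EZ07-fiw8Kw2",
    "residueAlphabet": "na",
}

print(inherent_form(example))
```

Fields such as id, name, and residueAlphabet are dropped, so two SequenceReference objects that differ only in those descriptive fields produce the same serialized form.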

Information Model

Some SequenceReference attributes are inherited from Entity.

Field	Flags	Type	Limits	Description
id		string	0..1	The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another.
name		string	0..1	A primary name for the entity.
description		string	0..1	A free-text description of the Entity.
aliases	⋮	string	0..m	Alternative name(s) for the Entity.
extensions	⋮	Extension	0..m	A list of extensions to the Entity that allow capture of information not directly supported by elements defined in the model.
type		string	1..1	MUST be “SequenceReference”
refgetAccession		string	1..1	A GA4GH RefGet identifier for the referenced sequence, using the sha512t24u digest.
residueAlphabet		string	0..1	The interpretation of the character codes referred to by the refget accession, where “aa” specifies an amino acid character set, and “na” specifies a nucleic acid character set.
sequence		sequenceString	0..1	A sequenceString that is a literal representation of the referenced sequence.
moleculeType		string	0..1	Molecule types as defined by RefSeq (see Table 1). MUST be one of “genomic”, “RNA”, “mRNA”, or “protein”.
circular		boolean	0..1	A boolean indicating whether the molecule represented by the sequence is circular (true) or linear (false).
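
The constraints in the table above can be sketched as a small Python data class. This is an illustration only, not the reference implementation: the class layout, the helper names `sha512t24u` and `refget_accession`, and the accession pattern (`SQ.` followed by 32 base64url characters) are assumptions. The accession helper also assumes the input string is already in the normalized form used for digesting.

```python
import base64
import hashlib
import re
from dataclasses import dataclass
from typing import Any, List, Optional

# Assumed shape of a refget accession: "SQ." plus a 32-character sha512t24u digest.
ACCESSION_RE = re.compile(r"^SQ\.[0-9A-Za-z_\-]{32}$")
RESIDUE_ALPHABETS = {"aa", "na"}
MOLECULE_TYPES = {"genomic", "RNA", "mRNA", "protein"}


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, base64url-encoded (32 characters)."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def refget_accession(sequence: str) -> str:
    """Build an accession string for a literal sequence (assumed 'SQ.' prefix)."""
    return "SQ." + sha512t24u(sequence.encode("ascii"))


@dataclass
class SequenceReference:
    refgetAccession: str                      # 1..1
    type: str = "SequenceReference"           # 1..1, fixed value
    id: Optional[str] = None                  # 0..1
    name: Optional[str] = None                # 0..1
    description: Optional[str] = None         # 0..1
    aliases: Optional[List[str]] = None       # 0..m
    extensions: Optional[List[Any]] = None    # 0..m
    residueAlphabet: Optional[str] = None     # 0..1, "aa" or "na"
    sequence: Optional[str] = None            # 0..1, literal sequence string
    moleculeType: Optional[str] = None        # 0..1
    circular: Optional[bool] = None           # 0..1

    def __post_init__(self) -> None:
        # Check the value constraints stated in the Information Model table.
        if self.type != "SequenceReference":
            raise ValueError('type MUST be "SequenceReference"')
        if not ACCESSION_RE.match(self.refgetAccession):
            raise ValueError("refgetAccession does not match the assumed pattern")
        if self.residueAlphabet is not None and self.residueAlphabet not in RESIDUE_ALPHABETS:
            raise ValueError('residueAlphabet must be "aa" or "na"')
        if self.moleculeType is not None and self.moleculeType not in MOLECULE_TYPES:
            raise ValueError("moleculeType must be one of genomic, RNA, mRNA, protein")


# Usage: reference a short nucleic acid sequence and keep the literal alongside it.
ref = SequenceReference(
    refgetAccession=refget_accession("ACGT"),
    residueAlphabet="na",
    sequence="ACGT",
    moleculeType="genomic",
    circular=False,
)
print(ref.refgetAccession)
```

Validation runs at construction time, so an object with an unsupported moleculeType or a malformed accession raises ValueError before it can be stored or serialized.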