Sequence Reference

Note

This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the GKS Maturity Model.

Computational Definition

A sequence of nucleic or amino acid character codes.

GA4GH Digest

Prefix | Inherent
--- | ---
None | ['refgetAccession', 'type']
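As a rough illustration of how the inherent properties listed above might be used, the sketch below keeps only `refgetAccession` and `type` from a SequenceReference and serializes them in a canonical form. The function name `inherent_serialization`, the JSON conventions (sorted keys, compact separators, UTF-8), and the example values are assumptions made for illustration and are not taken from this page. Since no prefix is listed, the sketch stops at the serialized bytes and does not build an identifier.

```python
import json

# Inherent properties listed for SequenceReference on this page.
INHERENT_KEYS = ("refgetAccession", "type")


def inherent_serialization(obj: dict) -> bytes:
    """Serialize only the inherent properties of a SequenceReference.

    Assumption: canonical form means sorted keys, no whitespace, UTF-8.
    """
    inherent = {key: obj[key] for key in INHERENT_KEYS if key in obj}
    return json.dumps(inherent, sort_keys=True, separators=(",", ":")).encode("utf-8")


example = {
    "id": "local-record-1",  # hypothetical local id; not inherent, so dropped
    "type": "SequenceReference",
    "refgetAccession": "SQ.aKF498dAxcJAqme6QYQ7EZ07-fiw8Kw2",  # illustrative value
    "residueAlphabet": "na",  # not inherent, so dropped
}

print(inherent_serialization(example).decode("utf-8"))
```

Under these assumptions, two objects that differ only in non-inherent fields such as `id` or `residueAlphabet` serialize to the same bytes.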

Information Model

Some SequenceReference attributes are inherited from Entity.

Field | Flags | Type | Limits | Description
--- | --- | --- | --- | ---
id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another.
name | | string | 0..1 | A primary name for the entity.
description | | string | 0..1 | A free-text description of the Entity.
aliases | | string | 0..m | Alternative name(s) for the Entity.
extensions | | Extension | 0..m | A list of extensions to the Entity that allow for capture of information not directly supported by elements defined in the model.
type | | string | 1..1 | MUST be “SequenceReference”
refgetAccession | | string | 1..1 | A GA4GH RefGet identifier for the referenced sequence, using the sha512t24u digest.
residueAlphabet | | string | 0..1 | The interpretation of the character codes referred to by the refget accession, where “aa” specifies an amino acid character set, and “na” specifies a nucleic acid character set.
sequence | | sequenceString | 0..1 | A sequenceString that is a literal representation of the referenced sequence.
moleculeType | | string | 0..1 | Molecule types as defined by RefSeq (see Table 1). MUST be one of “genomic”, “RNA”, “mRNA”, or “protein”.
circular | | boolean | 0..1 | A boolean indicating whether the molecule represented by the sequence is circular (true) or linear (false).
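The information model above can be sketched as a small Python data class that enforces the stated limits and allowed values. This is a minimal sketch, not a reference implementation: the helper names `sha512t24u` and `refget_accession`, the `SQ.` prefix, the accession pattern, and the modeling of Extension as a plain `dict` and sequenceString as `str` are all assumptions that do not appear on this page. The digest helper follows the usual reading of sha512t24u: a SHA-512 digest truncated to 24 bytes and base64url-encoded.

```python
import base64
import hashlib
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed accession shape: "SQ." followed by 32 base64url characters.
REFGET_PATTERN = re.compile(r"^SQ\.[0-9A-Za-z_\-]{32}$")
RESIDUE_ALPHABETS = {"aa", "na"}
MOLECULE_TYPES = {"genomic", "RNA", "mRNA", "protein"}


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes, base64url-encoded (32 chars)."""
    digest = hashlib.sha512(blob).digest()[:24]
    return base64.urlsafe_b64encode(digest).decode("ascii")


def refget_accession(sequence: str) -> str:
    """Assumed convention: 'SQ.' + sha512t24u of the ASCII-encoded sequence."""
    return "SQ." + sha512t24u(sequence.encode("ascii"))


@dataclass
class SequenceReference:
    refgetAccession: str                       # 1..1
    type: str = "SequenceReference"            # 1..1, fixed value
    id: Optional[str] = None                   # 0..1
    name: Optional[str] = None                 # 0..1
    description: Optional[str] = None          # 0..1
    aliases: List[str] = field(default_factory=list)      # 0..m
    extensions: List[dict] = field(default_factory=list)  # 0..m, Extension as dict (assumption)
    residueAlphabet: Optional[str] = None      # 0..1, "aa" or "na"
    sequence: Optional[str] = None             # 0..1, sequenceString as str (assumption)
    moleculeType: Optional[str] = None         # 0..1
    circular: Optional[bool] = None            # 0..1

    def __post_init__(self) -> None:
        if self.type != "SequenceReference":
            raise ValueError('type MUST be "SequenceReference"')
        if not REFGET_PATTERN.match(self.refgetAccession):
            raise ValueError("refgetAccession does not match the assumed pattern")
        if self.residueAlphabet is not None and self.residueAlphabet not in RESIDUE_ALPHABETS:
            raise ValueError('residueAlphabet must be "aa" or "na"')
        if self.moleculeType is not None and self.moleculeType not in MOLECULE_TYPES:
            raise ValueError("moleculeType must be one of genomic, RNA, mRNA, protein")


ref = SequenceReference(
    refgetAccession=refget_accession("ACGT"),
    sequence="ACGT",
    residueAlphabet="na",
    moleculeType="genomic",
    circular=False,
)
print(ref.refgetAccession)
```

Validating in `__post_init__` keeps the optional fields (limits 0..1 or 0..m) free to be omitted while still rejecting values outside the enumerations in the table.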