cfde_schema

A complete list of schematic specifications for the resources (TSV table files) that will be used to represent C2M2 DCC metadata prior to ingest into the C2M2 database system

URI: https://w3id.org/linkml/cfde
Name: cfde_schema

Classes

| Class | Description |
| --- | --- |
| AnalysisType | List of Ontology for Biomedical Investigations (OBI) CV terms used to describe analytic methods that generate C2M2 files |
| Anatomy | List of Uber-anatomy ontology (UBERON) CV terms used to locate the origin of a C2M2 biosample within the physiology of its source or host organism |
| AssayType | List of Ontology for Biomedical Investigations (OBI) CV terms used to describe types of experiment that generate C2M2 biosamples or results stored in C2M2 files |
| Biosample | A tissue sample or other physical specimen |
| BiosampleDisease | Association between a C2M2 biosample and a disease positively (e.g. cancer tumor tissue sample) OR negatively (e.g. cancer-free tissue sample) identified for that biosample |
| BiosampleFromSubject | Association between a biosample and its source subject |
| BiosampleGene | Association between a C2M2 biosample and an Ensembl gene especially relevant to it |
| BiosampleInCollection | Association between a biosample and a (containing) collection |
| BiosampleSubstance | Association between a C2M2 biosample and a PubChem substance experimentally associated with that biosample |
| Collection | A grouping of C2M2 files, biosamples and/or subjects |
| CollectionAnatomy | Association between an UBERON anatomical term and a C2M2 collection containing experimental resources directly related to the study of the anatomical concept described by that term |
| CollectionCompound | Association between a compound and a C2M2 collection containing experimental resources directly related to the study of that compound |
| CollectionDefinedByProject | (Shallow) association between a collection and a project that defined it |
| CollectionDisease | Association between a disease and a C2M2 collection containing experimental resources directly related to the study of that disease |
| CollectionGene | Association between a gene and a C2M2 collection containing experimental resources directly related to the study of that gene |
| CollectionInCollection | Association between a containing collection (superset) and a contained collection (subset) |
| CollectionPhenotype | Association between a phenotype and a C2M2 collection containing experimental resources directly related to the study of that phenotype |
| CollectionProtein | Association between a protein and a C2M2 collection containing experimental resources directly related to the study of that protein |
| CollectionSubstance | Association between a substance and a C2M2 collection containing experimental resources directly related to the study of that substance |
| CollectionTaxonomy | Association between a taxon and a C2M2 collection containing experimental resources directly related to the study of that taxon |
| Compound | List of (i) GlyTouCan terms or (ii) PubChem 'compound' terms (normalized chemical structures) referenced in this submission; (ii) will include all PubChem 'compound' terms associated with any PubChem 'substance' terms (specific formulations of chemical materials) directly referenced in this submission, in addition to any 'compound' terms directly referenced |
| DataType | List of EDAM CV 'data:' terms used to describe data in C2M2 files |
| Dcc | The Common Fund program or data coordinating center (DCC, identified by the given project foreign key) that produced this C2M2 instance |
| Disease | List of Disease Ontology terms used to describe diseases recorded in association with C2M2 subjects or biosamples |
| File | A stable digital asset |
| FileDescribesBiosample | Association between a biosample and a file containing information about that biosample |
| FileDescribesCollection | Association between a summary file and an entire collection described by that file |
| FileDescribesSubject | Association between a subject and a file containing information about that subject |
| FileFormat | List of EDAM CV 'format:' terms used to describe formats of C2M2 files |
| FileInCollection | Association between a file and a (containing) collection |
| Gene | List of Ensembl genes directly referenced in this C2M2 submission |
| IdNamespace | A table listing identifier namespaces registered by the DCC submitting this C2M2 instance |
| NcbiTaxonomy | List of NCBI Taxonomy Database IDs identifying taxa used to describe C2M2 subjects |
| Phenotype | List of Human Phenotype Ontology terms used to describe phenotypes recorded in association with C2M2 subjects |
| PhenotypeDisease | Association between a Human Phenotype Ontology term and a Disease Ontology term identifying a disease especially relevant to it |
| PhenotypeGene | Association between a Human Phenotype Ontology term and an Ensembl gene especially relevant to it |
| Project | A node in the C2M2 project hierarchy subdividing all resources described by this DCC's C2M2 metadata |
| ProjectInProject | Association between a child project and its parent |
| Protein | List of UniProtKB proteins directly referenced in this C2M2 submission |
| ProteinGene | Association between a UniProtKB protein term and an Ensembl term identifying a gene encoding that protein |
| Subject | A biological entity from which a C2M2 biosample can in principle be generated |
| SubjectDisease | Association between a C2M2 subject and a disease positively OR negatively clinically identified in that subject |
| SubjectInCollection | Association between a subject and a (containing) collection |
| SubjectPhenotype | Association between a C2M2 subject and a phenotype positively OR negatively clinically identified for that subject |
| SubjectRace | Identification of a C2M2 subject with one or more self-selected races |
| SubjectRoleTaxonomy | Trinary association linking IDs representing (1) a subject, (2) a subject_role (a named organism-level constituent component of a subject, like 'host', 'pathogen', 'endosymbiont', 'taxon detected inside a microbiome subject', etc.) and (3) a taxonomic label (which is hereby assigned to this particular subject_role within this particular subject) |
| SubjectSubstance | Association between a C2M2 subject and a PubChem substance experimentally associated with that subject |
| Substance | List of PubChem 'substance' terms (specific formulations of chemical materials) directly referenced in this C2M2 submission |
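
Many of the classes above are association tables that link two entities through their composite keys. As a rough illustration only, the following Python sketch checks that each BiosampleFromSubject row points at a biosample and a subject that actually exist. The column layout of the TSV files, the sample rows, the `example.org/ns` namespace, and the helper names (`read_tsv`, `primary_keys`, `dangling_links`) are all assumptions made for this example; only the slot names are taken from this page.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text into a list of dict rows keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def primary_keys(rows):
    """Collect (id_namespace, local_id) composite keys from entity rows."""
    return {(r["id_namespace"], r["local_id"]) for r in rows}


def dangling_links(assoc_rows, biosample_keys, subject_keys):
    """Return association rows whose biosample or subject key is unknown."""
    bad = []
    for r in assoc_rows:
        biosample = (r["biosample_id_namespace"], r["biosample_local_id"])
        subject = (r["subject_id_namespace"], r["subject_local_id"])
        if biosample not in biosample_keys or subject not in subject_keys:
            bad.append(r)
    return bad


# Hypothetical sample data; real submissions carry many more columns.
BIOSAMPLE_TSV = "id_namespace\tlocal_id\nexample.org/ns\tB1\nexample.org/ns\tB2\n"
SUBJECT_TSV = "id_namespace\tlocal_id\nexample.org/ns\tS1\n"
ASSOC_TSV = (
    "biosample_id_namespace\tbiosample_local_id\t"
    "subject_id_namespace\tsubject_local_id\n"
    "example.org/ns\tB1\texample.org/ns\tS1\n"
    "example.org/ns\tB2\texample.org/ns\tS9\n"  # S9 is not a known subject
)

biosamples = primary_keys(read_tsv(BIOSAMPLE_TSV))
subjects = primary_keys(read_tsv(SUBJECT_TSV))
bad = dangling_links(read_tsv(ASSOC_TSV), biosamples, subjects)
```

Here `bad` holds the single row that references the unknown subject `S9`. The same pattern could be adapted to the other association classes by swapping in their key columns.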

Slots

| Slot | Description |
| --- | --- |
| abbreviation | A very short display label for this project |
| age_at_enrollment | The age in years (with a fixed precision of two digits past the decimal point) of this subject when they were first enrolled in the primary project within which they were studied |
| age_at_sampling | The age in years (with a fixed precision of two digits past the decimal point) of this subject when this biosample was taken |
| analysis_type | An OBI CV term ID describing the type of analytic operation that generated this file |
| anatomy | An UBERON CV term ID used to locate the origin of this biosample within the physiology of its source or host organism |
| assay_type | An OBI CV term ID describing the type of experiment that generated the results summarized by this file |
| association_type | The relationship between this biosample and this disease (e.g. 'observed' or '(tested for, but) not observed') |
| biosample_id_namespace | Identifier namespace for this biosample |
| biosample_local_id | The ID of this biosample |
| bundle_collection_id_namespace | If this file is a bundle encoding more than one sub-file, this field gives the id_namespace of a collection listing the bundle's sub-file contents; null otherwise |
| bundle_collection_local_id | If this file is a bundle encoding more than one sub-file, this field gives the local_id of a collection listing the bundle's sub-file contents; null otherwise |
| child_project_id_namespace | ID of the identifier namespace for the child in this parent-child project pair |
| child_project_local_id | The ID of the contained (child) project |
| clade | The phylogenetic level (e.g. species, genus) assigned to this taxon |
| collection_id_namespace | Identifier namespace for this collection |
| collection_local_id | The ID of this collection |
| compound | A PubChem or GlyTouCan term ID describing this compound |
| compression_format | An EDAM CV term ID identifying the compression format of this file (e.g. gzip or bzip2): null if this file is not compressed |
| contact_email | Email address of this DCC's primary technical contact |
| contact_name | Name of this DCC's primary technical contact |
| creation_time | An ISO 8601 -- RFC 3339 (subset)-compliant timestamp documenting this file's creation time: YYYY-MM-DDTHH:MM:SS±NN:NN |
| data_type | An EDAM CV term ID identifying the type of information stored in this file (e.g. RNA sequence reads): null if is_bundle is set to true |
| dbgap_study_id | The name of a dbGaP study ID governing access control for this file, compatible for comparison to RAS user-level access control metadata |
| dcc_abbreviation | A very short display label for this contact's DCC |
| dcc_description | A human-readable description of this DCC |
| dcc_name | A short, human-readable, machine-read-friendly label for this DCC |
| dcc_url | URL of the front page of the website for this DCC |
| description | A human-readable description of this project |
| disease | A Disease Ontology CV term ID describing this disease |
| ethnicity | A CFDE CV category characterizing the self-reported ethnicity of this subject |
| file_format | An EDAM CV term ID identifying the digital format of this file (e.g. TSV or FASTQ): if this file is compressed, this should be its uncompressed format |
| file_id_namespace | Identifier namespace for this file |
| file_local_id | The ID of this file |
| filename | A filename with no prepended PATH information |
| gene | An Ensembl term ID describing this gene |
| granularity | A CFDE CV category characterizing this subject by multiplicity |
| has_time_series_data | Does this collection contain time-series data? (allowed values: [true |
| id | The identifier for this DCC, issued by the CFDE-CC |
| id_namespace | A CFDE-cleared identifier representing the top-level data space containing this file [part 1 of 2-component composite primary key] |
| local_id | An identifier representing this file, unique within this id_namespace [part 2 of 2-component composite primary key] |
| md5 | (allowed) MD5 checksum for this file [sha256, md5 cannot both be null] |
| mime_type | A MIME type describing this file |
| organism | An NCBI Taxonomy Database ID identifying this gene's source organism (e.g. 'NCBI:txid9606') |
| parent_project_id_namespace | ID of the identifier namespace for the parent in this parent-child project pair |
| parent_project_local_id | The ID of the containing (parent) project |
| persistent_id | A persistent, resolvable (not necessarily retrievable) URI or compact ID permanently attached to this file |
| phenotype | A Human Phenotype Ontology CV term ID describing this phenotype |
| project_id_namespace | The id_namespace of the primary project within which this file was created [part 1 of 2-component composite foreign key] |
| project_local_id | The local_id of the primary project within which this file was created [part 2 of 2-component composite foreign key] |
| protein | A UniProtKB term ID describing this protein |
| race | A race self-identified by this subject |
| role_id | The ID of the role assigned to this organism-level constituent component of this subject |
| sex | A CFDE CV category characterizing the physiological sex of this subject |
| sha256 | (preferred) SHA-256 checksum for this file [sha256, md5 cannot both be null] |
| size_in_bytes | The size of this file in bytes |
| subject_id_namespace | Identifier namespace for this subject |
| subject_local_id | The ID of this subject |
| subset_collection_id_namespace | ID of the identifier namespace corresponding to the C2M2 submission containing the subset collection |
| subset_collection_local_id | The ID of the subset collection |
| substance | A PubChem term ID describing this substance |
| superset_collection_id_namespace | ID of the identifier namespace corresponding to the C2M2 submission containing the superset collection |
| superset_collection_local_id | The ID of the superset collection |
| synonyms | A list of synonyms for this term as identified by the OBI metadata |
| taxon | An NCBI Taxonomy Database ID identifying this taxon |
| taxonomy_id | An NCBI Taxonomy Database ID identifying this taxon |
| uncompressed_size_in_bytes | The total decompressed size in bytes of the contents of this file: null if this file is not compressed |
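
Several of the file slots above carry constraints that can be checked mechanically: sha256 and md5 cannot both be null, creation_time follows the YYYY-MM-DDTHH:MM:SS±NN:NN shape, and uncompressed_size_in_bytes is null for uncompressed files. The Python sketch below illustrates one possible pre-submission check for a single file row. It assumes that a row is a dict of strings and that null is represented by an empty string or `None`; the function names `is_null` and `check_file_row` are hypothetical and not part of any CFDE tooling.

```python
import re
from datetime import datetime

# Shape of the timestamp described for creation_time: date, 'T', time, UTC offset.
TIMESTAMP_SHAPE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}$"
)


def is_null(value):
    """Assumption: null is encoded as None or an empty string."""
    return value is None or value == ""


def check_file_row(row):
    """Return a list of human-readable problems found in one file row."""
    problems = []

    # Rule from the sha256/md5 descriptions: at least one checksum is required.
    if is_null(row.get("sha256")) and is_null(row.get("md5")):
        problems.append("sha256 and md5 are both null")

    # creation_time must have the documented shape and be a real date and time.
    ts = row.get("creation_time")
    if not is_null(ts):
        if not TIMESTAMP_SHAPE.match(ts):
            problems.append("creation_time is not YYYY-MM-DDTHH:MM:SS with a +/-NN:NN offset")
        else:
            try:
                datetime.fromisoformat(ts)
            except ValueError:
                problems.append("creation_time is not a valid date and time")

    # uncompressed_size_in_bytes should be null when the file is not compressed.
    if is_null(row.get("compression_format")) and not is_null(
        row.get("uncompressed_size_in_bytes")
    ):
        problems.append("uncompressed_size_in_bytes is set but compression_format is null")

    return problems
```

A row with one checksum, a well-formed timestamp, and no compression fields yields an empty list, while a row that breaks all three rules yields three messages. Other constraints on this page, such as the bundle fields, could be added in the same style.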

Enumerations

| Enumeration | Description |
| --- | --- |
| AssociationTypeEnum | None |
| EthnicityEnum | None |
| GranularityEnum | None |
| RaceEnum | None |
| RoleIdEnum | None |
| SexEnum | None |

Subsets

Subset Description