
Enum: TokenEntityType

Types of entities in the vocabulary registry for tokenization

URI: https://w3id.org/kbase/nmdc_core/TokenEntityType

Permissible Values

Value          Meaning  Description
special        None     Special tokens for ML models including [PAD], [CLS], [SEP], [MASK], [UNK]
taxon          None     Taxonomic entities (species, genera, etc.) for taxonomy features
go_term        None     Gene Ontology terms for functional trait features
compound       None     Chemical compounds for metabolomics/biochemical features
environmental  None     Environmental parameters for abiotic features
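
The permissible values above could be mirrored in application code roughly as follows. This is a hand-written sketch using Python's standard `enum` module, not the class that LinkML tooling would generate for this schema; the member names and the `parse_entity_type` helper are assumptions made for illustration.

```python
from enum import Enum


class TokenEntityType(str, Enum):
    """Hand-written mirror of the TokenEntityType permissible values (illustrative)."""

    SPECIAL = "special"              # system tokens such as [PAD], [CLS], [SEP], [MASK], [UNK]
    TAXON = "taxon"                  # taxonomic entities
    GO_TERM = "go_term"              # Gene Ontology terms
    COMPOUND = "compound"            # chemical compounds
    ENVIRONMENTAL = "environmental"  # environmental parameters


def parse_entity_type(raw: str) -> TokenEntityType:
    """Hypothetical helper: map a raw string to a member.

    Raises ValueError when the string is not one of the permissible values.
    """
    return TokenEntityType(raw)
```

Looking members up by value means a string outside the enumeration fails immediately instead of propagating into the vocabulary registry.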

Slots

Name         Description
entity_type  Type of entity this token represents
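
As a rough illustration of how the `entity_type` slot constrains data, the sketch below checks a list of token records against the five permissible values. The record layout (plain dictionaries carrying an `entity_type` key) and the function name are assumptions; the schema page does not show the class that carries this slot.

```python
# Permissible values copied from the TokenEntityType enumeration.
PERMISSIBLE_ENTITY_TYPES = frozenset(
    {"special", "taxon", "go_term", "compound", "environmental"}
)


def invalid_entity_types(records):
    """Return the indices of records whose entity_type is missing or not permissible.

    `records` is assumed to be a list of dictionaries (hypothetical layout).
    """
    return [
        index
        for index, record in enumerate(records)
        if record.get("entity_type") not in PERMISSIBLE_ENTITY_TYPES
    ]
```

A caller might run this check before loading records into the registry and reject the batch if the returned list is non-empty.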

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/kbase/nmdc_core

LinkML Source

name: TokenEntityType
description: Types of entities in the vocabulary registry for tokenization
from_schema: https://w3id.org/kbase/nmdc_core
rank: 1000
permissible_values:
  special:
    text: special
    description: Special tokens for ML models including [PAD], [CLS], [SEP], [MASK],
      [UNK]. These are system tokens not representing biological entities.
  taxon:
    text: taxon
    description: Taxonomic entities (species, genera, etc.) for taxonomy features
  go_term:
    text: go_term
    description: Gene Ontology terms for functional trait features
  compound:
    text: compound
    description: Chemical compounds for metabolomics/biochemical features
  environmental:
    text: environmental
    description: Environmental parameters for abiotic features