GapMind Metabolic Pathway Completeness

GapMind pathway completeness scores for genomes across the KBase pangenome database. GapMind assesses whether a genome can synthesize or catabolize various metabolites by searching for characterized enzymes and transporters.

DATABASE STATISTICS:

- 463,729,001 pathway assessments
- 80 metabolic pathways assessed per genome
- Linked to GTDB species clades from kbase_ke_pangenome

TOP PATHWAYS BY ASSESSMENT COUNT:

| Pathway           | Assessments | Category     |
|-------------------|-------------|--------------|
| phenylalanine     | 18,603,662  | Amino acid   |
| arginine          | 15,945,996  | Amino acid   |
| citrulline        | 14,617,163  | Amino acid   |
| 4-hydroxybenzoate | 14,617,163  | Aromatic     |
| threonine         | 14,617,163  | Amino acid   |
| tryptophan        | 14,617,163  | Amino acid   |
| sucrose           | 13,288,330  | Carbohydrate |
| lactose           | 13,288,330  | Carbohydrate |

SCORING SYSTEM:

- nHi/nMed/nLo: Count of high/medium/low confidence gene hits
- score: Overall pathway score (higher = more complete)
- score_category: "present", "partial", or "not_present"
- score_simplified: 1 (present), 0.5 (partial), 0 (not_present)

USAGE: Join with kbase_ke_pangenome.genome on genome_id and clade_name to analyze metabolic capabilities across species clades.

REFERENCE: Price et al. 2020 "GapMind: Automated Annotation of Amino Acid Biosynthesis" mSystems 5:e00291-20
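The join and scoring described above can be sketched in plain Python. This is a minimal illustration, not the database's own query path: the sample rows, the accessions and clade names in them, and the helper name `mean_completeness_by_clade` are hypothetical, and the genome table is assumed to expose at least `genome_id` and `clade_name`, the two join keys named in the usage note.

```python
from collections import defaultdict

# score_category -> score_simplified, as listed in the scoring system.
SCORE_SIMPLIFIED = {"present": 1.0, "partial": 0.5, "not_present": 0.0}

# Hypothetical rows standing in for gapmind_pathways records.
gapmind_rows = [
    {"genome_id": "GCF_000000001.1", "clade_name": "s__Example_A",
     "pathway": "arginine", "score_category": "present"},
    {"genome_id": "GCF_000000002.1", "clade_name": "s__Example_A",
     "pathway": "arginine", "score_category": "partial"},
    {"genome_id": "GCF_000000001.1", "clade_name": "s__Example_A",
     "pathway": "sucrose", "score_category": "not_present"},
    {"genome_id": "GCF_000000003.1", "clade_name": "s__Example_B",
     "pathway": "arginine", "score_category": "present"},
]

# Hypothetical rows standing in for kbase_ke_pangenome.genome.
genome_rows = [
    {"genome_id": "GCF_000000001.1", "clade_name": "s__Example_A"},
    {"genome_id": "GCF_000000002.1", "clade_name": "s__Example_A"},
]


def mean_completeness_by_clade(pathway_rows, genomes):
    """Inner-join on (genome_id, clade_name), then average the
    simplified score for each (clade_name, pathway) pair."""
    known = {(g["genome_id"], g["clade_name"]) for g in genomes}
    totals = defaultdict(lambda: [0.0, 0])  # key -> [score sum, row count]
    for row in pathway_rows:
        if (row["genome_id"], row["clade_name"]) not in known:
            continue  # no matching genome row, so the join drops it
        acc = totals[(row["clade_name"], row["pathway"])]
        acc[0] += SCORE_SIMPLIFIED[row["score_category"]]
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


result = mean_completeness_by_clade(gapmind_rows, genome_rows)
```

With these sample rows, the arginine pathway in the hypothetical clade `s__Example_A` averages one "present" and one "partial" assessment, and the row for the genome missing from the genome table is excluded.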

URI: https://w3id.org/kbase/gapmind_pathways

Name: gapmind_pathways

Classes

| Class | Description |
| --- | --- |
| GapmindPathway | GapMind pathway completeness assessment for a genome |

Slots

| Slot | Description |
| --- | --- |
| clade_name | GTDB species clade identifier |
| genome_id | RefSeq/GenBank genome assembly accession (GCF_/GCA_) |
| metabolic_category | Broad metabolic category (amino acid, carbon, aromatic) |
| nHi | Count of high-confidence gene hits |
| nLo | Count of low-confidence gene hits |
| nMed | Count of medium-confidence gene hits |
| pathway | Metabolic pathway being assessed |
| score | Overall pathway completeness score |
| score_category | Categorical assessment of pathway completeness |
| score_simplified | Simplified numeric score for aggregation |
| sequence_scope | Whether assessing core or auxiliary pathway genes |
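
The slots above can be read as the fields of a single assessment record. The following is a minimal sketch, assuming integer hit counts, floating-point scores, and string-valued identifiers, since this page does not state the slot ranges. The class name `GapmindPathwayRecord`, the `from_row` helper, and every value in the sample row are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapmindPathwayRecord:
    """One pathway assessment for one genome (field types are assumed)."""
    genome_id: str
    clade_name: str
    pathway: str
    metabolic_category: str
    sequence_scope: str  # kept as a plain string; enum values are not listed here
    nHi: int
    nMed: int
    nLo: int
    score: float
    score_category: str
    score_simplified: float

    @classmethod
    def from_row(cls, row):
        """Build a record from a string-valued mapping, e.g. a parsed TSV row."""
        return cls(
            genome_id=row["genome_id"],
            clade_name=row["clade_name"],
            pathway=row["pathway"],
            metabolic_category=row["metabolic_category"],
            sequence_scope=row["sequence_scope"],
            nHi=int(row["nHi"]),
            nMed=int(row["nMed"]),
            nLo=int(row["nLo"]),
            score=float(row["score"]),
            score_category=row["score_category"],
            score_simplified=float(row["score_simplified"]),
        )


# Hypothetical row; none of these values come from the database.
sample = GapmindPathwayRecord.from_row({
    "genome_id": "GCF_000000001.1",
    "clade_name": "s__Example_A",
    "pathway": "arginine",
    "metabolic_category": "amino acid",
    "sequence_scope": "core",
    "nHi": "3", "nMed": "1", "nLo": "0",
    "score": "2.5",
    "score_category": "partial",
    "score_simplified": "0.5",
})
```

Freezing the dataclass keeps each assessment immutable once parsed, which suits read-only analysis over a large table.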

Enumerations

| Enumeration | Description |
| --- | --- |
| MetabolicCategory | Broad metabolic category for pathways |
| ScoreCategory | Categorical assessment of pathway completeness based on GapMind scoring algor... |
| SequenceScope | Sequence scope for pathway assessment - core or auxiliary genes |

Types

| Type | Description |
| --- | --- |
| Boolean | A binary (true or false) value |
| Curie | A compact URI |
| Date | A date (year, month and day) in an idealized calendar |
| DateOrDatetime | Either a date or a datetime |
| Datetime | The combination of a date and time |
| Decimal | A real number with arbitrary precision that conforms to the xsd:decimal speci... |
| Double | A real number that conforms to the xsd:double specification |
| Float | A real number that conforms to the xsd:float specification |
| Integer | An integer |
| Jsonpath | A string encoding a JSON Path |
| Jsonpointer | A string encoding a JSON Pointer |
| Ncname | Prefix part of CURIE |
| Nodeidentifier | A URI, CURIE or BNODE that represents a node in a model |
| Objectidentifier | A URI or CURIE that represents an object in the model |
| Sparqlpath | A string encoding a SPARQL Property Path |
| String | A character string |
| Time | A time object represents a (local) time of day, independent of any particular... |
| Uri | A complete URI |
| Uriorcurie | A URI or a CURIE |

Subsets

Subset Description