GapMind pathway completeness scores for genomes across the KBase pangenome database. GapMind assesses whether a genome can synthesize or catabolize various metabolites by searching for characterized enzymes and transporters.
DATABASE STATISTICS:
- 463,729,001 pathway assessments
- 80 metabolic pathways assessed per genome
- Linked to GTDB species clades from kbase_ke_pangenome
TOP PATHWAYS BY ASSESSMENT COUNT:
| Pathway           | Assessments | Category     |
|-------------------|-------------|--------------|
| phenylalanine     | 18,603,662  | Amino acid   |
| arginine          | 15,945,996  | Amino acid   |
| citrulline        | 14,617,163  | Amino acid   |
| 4-hydroxybenzoate | 14,617,163  | Aromatic     |
| threonine         | 14,617,163  | Amino acid   |
| tryptophan        | 14,617,163  | Amino acid   |
| sucrose           | 13,288,330  | Carbohydrate |
| lactose           | 13,288,330  | Carbohydrate |
SCORING SYSTEM:
- nHi/nMed/nLo: Count of high/medium/low confidence gene hits
- score: Overall pathway score (higher = more complete)
- score_category: "present", "partial", or "not_present"
- score_simplified: 1 (present), 0.5 (partial), 0 (not_present)
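The mapping from score_category to score_simplified makes it straightforward to average completeness over many genomes. The following is a minimal sketch, not part of the schema: the function name `mean_completeness`, the clade names, and the toy rows are invented for illustration, and rows are assumed to be plain dictionaries keyed by slot name.

```python
from collections import defaultdict

# Category-to-number mapping as stated in the scoring description.
SCORE_SIMPLIFIED = {"present": 1.0, "partial": 0.5, "not_present": 0.0}


def mean_completeness(rows):
    """Average score_simplified per (clade_name, pathway) pair.

    `rows` is assumed to be an iterable of dicts with the keys
    clade_name, pathway, and score_category.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of scores, row count]
    for row in rows:
        key = (row["clade_name"], row["pathway"])
        totals[key][0] += SCORE_SIMPLIFIED[row["score_category"]]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Toy rows (hypothetical values, not taken from the database).
rows = [
    {"clade_name": "clade_A", "pathway": "arginine", "score_category": "present"},
    {"clade_name": "clade_A", "pathway": "arginine", "score_category": "partial"},
    {"clade_name": "clade_B", "pathway": "arginine", "score_category": "not_present"},
]

result = mean_completeness(rows)
print(result)
```

A mean near 1 suggests the pathway is present in most genomes of the clade, while a mean near 0 suggests it is largely absent.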
USAGE: Join with kbase_ke_pangenome.genome on genome_id and clade_name to analyze metabolic capabilities across species clades.
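The join described above can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions: the local table names `genome` and `gapmind_pathways`, the reduced column set, and all inserted rows are hypothetical stand-ins, and the real tables may be queried through a different engine.

```python
import sqlite3

# In-memory stand-ins for kbase_ke_pangenome.genome and gapmind_pathways.
# Only the columns needed for the join are modelled; all data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE genome (genome_id TEXT, clade_name TEXT);
    CREATE TABLE gapmind_pathways (
        genome_id TEXT,
        clade_name TEXT,
        pathway TEXT,
        score_simplified REAL
    );
    """
)
conn.executemany(
    "INSERT INTO genome VALUES (?, ?)",
    [
        ("GCF_000000001.1", "clade_A"),
        ("GCF_000000002.1", "clade_A"),
        ("GCA_000000003.1", "clade_B"),
    ],
)
conn.executemany(
    "INSERT INTO gapmind_pathways VALUES (?, ?, ?, ?)",
    [
        ("GCF_000000001.1", "clade_A", "arginine", 1.0),
        ("GCF_000000002.1", "clade_A", "arginine", 0.5),
        ("GCA_000000003.1", "clade_B", "arginine", 0.0),
    ],
)

# Join on both genome_id and clade_name, then summarize per clade and pathway.
query = """
    SELECT g.clade_name,
           p.pathway,
           AVG(p.score_simplified) AS mean_score,
           COUNT(*) AS n_genomes
    FROM gapmind_pathways AS p
    JOIN genome AS g
      ON p.genome_id = g.genome_id
     AND p.clade_name = g.clade_name
    GROUP BY g.clade_name, p.pathway
    ORDER BY g.clade_name, p.pathway
"""
summary = conn.execute(query).fetchall()
for clade_name, pathway, mean_score, n_genomes in summary:
    print(clade_name, pathway, mean_score, n_genomes)
```

Joining on both keys, rather than genome_id alone, guards against rows whose clade assignment differs between the two tables.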
REFERENCE: Price et al. 2020 "GapMind: Automated Annotation of Amino Acid Biosynthesis" mSystems 5:e00291-20
URI: https://w3id.org/kbase/gapmind_pathways
Name: gapmind_pathways
Classes
| Class | Description |
|-------|-------------|
| GapmindPathway | GapMind pathway completeness assessment for a genome |
Slots
| Slot | Description |
|------|-------------|
| clade_name | GTDB species clade identifier |
| genome_id | RefSeq/GenBank genome assembly accession (GCF_/GCA_) |
| metabolic_category | Broad metabolic category (amino acid, carbon, aromatic) |
| nHi | Count of high-confidence gene hits |
| nLo | Count of low-confidence gene hits |
| nMed | Count of medium-confidence gene hits |
| pathway | Metabolic pathway being assessed |
| score | Overall pathway completeness score |
| score_category | Categorical assessment of pathway completeness |
| score_simplified | Simplified numeric score for aggregation |
| sequence_scope | Whether assessing core or auxiliary pathway genes |
Enumerations
| Enumeration | Description |
|-------------|-------------|
| MetabolicCategory | Broad metabolic category for pathways |
| ScoreCategory | Categorical assessment of pathway completeness based on GapMind scoring algor... |
| SequenceScope | Sequence scope for pathway assessment - core or auxiliary genes |
Types
| Type | Description |
|------|-------------|
| Boolean | A binary (true or false) value |
| Curie | a compact URI |
| Date | a date (year, month and day) in an idealized calendar |
| DateOrDatetime | Either a date or a datetime |
| Datetime | The combination of a date and time |
| Decimal | A real number with arbitrary precision that conforms to the xsd:decimal speci... |
| Double | A real number that conforms to the xsd:double specification |
| Float | A real number that conforms to the xsd:float specification |
| Integer | An integer |
| Jsonpath | A string encoding a JSON Path |
| Jsonpointer | A string encoding a JSON Pointer |
| Ncname | Prefix part of CURIE |
| Nodeidentifier | A URI, CURIE or BNODE that represents a node in a model |
| Objectidentifier | A URI or CURIE that represents an object in the model |
| Sparqlpath | A string encoding a SPARQL Property Path |
| String | A character string |
| Time | A time object represents a (local) time of day, independent of any particular... |
| Uri | a complete URI |
| Uriorcurie | a URI or a CURIE |
Subsets