UniRef50 Sequence Clusters

UniRef50 protein sequences clustered at a 50% sequence identity threshold. UniRef50 is part of the UniProt Reference Clusters (UniRef), which provide highly compressed, clustered sets of sequences from UniProt.

DATABASE STATISTICS:
- Clusters and entities
- Cross-references to UniProt entries

CLUSTERING: UniRef50 clusters sequences at 50% identity and 80% coverage thresholds. Of the UniRef levels (UniRef100, UniRef90, UniRef50), it gives the greatest redundancy reduction while preserving functional diversity, which makes it useful for fast homology searches and reduced database sizes.

USAGE: Use for broad functional annotation, or when searching large sequence sets where redundancy reduction is important.
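The two clustering thresholds can be illustrated with a minimal Python sketch. The `Alignment` record, its field names, and the way identity and coverage are computed here are assumptions made for illustration; the actual UniRef pipeline may define both quantities differently.

```python
from dataclasses import dataclass

# Thresholds quoted in the description above.
MIN_IDENTITY = 0.50
MIN_COVERAGE = 0.80


@dataclass(frozen=True)
class Alignment:
    """Hypothetical summary of a pairwise alignment against a cluster seed."""
    identical_positions: int  # aligned positions with identical residues
    aligned_length: int       # number of aligned positions
    seed_length: int          # length of the seed sequence


def passes_thresholds(aln: Alignment) -> bool:
    """Return True if the alignment meets both the identity and coverage cutoffs."""
    if aln.aligned_length == 0 or aln.seed_length == 0:
        return False
    # Assumption: identity is measured over the aligned region,
    # coverage as the fraction of the seed that is aligned.
    identity = aln.identical_positions / aln.aligned_length
    coverage = aln.aligned_length / aln.seed_length
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE
```

Under these assumptions, a sequence joins a cluster only when both cutoffs hold; a highly identical but short alignment fails on coverage alone.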

URI: https://w3id.org/kbase/kbase_uniref50

Name: kbase_uniref50

Classes

| Class | Description |
| --- | --- |
| Cluster | UniRef50 protein sequence cluster |
| ClusterMember | Member of a UniRef50 cluster |
| CrossReference | External database cross-reference for entities |
| Entity | Protein entity in the UniRef50 database |

Slots

| Slot | Description |
| --- | --- |
| cluster_id | Unique cluster identifier (CDM UUID) |
| created | Creation timestamp |
| data_source | Source database |
| data_source_entity_id | Original UniRef50 identifier |
| description | Additional cluster description (optional) |
| entity_id | Member entity identifier |
| entity_type | Type of entity (protein) |
| is_representative | Whether this member is the cluster representative |
| is_seed | Whether this member was used as clustering seed |
| name | Cluster name with representative protein description |
| protocol_id | Clustering protocol identifier |
| score | Membership score/similarity to representative |
| updated | Last update timestamp |
| xref_type | Type of cross-reference (UniProt, NCBI, etc.) |
| xref_value | Cross-reference identifier value |
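As a rough illustration of how these slots might fit together, the Python sketch below models a cluster member and looks up the representative of a cluster. Which slots belong to which class is not stated on this page, so the grouping of `cluster_id`, `entity_id`, `is_representative`, `is_seed`, and `score` under `ClusterMember` is an assumption, and `representative_of` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterMember:
    """Assumed shape of a ClusterMember record, built from the slot names above."""
    cluster_id: str                 # cluster this member belongs to
    entity_id: str                  # member entity identifier
    is_representative: bool = False
    is_seed: bool = False
    score: Optional[float] = None   # similarity to the representative, if known


def representative_of(members: List[ClusterMember], cluster_id: str) -> Optional[ClusterMember]:
    """Return the first member flagged as representative for the given cluster, or None."""
    for member in members:
        if member.cluster_id == cluster_id and member.is_representative:
            return member
    return None
```

The identifiers passed to these records would be whatever the dataset uses; none are prescribed here.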

Enumerations

Enumeration Description

Types

| Type | Description |
| --- | --- |
| Boolean | A binary (true or false) value |
| Curie | a compact URI |
| Date | a date (year, month and day) in an idealized calendar |
| DateOrDatetime | Either a date or a datetime |
| Datetime | The combination of a date and time |
| Decimal | A real number with arbitrary precision that conforms to the xsd:decimal speci... |
| Double | A real number that conforms to the xsd:double specification |
| Float | A real number that conforms to the xsd:float specification |
| Integer | An integer |
| Jsonpath | A string encoding a JSON Path |
| Jsonpointer | A string encoding a JSON Pointer |
| Ncname | Prefix part of CURIE |
| Nodeidentifier | A URI, CURIE or BNODE that represents a node in a model |
| Objectidentifier | A URI or CURIE that represents an object in the model |
| Sparqlpath | A string encoding a SPARQL Property Path |
| String | A character string |
| Time | A time object represents a (local) time of day, independent of any particular... |
| Uri | a complete URI |
| Uriorcurie | a URI or a CURIE |
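The relationship between the Curie, Ncname, Uri, and Uriorcurie types can be sketched in Python as follows. The prefix map and the `expand` function are hypothetical and not part of this schema; a real deployment would take its prefixes from the schema's own prefix declarations.

```python
# Hypothetical prefix map, for illustration only.
PREFIXES = {"UniProtKB": "https://purl.uniprot.org/uniprot/"}


def expand(value: str, prefixes: dict = PREFIXES) -> str:
    """Expand a CURIE to a full URI; return full URIs unchanged."""
    if "://" in value:
        # Already a complete URI.
        return value
    # A CURIE is "<prefix>:<local part>", where the prefix is an NCName.
    prefix, sep, local = value.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {value!r}")
    return prefixes[prefix] + local
```

With this map, `expand("UniProtKB:P12345")` yields `https://purl.uniprot.org/uniprot/P12345`, while a value that is already a URI passes through untouched.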

Subsets

Subset Description