
Ontology operations using OAK

This notebook demonstrates p3 (PhenoPacketPython), a command-line utility that uses OAK (the Ontology Access Kit) to perform ontology operations.

First, let's list the available commands:

%%bash
p3 --help
Usage: p3 [OPTIONS] COMMAND [ARGS]...

  Run the CLI.

Options:
  -v, --verbose
  -q, --quiet TEXT
  --help            Show this message and exit.

Commands:
  compare            Rewire all OntologyClass objects to update labels.
  list-terms         Rewire all OntologyClass objects to update labels.
  migrate-obsoletes  Rewire all OntologyClass objects that have an...
  normalize-curies   Rewire all OntologyClass objects to update labels.
  update-labels      Rewire all OntologyClass objects to update labels.
  validate           Validate a collection of phenopackets.
  viz                Rewire all OntologyClass objects to update labels.

List all terms used

The list-terms command walks through an entire object and prints every OntologyClass object it contains.

Note: all commands accept input in YAML, JSON, or RDF.

%%bash
p3 list-terms -O yaml ../../../examples/Phenopacket-covid.yaml | head -20
- id: MONDO:0005015
  label: diabetes mellitus
- id: MONDO:0004994
  label: cardiomyopathy
- id: MONDO:0100096
  label: COVID-19
- id: LOINC:26474-7
  label: Lymphocytes [#/volume] in Blood
- id: NCIT:C67245
  label: Thousand Cells
- id: LOINC:26474-7
  label: Lymphocytes [#/volume] in Blood
- id: NCIT:C67245
  label: Thousand Cells
- id: NCIT:C80473
  label: Left Ventricular Assist Device
- id: NCIT:C722
  label: Oxygen
- id: NCIT:C67388
  label: Liter per Minute
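To make the idea of iterating through a whole object concrete, here is a minimal sketch in plain Python. It is not how p3 is implemented. It assumes that any nested mapping with both an `id` and a `label` key is an OntologyClass, which is only a heuristic, and the `packet` fragment and the function name `iter_ontology_terms` are made up for illustration.

```python
from typing import Any, Iterator


def iter_ontology_terms(node: Any) -> Iterator[dict]:
    """Yield every mapping that looks like an OntologyClass (has 'id' and 'label')."""
    if isinstance(node, dict):
        if "id" in node and "label" in node:
            yield {"id": node["id"], "label": node["label"]}
        # Keep descending: terms can be nested at any depth.
        for value in node.values():
            yield from iter_ontology_terms(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_ontology_terms(item)


# Toy fragment; the nesting is illustrative and not a complete phenopacket.
packet = {
    "diseases": [
        {"term": {"id": "MONDO:0005015", "label": "diabetes mellitus"}},
        {"term": {"id": "MONDO:0100096", "label": "COVID-19"}},
    ],
    "measurements": [
        {"assay": {"id": "LOINC:26474-7", "label": "Lymphocytes [#/volume] in Blood"}},
    ],
}

terms = list(iter_ontology_terms(packet))
for term in terms:
    print(f"- id: {term['id']}\n  label: {term['label']}")
```

Like the command output above, this sketch does not deduplicate: a term that is used twice is listed twice.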

Validating phenopackets using ontologies

LinkML provides a generic and expressive validation framework encompassing:

  • schema checks (e.g. required fields, types)
  • ontology checks (e.g. terms are valid, not obsolete, belong to static or dynamic subsets)
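One of the simplest identifier checks is whether an id has the CURIE shape `prefix:local_id` at all. The sketch below is a rough approximation using a regular expression. It is not the check p3 performs, the pattern is deliberately simpler than the full CURIE grammar, and the name `looks_like_curie` is hypothetical.

```python
import re

# Simplified pattern: a prefix starting with a letter or underscore, a colon,
# and a non-empty local part without whitespace. This is an approximation only.
CURIE_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*:\S+$")


def looks_like_curie(identifier: str) -> bool:
    """Return True if the identifier has the rough shape prefix:local_id."""
    return CURIE_PATTERN.match(identifier) is not None


for identifier in ["HP:0100637", "LOINC:26474-7", "56844-4"]:
    print(identifier, looks_like_curie(identifier))
```

A bare code such as `56844-4` has no prefix, so it fails this test. Note that a purely syntactic check says nothing about whether the term exists or is obsolete; that requires looking the term up in the ontology.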

For this example, we will use a fabricated phenopacket that contains a bad label and an obsolete class.

THE PHENOPACKET HERE IS DELIBERATELY ERRONEOUS

We can check it using the validate command:

%%bash
p3 validate ../../../tests/input/Phenopacket-migrate-example.yaml
## Validating ../../../tests/input/Phenopacket-migrate-example.yaml
## LinkML Validation Messages:
[ERROR] [0]
## Ontology Validation Messages: ('56844-4', 'CURIE')

ERROR:phenopackets.utilities.ontology_utilities:OntologyClass.id must be a CURIE: 56844-4

## Ontology Validation Messages: ('56844-4', 'CURIE')

ERROR:phenopackets.utilities.ontology_utilities:OntologyClass.id must be a CURIE: 56844-4

## Ontology Validation Messages: ('HP:0100637', 'obsolete')
Errors: 4

                                                  

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
Cell In[3], line 1
----> 1 get_ipython().run_cell_magic('bash', '', 'p3 validate ../../../tests/input/Phenopacket-migrate-example.yaml\n')

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py:2517, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
   2515 with self.builtin_trap:
   2516     args = (magic_arg_s, cell)
-> 2517     result = fn(*args, **kwargs)
   2519 # The code below prevents the output from being displayed
   2520 # when using magics with decorator @output_can_be_silenced
   2521 # when the last Python token in the expression is a ';'.
   2522 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/magics/script.py:154, in ScriptMagics._make_script_magic.<locals>.named_script_magic(line, cell)
    152 else:
    153     line = script
--> 154 return self.shebang(line, cell)

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/magics/script.py:314, in ScriptMagics.shebang(self, line, cell)
    309 if args.raise_error and p.returncode != 0:
    310     # If we get here and p.returncode is still None, we must have
    311     # killed it but not yet seen its return code. We don't wait for it,
    312     # in case it's stuck in uninterruptible sleep. -9 = SIGKILL
    313     rc = p.returncode or -9
--> 314     raise CalledProcessError(rc, cell)

CalledProcessError: Command 'b'p3 validate ../../../tests/input/Phenopacket-migrate-example.yaml\n'' returned non-zero exit status 1.

The command exits with a non-zero status because the data does not validate, which is why the %%bash cell raises CalledProcessError.

TODO: make this output friendlier
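If you would rather inspect the exit status than get a traceback, one option is to run the command from Python with `subprocess.run` and `check=False`. This is a sketch under that assumption: the helper name `run_and_report` is hypothetical, and a stand-in command that prints a message and exits with status 1 is used so the example runs without p3 installed.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_report(command: List[str]) -> Tuple[int, str]:
    """Run a command and return (exit status, combined stdout and stderr)."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit status
    )
    return completed.returncode, completed.stdout + completed.stderr


# Stand-in for a failing validation. In this notebook the real command would be:
#   ["p3", "validate", "../../../tests/input/Phenopacket-migrate-example.yaml"]
stand_in = [sys.executable, "-c", "import sys; print('Errors: 4'); sys.exit(1)"]

status, output = run_and_report(stand_in)
print("exit status:", status)
print(output, end="")
```

A status of 0 would mean the command succeeded; anything else can be handled explicitly, for example by printing the captured messages.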

Repairing phenopackets using ontologies

Ontologies are not static:

  • terms may become obsolete; in some cases a replacement will be indicated
  • labels may change

In the first case, the migrate-obsoletes command automatically replaces obsolete terms that have a replaced-by annotation.

In the second case, the update-labels command rewrites labels so that they match the current labels in the ontology.

For these commands, you can either specify an ontology with --ontology or let OAK work out how to access the ontology. The default access method is sqlite. There may be an initial lag.
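The two kinds of repair can be sketched with a toy in-memory lookup. Everything here is an assumption for illustration: the `ONTOLOGY` table, the `EX:` identifiers, and the function names are made up, and p3 itself gets this information from an ontology through OAK rather than from a dictionary.

```python
from typing import Dict, Optional

# Hypothetical stand-in for an ontology lookup: the current label and, for
# obsolete terms, an optional replaced-by target. All identifiers are made up.
ONTOLOGY: Dict[str, dict] = {
    "EX:0000001": {"label": "obsolete old term", "replaced_by": "EX:0000002"},
    "EX:0000002": {"label": "new term", "replaced_by": None},
    "EX:0000003": {"label": "obsolete term without replacement", "replaced_by": None},
}


def migrate_obsolete(term: dict) -> dict:
    """Return a copy whose id follows replaced-by when a replacement is recorded."""
    entry = ONTOLOGY.get(term["id"], {})
    replacement: Optional[str] = entry.get("replaced_by")
    if replacement is None:
        return dict(term)  # nothing to migrate, or no replacement indicated
    return {**term, "id": replacement}


def update_label(term: dict) -> dict:
    """Return a copy carrying the label currently recorded for the term's id."""
    entry = ONTOLOGY.get(term["id"])
    if entry is None:
        return dict(term)  # unknown term: leave it untouched
    return {**term, "label": entry["label"]}


stale = {"id": "EX:0000001", "label": "a label that was never right"}
print(update_label(stale))
print(update_label(migrate_obsolete(stale)))
```

In this sketch, updating the label alone leaves the obsolete id in place and only exposes the "obsolete ..." label, while migrating first and then updating the label gives a fully current term. A term that is obsolete but has no replacement cannot be repaired automatically.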

Let's update the labels using the update-labels command:

!p3 update-labels ../../../tests/input/Phenopacket-migrate-example.yaml -o /tmp/fixed.yaml

We can see the changes that were made here:

!diff -u ../../../tests/input/Phenopacket-migrate-example.yaml /tmp/fixed.yaml

Note that in the diff above, the fake label for HP:0100637 was replaced by the current label, "obsolete Neoplasia of the nose". This reveals another issue: the class has been made obsolete since the phenopacket was created!

We can also apply the migrate-obsoletes command, which uses term-replaced-by annotations:

!p3 migrate-obsoletes ../../../tests/input/Phenopacket-migrate-example.yaml -o /tmp/fixed.yaml

Let's see the changes that were made here:

!diff -u ../../../tests/input/Phenopacket-migrate-example.yaml /tmp/fixed.yaml
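If the `diff` utility is not available, a similar unified diff can be produced with Python's standard `difflib` module. This is a small sketch on made-up text, not on the actual example files; the `EX:` identifier, the labels, and the file names passed to `fromfile` and `tofile` are placeholders.

```python
import difflib

# Made-up before/after content standing in for the original and repaired files.
original = "id: EX:0000001\nlabel: outdated label\n"
fixed = "id: EX:0000001\nlabel: current label\n"

diff_lines = list(
    difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile="original.yaml",
        tofile="fixed.yaml",
    )
)
print("".join(diff_lines), end="")
```

To compare real files, read each one with `open(path).readlines()` and pass the two lists of lines in place of the split strings.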

Visualizing ontology terms

We can use OAK viz to visualize all terms used in a phenopacket (typically phenotypes or diseases, but they could be of any kind).

!p3 viz --ontology sqlite:obo:mondo ../examples/Phenopacket-covid.yaml -o output/covid.png

[Image: output/covid.png]