
Validation using LinkML Validate

We can use the generic LinkML validator to check data against the model and report any issues.

Note that this does not use any phenopackets-specific code.

Example of successful validation

We will pick one of the examples from phenopacket-tools and validate it.

%%bash
linkml-validate -s ../../../src/phenopackets/schema/phenopackets.yaml  ../../../examples/bethleham-myopathy.json
No issues found

Not surprising!

Example of validation failure

Let's take another example:

%%bash
linkml-validate -s ../../../src/phenopackets/schema/phenopackets.yaml  ../../../examples/retinoblastoma.json
[ERROR] [../../../examples/retinoblastoma.json/0] '62280955' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/0/variantInterpretation/variationDescriptor/variation/copyNumber/allele/sequenceLocation/sequenceInterval/endNumber/value
[ERROR] [../../../examples/retinoblastoma.json/0] '26555377' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/0/variantInterpretation/variationDescriptor/variation/copyNumber/allele/sequenceLocation/sequenceInterval/startNumber/value
[ERROR] [../../../examples/retinoblastoma.json/0] '1' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/0/variantInterpretation/variationDescriptor/variation/copyNumber/number/value
[ERROR] [../../../examples/retinoblastoma.json/0] '48941648' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/1/variantInterpretation/variationDescriptor/variation/allele/sequenceLocation/sequenceInterval/endNumber/value
[ERROR] [../../../examples/retinoblastoma.json/0] '48941647' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/1/variantInterpretation/variationDescriptor/variation/allele/sequenceLocation/sequenceInterval/startNumber/value
[ERROR] [../../../examples/retinoblastoma.json/0] '48367512' is not of type 'integer' in /interpretations/0/diagnosis/genomicInterpretations/1/variantInterpretation/variationDescriptor/vcfRecord/pos

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
Cell In[4], line 1
----> 1 get_ipython().run_cell_magic('bash', '', 'linkml-validate -s ../../../src/phenopackets/schema/phenopackets.yaml  ../../../examples/retinoblastoma.json\n')

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py:2517, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
   2515 with self.builtin_trap:
   2516     args = (magic_arg_s, cell)
-> 2517     result = fn(*args, **kwargs)
   2519 # The code below prevents the output from being displayed
   2520 # when using magics with decorator @output_can_be_silenced
   2521 # when the last Python token in the expression is a ';'.
   2522 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/magics/script.py:154, in ScriptMagics._make_script_magic.<locals>.named_script_magic(line, cell)
    152 else:
    153     line = script
--> 154 return self.shebang(line, cell)

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/IPython/core/magics/script.py:314, in ScriptMagics.shebang(self, line, cell)
    309 if args.raise_error and p.returncode != 0:
    310     # If we get here and p.returncode is still None, we must have
    311     # killed it but not yet seen its return code. We don't wait for it,
    312     # in case it's stuck in uninterruptible sleep. -9 = SIGKILL
    313     rc = p.returncode or -9
--> 314     raise CalledProcessError(rc, cell)

CalledProcessError: Command 'b'linkml-validate -s ../../../src/phenopackets/schema/phenopackets.yaml  ../../../examples/retinoblastoma.json\n'' returned non-zero exit status 1.
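
The traceback above comes from the `%%bash` cell magic rather than from the validator: the command exited with status 1, and the magic turns any non-zero exit status into a `CalledProcessError`. As a minimal sketch, assuming you would rather capture the exit status and the messages yourself, the command can be run from Python with `subprocess`. The helper name `run_command` is hypothetical, and the demo uses a stand-in command so that the sketch runs without LinkML installed.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return (exit_code, combined_output) without raising."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        check=False,  # do not raise CalledProcessError on a non-zero exit
    )
    return result.returncode, result.stdout


# Stand-in command (an assumption for illustration): prints one line, exits with 1.
demo = [sys.executable, "-c", "import sys; print('[ERROR] demo'); sys.exit(1)"]
code, output = run_command(demo)
print(f"exit status {code}")
print(output, end="")
```

To apply this to the validation above, pass the same arguments used in the cell as a list: `["linkml-validate", "-s", "../../../src/phenopackets/schema/phenopackets.yaml", "../../../examples/retinoblastoma.json"]`.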

In fact, this example uses strings where the schema expects integers: each value reported above, such as '62280955', is quoted in the JSON.
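
To locate values of this kind before running the validator, one option is to walk the parsed JSON and report every string made up only of digits. The following is a rough sketch, not part of LinkML or phenopacket-tools: the function name `find_integer_strings` is hypothetical, and the `sample` fragment is made up for illustration, reusing only the `vcfRecord/pos` value from the error output.

```python
import json


def find_integer_strings(node, path=""):
    """Yield (path, value) for every string consisting only of ASCII digits."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_integer_strings(child, f"{path}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_integer_strings(child, f"{path}/{index}")
    elif isinstance(node, str) and node.isascii() and node.isdigit():
        yield path, node


# Made-up fragment (an assumption for illustration), not the real example file.
sample = json.loads(
    '{"vcfRecord": {"pos": "48367512", "ref": "G"},'
    ' "copyNumber": {"number": {"value": 1}}}'
)

for path, value in find_integer_strings(sample):
    print(f"{path}: {value!r} is a string of digits")
```

This is only a heuristic: it also flags strings that are legitimately numeric-looking, such as identifiers, so the schema remains the authority on which fields must be integers.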

TODO: fix this upstream.