Using Pydantic with Phenopackets

Pydantic is a popular, fast Python library for defining and validating object models.

LinkML can generate Pydantic models automatically from a schema. This repo contains pre-generated Pydantic models for Phenopackets, alongside an alternative Python representation that uses classic dataclasses.

import json

from phenopackets.pydantic.model import *

Constructing a Phenopacket

Let's make a very basic phenopacket with a single phenotypic feature.

We'll create the feature first:

pf1 = PhenotypicFeature(
    type=OntologyClass(id="HP:0030084", label="Clinodactyly"),
    onset=TimeElement(age=Age(iso8601duration="P5Y")),
)

Then the packet:

pkt = Phenopacket(
    id="PP1",
    phenotypicFeatures=[pf1],
    metaData=MetaData(
        created="2021-01-01",
    )
)

Exporting to JSON

Pydantic gives us a simple way to export to JSON:

print(pkt.model_dump_json(indent=2))
{
  "biosamples": [],
  "diseases": [],
  "files": [],
  "id": "PP1",
  "interpretations": [],
  "measurements": [],
  "medicalActions": [],
  "metaData": {
    "created": "2021-01-01",
    "createdBy": null,
    "externalReferences": [],
    "phenopacketSchemaVersion": null,
    "resources": [],
    "submittedBy": null,
    "updates": []
  },
  "phenotypicFeatures": [
    {
      "description": null,
      "evidence": [],
      "excluded": null,
      "modifiers": [],
      "onset": {
        "age": {
          "iso8601duration": "P5Y"
        },
        "ageRange": null,
        "gestationalAge": null,
        "interval": null,
        "ontologyClass": null,
        "timestamp": null
      },
      "resolution": null,
      "severity": null,
      "type": {
        "id": "HP:0030084",
        "label": "Clinodactyly"
      }
    }
  ],
  "subject": null
}
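The export above lists every unset field as null. If a more compact document is preferred, Pydantic v2's `model_dump_json` accepts an `exclude_none` flag. The sketch below uses two small stand-in classes (hypothetical, far simpler than the generated Phenopackets models) so that it runs on its own; the same flag should apply to `pkt`, assuming the generated models are ordinary Pydantic v2 models.

```python
import json
from typing import List, Optional

from pydantic import BaseModel


# Hypothetical stand-ins, much smaller than the generated Phenopackets classes.
class OntologyClass(BaseModel):
    id: str
    label: Optional[str] = None


class PhenotypicFeature(BaseModel):
    type: OntologyClass
    description: Optional[str] = None
    excluded: Optional[bool] = None
    modifiers: List[OntologyClass] = []


feature = PhenotypicFeature(
    type=OntologyClass(id="HP:0030084", label="Clinodactyly"),
)

# exclude_none drops fields whose value is None; empty lists are kept.
compact = feature.model_dump_json(exclude_none=True)
exported = json.loads(compact)
print(sorted(exported))
```

Note that only null values are dropped; empty lists such as `modifiers` still appear in the output.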

Importing from JSON

Everything works in reverse too:

import json
json_str = pkt.model_dump_json(indent=2)
pkt2 = Phenopacket(**json.loads(json_str))
print(pkt2.phenotypicFeatures[0].type.label)
Clinodactyly
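As an alternative to going through `json.loads`, Pydantic v2 also offers a `model_validate_json` class method that parses and validates a JSON string in one step. A minimal round-trip sketch, again with hypothetical stand-in classes (`Packet` is not the real `Phenopacket` model):

```python
from typing import List, Optional

from pydantic import BaseModel


class OntologyClass(BaseModel):
    id: str
    label: Optional[str] = None


# Hypothetical stand-in for Phenopacket, reduced to two fields.
class Packet(BaseModel):
    id: str
    features: List[OntologyClass] = []


original = Packet(
    id="PP1",
    features=[OntologyClass(id="HP:0030084", label="Clinodactyly")],
)
json_str = original.model_dump_json(indent=2)

# Parse and validate the JSON text in a single call.
restored = Packet.model_validate_json(json_str)
print(restored.features[0].label)
```

Pydantic models compare equal when their field values match, so `restored == original` is a quick way to check that a round trip is lossless.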

Validation

One advantage of Pydantic is that it validates data at the time of object creation.

(It also provides many type hints in your IDE, which can be very helpful.)

Remember, all of this comes from the LinkML schema; we didn't manually author any Pydantic code.

Let's try to make a feature without an HPO ID:

pf1 = PhenotypicFeature(
    type=OntologyClass(label="Clinodactyly"),
    onset=TimeElement(age=Age(iso8601duration="P5Y")
                      ))

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In[25], line 2
      1 pf1 = PhenotypicFeature(
----> 2     type=OntologyClass(label="Clinodactyly"),
      3     onset=TimeElement(age=Age(iso8601duration="P5Y")
      4                       ))

File ~/Library/Caches/pypoetry/virtualenvs/phenopackets-JmUWBwH2-py3.9/lib/python3.9/site-packages/pydantic/main.py:176, in BaseModel.__init__(self, **data)
    174 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    175 __tracebackhide__ = True
--> 176 self.__pydantic_validator__.validate_python(data, self_instance=self)

ValidationError: 1 validation error for OntologyClass
id
  Field required [type=missing, input_value={'label': 'Clinodactyly'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing

Perfect! This is exactly what we want: the missing id is reported as soon as we try to create the object.
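Outside a notebook, it is often more convenient to catch the error than to read a traceback. The sketch below assumes Pydantic v2 and uses a hypothetical stand-in for `OntologyClass`; `try_build` is an illustrative helper, not part of the Phenopackets package. It collects the field path and error type of each problem from `ValidationError.errors()`.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


# Hypothetical stand-in with the same required/optional split as in the error above.
class OntologyClass(BaseModel):
    id: str
    label: Optional[str] = None


def try_build(**fields):
    """Return (model, problems); problems holds (field path, error type) pairs."""
    try:
        return OntologyClass(**fields), []
    except ValidationError as err:
        return None, [(e["loc"], e["type"]) for e in err.errors()]


# The id is missing, so no model is built and one problem is reported.
term, problems = try_build(label="Clinodactyly")
print(problems)
```

Each entry pairs the location of the offending field with an error type such as `missing`, which matches the `type=missing` shown in the traceback.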