Csolink Model
A high-level data model of computational entities (serviceunits, errors, observable features, pathways, individuals, desired states, etc.) and their associations.
Csolink Model is designed as a way of standardizing types and relational structures in knowledge graphs (KGs), where the KG may be either a property graph or an RDF triple store.
The schema is expressed in YAML, which is translated to:
- Individual pages for each class in the model, e.g. https://w3id.org/csolink/vocab/Serviceunit
- An OWL ontology
- Python dataclasses, also available on PyPI
- ShEx (RDF shape constraints)
- GraphQL
- Protobuf
- JSON Schema
- Prefix map (a simple mapping from prefix to IRI expansion)
- Java classes
Datamodel
The schema assumes a property graph, where nodes represent individual entities and edges represent relationships between entities. Csolink Model provides a schema for representing both nodes and edges.
The model itself can be divided into a few parts:
- Entities (subjects and objects)
- Predicates (relationships between core concepts)
- Associations (statements including evidence and provenance)
- Entity Slots (node properties)
- Edge Slots (edge properties)
Entities
An entity corresponds to a database entity or a concept, represented as a node in a property graph.
All typed entities are subclasses of NamedThing.
Each entity has:
- its own unique stable URI
- mappings to other ontologies (SIO, SO, etc.)
- a list of valid ID prefixes
These entity types are higher level terms that can be used to categorize nodes in a KG.
For more detailed typing, one can use specific terms from an ontology.
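To make the description above concrete, an entity node could be sketched in Python roughly as follows. This is a minimal illustration, not the generated dataclasses published on PyPI: the field names `id`, `name` and `category`, the `ID_PREFIXES` attribute, the validation logic, and the example identifier `EX:0001` are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import ClassVar, List, Optional, Tuple


@dataclass
class NamedThing:
    """Hypothetical root class for typed entities (illustrative only)."""

    # Assumed attribute: valid ID prefixes for this class; empty means "any".
    ID_PREFIXES: ClassVar[Tuple[str, ...]] = ()

    id: str                                   # a CURIE such as "EX:0001"
    name: Optional[str] = None
    category: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        prefix, sep, local_id = self.id.partition(":")
        if not sep or not local_id:
            raise ValueError(f"{self.id!r} is not a CURIE")
        if self.ID_PREFIXES and prefix not in self.ID_PREFIXES:
            raise ValueError(
                f"prefix {prefix!r} is not valid for {type(self).__name__}"
            )
        # Default the high-level category to the class name.
        if not self.category:
            self.category = [type(self).__name__]


@dataclass
class Serviceunit(NamedThing):
    """Entity type named in the docs; the prefix list here is made up."""

    ID_PREFIXES: ClassVar[Tuple[str, ...]] = ("EX",)


node = Serviceunit(id="EX:0001", name="example service unit")
```

Here the class name doubles as the high-level category, while a more specific ontology term could be appended to `category` when finer typing is needed.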
Associations
An association is a typed relationship between two entities, usually supported by evidence and provenance. In a property graph, an association is represented as an edge (relationship) between two nodes.
All edges are subclasses of Association.
An association connects a subject node and an object node via a relation property, which defines the nature of the association.
Certain associations can have additional properties such as provided_by, has_evidence, and publications.
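The shape of an association can be sketched as a small Python dataclass. The property names `subject`, `relation`, `object`, `provided_by`, `has_evidence` and `publications` come from the description above; the types, the defaults, the `as_triple` helper, and every example value (`EX:0001`, `EX:depends_on`, and so on) are assumptions for illustration and do not reflect the generated classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Association:
    """Illustrative edge: subject -> relation -> object, plus edge properties."""

    subject: str                      # CURIE of the subject node
    relation: str                     # CURIE defining the nature of the edge
    object: str                       # CURIE of the object node
    provided_by: Optional[str] = None
    has_evidence: List[str] = field(default_factory=list)
    publications: List[str] = field(default_factory=list)

    def as_triple(self) -> Tuple[str, str, str]:
        """Hypothetical helper: the core statement without edge properties."""
        return (self.subject, self.relation, self.object)


edge = Association(
    subject="EX:0001",
    relation="EX:depends_on",         # made-up relation for the example
    object="EX:0002",
    publications=["EX:pub1"],
)
```

Keeping the core triple separate from the optional properties mirrors the split between the statement itself and its evidence and provenance.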
Slots
The term "slots" refers collectively to both node and edge properties.
The model defines two types of slots:
- node property - all node properties are subclasses of node property
- association slot - all edge properties are subclasses of association slot
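One way to picture the two slot families is as a parent lookup that is followed until a root is reached. The sketch below assumes a hand-written `SLOT_PARENTS` table and a `slot_kind` helper, both hypothetical; the placement of `name` under node property is also an assumption, and the real hierarchy lives in the YAML schema.

```python
from typing import Dict, Optional

# Hypothetical parent table; None marks a root slot.
SLOT_PARENTS: Dict[str, Optional[str]] = {
    "node property": None,
    "association slot": None,
    "name": "node property",              # assumed placement
    "has_evidence": "association slot",
    "publications": "association slot",
}


def slot_kind(slot: str) -> str:
    """Follow parent links upward and return the root slot."""
    seen = set()
    current = slot
    while SLOT_PARENTS[current] is not None:
        if current in seen:
            raise ValueError(f"cycle in slot hierarchy at {current!r}")
        seen.add(current)
        current = SLOT_PARENTS[current]
    return current


kind = slot_kind("publications")
```

With this table, `slot_kind("publications")` resolves to `"association slot"`, so the slot would be treated as an edge property.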
Browse the Csolink Model to explore all defined entities, associations, and slots.
Identifiers
See Csolink Model JSON-LD context for a list of CURIE prefix mappings.
These include prefix expansions such as:
"NCIT": "http://purl.obolibrary.org/obo/NCIT_",
Note: We do not curate these in Csolink Model. Rather, we take them from upstream sources via PrefixCommons. Where upstream sources conflict, we specify a priority order among them. See the default_curi_maps tag at the top of the csolink-model.yaml.
We also specify a small set of top-level prefix overrides via the prefixes tag at the top of the YAML.
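The combination of prioritized upstream maps and top-level overrides can be sketched as below. This is only an illustration of the idea and may differ from how the actual tooling resolves prefixes: the helper names `merge_prefix_maps` and `expand_curie`, the second upstream map with its `example.org` IRI, and the local ID `C0000` are all made up for the example.

```python
from typing import Dict, List


def merge_prefix_maps(
    sources: List[Dict[str, str]], overrides: Dict[str, str]
) -> Dict[str, str]:
    """Merge prefix maps; earlier sources win, overrides win over all."""
    merged: Dict[str, str] = {}
    for source in sources:                 # highest priority first
        for prefix, iri in source.items():
            merged.setdefault(prefix, iri)
    merged.update(overrides)
    return merged


def expand_curie(curie: str, prefix_map: Dict[str, str]) -> str:
    """Expand 'PREFIX:local_id' to a full IRI using the prefix map."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        raise KeyError(f"no expansion known for {curie!r}")
    return prefix_map[prefix] + local_id


upstream_a = {"NCIT": "http://purl.obolibrary.org/obo/NCIT_"}
upstream_b = {"NCIT": "http://example.org/ncit/"}   # conflicting, lower priority

prefix_map = merge_prefix_maps([upstream_a, upstream_b], overrides={})
iri = expand_curie("NCIT:C0000", prefix_map)
```

Because `upstream_a` is listed first, its expansion for `NCIT` is kept and the conflicting entry from `upstream_b` is ignored.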
Csolink Model representation
Csolink Model aims to represent knowledge in graph form, regardless of the graph representation used.
The following recommendations apply when using Csolink Model with each style of representation:
- Neo4j: see Mapping to Neo4j
- RDF: see Mapping to RDF