Query Basics¶
Documentation for the API is available via its OpenAPI specification at /api/spec. Here we cover only the query endpoint, which is the most commonly used endpoint since it handles all searches. The /query endpoint accepts a JSON body in a POST request; this is how the user passes filters and other search-related parameters. The important fields and concepts used in that body are defined below.
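To make the shape of that request concrete, here is a minimal sketch that builds (but does not send) a raw POST to the query endpoint using only the Python standard library. The base URL is the demo server used elsewhere in this guide. The helper name build_query_request, the placeholder token, joining the base URL with /query, and passing the token in an Authorization header are all assumptions made for illustration; the OpenAPI specification is the authoritative reference.

```python
import json
from urllib.request import Request

# Demo API base URL (the same server the adapter examples connect to)
API_URL = 'https://pori-demo.bcgsc.ca/graphkb-api/api'


def build_query_request(body, token='<token from login>'):
    """Build a POST request for the query endpoint without sending it.

    Hypothetical helper: not part of the GraphKB adapter.
    """
    return Request(
        f'{API_URL}/query',
        # The JSON body carries the target, filters and other search parameters
        data=json.dumps(body).encode('utf-8'),
        headers={
            'Content-Type': 'application/json',
            # Assumption: how the token is passed may differ on a real server
            'Authorization': token,
        },
        method='POST',
    )


request = build_query_request({'target': 'Publication'})
print(request.get_method(), request.full_url)
print(request.data.decode('utf-8'))
```

In practice the adapter handles these details, so building the request by hand is mainly useful for understanding what is sent over the wire.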
All the running examples below use the Python GraphKB adapter. They assume the user has already initialized the connector and logged in as shown below (using the demo database and credentials).
First install the adapter:
!pip install graphkb
Then set up the connector:
from graphkb import GraphKBConnection
GKB_API_URL = 'https://pori-demo.bcgsc.ca/graphkb-api/api'
GKB_USER = 'colab_demo'
GKB_PASSWORD = 'colab_demo'
graphkb_conn = GraphKBConnection(GKB_API_URL)
graphkb_conn.login(GKB_USER, GKB_PASSWORD)
Important Fields and Concepts¶
Query Target¶
The target is the class/table that the user wishes to query. If it is at the top level of the request body then it is also the type of record that will be returned. For example, the following query returns a list of publications in GraphKB, limited to the first 3 for the purposes of this demo:
graphkb_conn.query({
    'target': 'Publication'
}, paginate=False, limit=3)
Filters¶
Any field that is accessible with the current user's permission level can be queried via this endpoint. Most commonly, users want to filter on things like a record's name or source ID (the ID in the external database it was imported from). Continuing our example from above, let's search for publications with the word "cancer" in their name.
Note: The current full-text index only searches on words and word prefixes. Future iterations will support a full Lucene index.
graphkb_conn.query({
    'target': 'Publication',
    'filters': {'name': 'cancer', 'operator': 'CONTAINSTEXT'}
}, paginate=False, limit=3)
You can also filter on multiple conditions. To do this, nest the filters in an object that has a single AND or OR property holding a list of regular conditions. For example, to find diseases with the name "cancer" or "carcinoma":
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'OR': [
            {'name': 'cancer'},
            {'name': 'carcinoma'},
        ]
    },
})
The operator can be omitted here since = is the default operator. We can also combine conditions with AND:
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'AND': [
            {'name': 'cancer', 'operator': 'CONTAINSTEXT'},
            {'name': 'pancreatic', 'operator': 'CONTAINSTEXT'},
        ]
    },
}, paginate=False, limit=3)
The above will look for diseases that have both 'cancer' and 'pancreatic' in the name.
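Since filters are plain dictionaries, repeated AND/OR structures can be generated rather than typed out. The sketch below rebuilds the two multi-condition bodies shown above from small helper functions. The helpers contains_text, all_of and any_of are hypothetical names introduced here for illustration; they are not part of the GraphKB adapter.

```python
def contains_text(field, value):
    """Condition using the CONTAINSTEXT operator on a single field."""
    return {field: value, 'operator': 'CONTAINSTEXT'}


def all_of(*conditions):
    """Combine conditions so that every one of them must match."""
    return {'AND': list(conditions)}


def any_of(*conditions):
    """Combine conditions so that at least one of them must match."""
    return {'OR': list(conditions)}


# Diseases named exactly "cancer" or "carcinoma" (default = operator)
or_body = {
    'target': 'Disease',
    'filters': any_of({'name': 'cancer'}, {'name': 'carcinoma'}),
}

# Diseases with both "cancer" and "pancreatic" in the name
and_body = {
    'target': 'Disease',
    'filters': all_of(
        contains_text('name', 'cancer'),
        contains_text('name', 'pancreatic'),
    ),
}

print(or_body)
print(and_body)
```

Either body can then be passed to graphkb_conn.query in the same way as the hand-written dictionaries.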
Subquery Filters¶
Sometimes we would like to filter records on a linked field (essentially a foreign key). We can do this with subquery filters.
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'source': {'target': 'Source', 'filters': {'name': 'disease ontology'}}
    },
}, paginate=False, limit=3)
Above we are only returning disease records that have been imported from the disease ontology.
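Note that the subquery has the same target/filters shape as a top-level query body. One way to take advantage of this, sketched below, is to define the subquery on its own so it can be checked separately and then embed it under the linked field. The variable names and the with_linked_filter helper are hypothetical and not part of the GraphKB adapter.

```python
# The subquery is shaped like any other query body
source_query = {
    'target': 'Source',
    'filters': {'name': 'disease ontology'},
}


def with_linked_filter(target, field, subquery):
    """Build a query on `target` whose linked `field` must match `subquery`."""
    return {'target': target, 'filters': {field: subquery}}


disease_query = with_linked_filter('Disease', 'source', source_query)
print(disease_query)
```

Passing source_query on its own to graphkb_conn.query would list the matching Source records, which can help confirm the subquery selects what you expect before using it as a filter.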
Return Properties (Fields)¶
The return properties option (returnProperties) allows the user to specify which fields they would like returned. This can mean returning a subset of fields for a large query to improve the speed of the client digesting the data, or it can be used to de-nest fields. By default the query returns only the immediate properties of the class being queried, which means that linked fields are listed as their record ID. De-nesting these fields allows you to return them without additional queries. First, a query with the default return properties:
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'AND': [
            {'source': {'target': 'Source', 'filters': {'name': 'disease ontology'}}},
            {'name': 'cancer'}
        ],
    },
})
We probably are not interested in all of these fields so let's pick a few to return.
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'AND': [
            {'source': {'target': 'Source', 'filters': {'name': 'disease ontology'}}},
            {'name': 'cancer'}
        ],
    },
    'returnProperties': ['name', 'source', 'sourceId', 'alias', 'deprecated']
})
The new return looks much more reasonable. However, the source field is still returned only as a record ID, so with the current query we would have to fetch that record separately to see details about it. This can instead be done in a single query with nested return properties. Simply delimit properties and sub-properties with a period.
graphkb_conn.query({
    'target': 'Disease',
    'filters': {
        'AND': [
            {'source': {'target': 'Source', 'filters': {'name': 'disease ontology'}}},
            {'name': 'cancer'}
        ],
    },
    'returnProperties': ['name', 'source.name', 'sourceId', 'alias', 'deprecated']
})
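When working with the results, it can be convenient to read values using the same period-delimited paths that were requested. The sketch below assumes that a nested return property such as source.name comes back as a nested dictionary under source; that response shape is an assumption, so inspect a real result before relying on it. The get_property helper and the sample record are hypothetical and for illustration only.

```python
def get_property(record, path, default=None):
    """Follow a period-delimited path through nested dictionaries.

    Returns `default` if any step of the path is missing.
    """
    current = record
    for key in path.split('.'):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up record for illustration only (not real GraphKB output)
sample_record = {
    'name': 'cancer',
    'sourceId': 'example-id',
    'source': {'name': 'disease ontology'},
}

print(get_property(sample_record, 'name'))
print(get_property(sample_record, 'source.name'))
print(get_property(sample_record, 'source.version', default='not returned'))
```

The same list of paths passed as returnProperties can then drive both the request and the code that reads the response.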