
Loading Data

We provide a number of modules to automate loading external resources into GraphKB. You can pick and choose which resources to load, or use the snakemake pipeline to load them all (see instructions here). By default, the pipeline downloads the content and loads it into your newly created GraphKB instance.

The most popular resources with pre-built GraphKB loaders are listed below. For an exhaustive list of available loaders, see the loader project itself.

Custom Content

If you have your own instance of GraphKB and would like to transform an existing knowledge base so it can be loaded into GraphKB, look at the other knowledge base loaders for examples. The code base includes some commonly used helper modules and functions to make this process simpler. Documentation for each loader is kept alongside that loader, in its README.md.

src/
`--loader/
  |-- index.js
  `-- README.md

If you have any issues or questions, please open an issue in the loaders repo.

Loading Content

For convenience, a snakemake workflow is included that runs all available loaders in an optimal order to initialize the content of a new GraphKB instance. The workflow is run with snakemake, a Python package. To set up snakemake in a virtual environment, run the following:

python3 -m venv venv
source venv/bin/activate
pip install -U pip setuptools wheel
pip install snakemake

The workflow can then be run as follows (`-j 1` uses a single core; raise the value to suit your server):

snakemake -j 1

[Figure: default workflow]

You will want to pass snakemake the specific GraphKB instance you are working with, as well as the credentials of the user that will be uploading. If you have followed the docker install demo instructions, this might look something like this:

snakemake -j 1 \
  --config gkb_user='graphkb_importer' \
  gkb_pass='secret' \
  gkb_url='http://localhost:8080/api'

The COSMIC and DrugBank options require licensing and are therefore not run by default. If you have a license to use them, you can include one or both by providing the email and password as config parameters:

snakemake -j 1 \
  --config drugbank_email="YOUR EMAIL" \
  drugbank_password="YOUR PASSWORD" \
  cosmic_email="YOUR EMAIL" \
  cosmic_password="YOUR PASSWORD"

[Figure: full workflow]
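The two commands above can be combined, since `--config` accepts several `key=value` pairs in one call. The following is a minimal sketch of a wrapper that does this, not part of GraphKB itself: the function name `run_loaders` and every environment variable name (`GKB_USER`, `DRUGBANK_EMAIL`, `PRINT_ONLY`, and so on) are assumptions made for this example. Only the config keys are taken from the commands shown above. Reading credentials from the environment keeps them out of scripts, and the licensed resources are included only when both of their credentials are set.

```shell
# Sketch of a wrapper around the snakemake commands shown above.
# run_loaders and all environment variable names are hypothetical.
run_loaders() {
    # The core GraphKB settings are required
    if [ -z "${GKB_USER:-}" ] || [ -z "${GKB_PASS:-}" ] || [ -z "${GKB_URL:-}" ]; then
        echo "GKB_USER, GKB_PASS and GKB_URL must be set" >&2
        return 1
    fi

    # Keep any extra snakemake flags passed to the function,
    # then append the config pairs after them
    set -- "$@" --config \
        "gkb_user=${GKB_USER}" \
        "gkb_pass=${GKB_PASS}" \
        "gkb_url=${GKB_URL}"

    # Licensed resources are added only when both credentials are present
    if [ -n "${DRUGBANK_EMAIL:-}" ] && [ -n "${DRUGBANK_PASSWORD:-}" ]; then
        set -- "$@" "drugbank_email=${DRUGBANK_EMAIL}" "drugbank_password=${DRUGBANK_PASSWORD}"
    fi
    if [ -n "${COSMIC_EMAIL:-}" ] && [ -n "${COSMIC_PASSWORD:-}" ]; then
        set -- "$@" "cosmic_email=${COSMIC_EMAIL}" "cosmic_password=${COSMIC_PASSWORD}"
    fi

    if [ -n "${PRINT_ONLY:-}" ]; then
        # Show the arguments, one per line, without running anything
        printf '%s\n' snakemake -j 1 "$@"
    else
        snakemake -j 1 "$@"
    fi
}
```

With the three `GKB_*` variables exported, `PRINT_ONLY=1 run_loaders -n` prints the arguments that would be used (credentials included) without starting the workflow. Extra flags are passed through to snakemake, so `run_loaders -n` performs a snakemake dry run, which is a cheap way to check which loaders would run before downloading anything.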
