ChromeQC

Summarize library quality of 10x Genomics Chromium linked reads

This tool provides a quick report on the quality of a 10x Genomics Chromium linked-read library. The report summarizes the sizes of the molecules, the number of reads per molecule, the number of molecules per barcode, and the amount of DNA per barcode. The goal is a tool that is as fast as FastQC and reports the information shown on the Summary page of the 10x Genomics Loupe software. ChromeQC is developed in Python 3, R, AWK, RMarkdown, and Flexdashboard, and uses BWA-MEM for read alignment.
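To make the four reported quantities concrete, here is a minimal sketch of how they could be computed once molecules have been inferred from the alignments. The record layout `(barcode, start, end, read_count)` and the function name `summarize_molecules` are hypothetical and are not taken from the ChromeQC source; the real report is generated with R and RMarkdown.

```python
from collections import defaultdict
from statistics import median


def summarize_molecules(molecules):
    """Summarize a list of hypothetical (barcode, start, end, read_count) records."""
    sizes = [end - start for _, start, end, _ in molecules]
    reads = [read_count for _, _, _, read_count in molecules]

    # Accumulate the molecule count and total molecule length for each barcode.
    molecules_per_barcode = defaultdict(int)
    dna_per_barcode = defaultdict(int)
    for barcode, start, end, _ in molecules:
        molecules_per_barcode[barcode] += 1
        dna_per_barcode[barcode] += end - start

    return {
        "median_molecule_size": median(sizes),
        "median_reads_per_molecule": median(reads),
        "median_molecules_per_barcode": median(molecules_per_barcode.values()),
        "median_dna_per_barcode": median(dna_per_barcode.values()),
    }
```

Medians are used here only for illustration; the actual report may present distributions or other summary statistics.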

Usage

-w --whitelist     : default='whitelist_barcodes', type=str
-k --subsample_size: default=4000                , type=int
-i --in            : default='-'                 , type=str
-o --out           : default='stdout'            , type=str
-s --seed          : default=1334                , type=int
-m --max_read_pairs: default=-1                  , type=int  , note: -1 means all read pairs
-p --stats_out_path: default='.'                 , type=str  , note: the directory must already exist
-v --verbose       : default=False               , no value  , note: a flag; true if supplied, false otherwise
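The option table above maps directly onto an `argparse` definition. The following is a sketch built only from the flags, defaults, and types listed here; it is not copied from `random_sampling_from_whitelist.py`. The function name `build_parser`, the `dest="in_path"` choice, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the documented options (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented command-line options.")
    parser.add_argument("-w", "--whitelist", type=str, default="whitelist_barcodes")
    parser.add_argument("-k", "--subsample_size", type=int, default=4000)
    # "in" is a Python keyword, so args.in would be a syntax error;
    # the value is stored under a different attribute name (an assumption).
    parser.add_argument("-i", "--in", dest="in_path", type=str, default="-")
    parser.add_argument("-o", "--out", type=str, default="stdout")
    parser.add_argument("-s", "--seed", type=int, default=1334)
    parser.add_argument("-m", "--max_read_pairs", type=int, default=-1,
                        help="-1 means all read pairs")
    parser.add_argument("-p", "--stats_out_path", type=str, default=".",
                        help="directory must already exist")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser
```

Calling `build_parser().parse_args(["-w", "whitelist.txt.gz", "-v"])` leaves every other option at its documented default.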

Examples

python3 random_sampling_from_whitelist.py -w ../data/whitelist_barcodes.txt.gz -i ../data/read-RA_si-GAGTTAGT_lane-001-chunk-0002.fastq.gz -v

The pipeline starts with raw FASTQ files of interleaved paired-end reads produced by the 10x Chromium platform.
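As an illustration of what working with interleaved input involves, the sketch below reads eight lines at a time (four per read), draws a seeded random sample of whitelist barcodes, and keeps the read pairs whose barcode was sampled. This is one plausible reading of the script name and its `-k` and `-s` options, not the actual implementation. All function names are hypothetical, and the code assumes the barcode is the first 16 bases of read 1.

```python
import gzip
import itertools
import random

BARCODE_LENGTH = 16  # assumption: barcode is the first 16 bases of read 1


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def sample_barcodes(whitelist, k, seed):
    """Draw up to k barcodes from the whitelist, reproducibly for a given seed."""
    rng = random.Random(seed)
    candidates = sorted(whitelist)  # fixed order so the seed fully determines the sample
    return set(rng.sample(candidates, min(k, len(candidates))))


def read_pairs(lines):
    """Yield (read1, read2) from interleaved FASTQ lines; each read is a 4-tuple."""
    records = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in itertools.islice(records, 8)]
        if not chunk:
            return
        if len(chunk) < 8:
            raise ValueError("truncated interleaved FASTQ: expected 8 lines per pair")
        yield tuple(chunk[:4]), tuple(chunk[4:])


def filter_pairs(lines, barcodes):
    """Keep only the read pairs whose read 1 starts with a sampled barcode."""
    for read1, read2 in read_pairs(lines):
        if read1[1][:BARCODE_LENGTH] in barcodes:
            yield read1, read2
```

For example, `filter_pairs(open_text(fastq_path), sample_barcodes(whitelist, 4000, 1334))` streams the selected pairs without loading the whole file into memory.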

Dependencies

pip3 install -r requirements.txt
brew bundle

BWA or Minimap2
Pysam
Python 3
Samtools

Prerequisites

The analysis and report will be created using R, the Tidyverse, RMarkdown, and Flexdashboard. Familiarity with some of these tools is useful, but not necessary to participate in this project. Non-technical participants are welcome to design the aesthetics of the report, prepare and deliver the presentation, and coordinate writing a brief paper about the tool.

Team Lead: Shaun Jackman

sjackman@gmail.com

@sjackman

Grad Student

BC Cancer Agency Genome Sciences Centre