CID - Chromatin Interaction Discovery for ChIA-PET and HiChIP data
Installation
CID:
Test installation by running:
java -Xmx32G -jar gem.jar CID --h
MICC:
Test installation by running the following line in R:
library(MICC)
CID usage
java -Xmx10G -jar gem.jar CID --h
CID - Chromatin Interaction Discovery, version 1.0
Usage: java -Xmx32G -jar gem.jar CID --data INPUT-BEDPE-FILE --g CHROMOSOME-SIZE-FILE [--micc NUM] [--ex CHR] [--out FILE-NAME-PREFIX]
Options:
--data INPUT-BEDPE-FILE
the file path to the aligned paired-end bedpe or bedpe.gz file
--g CHROMOSOME-SIZE-FILE
the file path to the chromosome size file
[--micc NUM]
the minimum PET count of candidate interactions for the MICC input BEDPE file. Default NUM = 1. For large datasets, try NUM = 2, 3, or 4.
[--ex CHR]
the list of chromosomes to exclude. Default CHR = M. Multiple chromosomes can be separated by commas, e.g., CHR = M,Y
[--out FILE-NAME-PREFIX]
the output file name prefix
[--h]
Show this help message and exit
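As a sketch of how the optional flags combine, the script below assembles a CID call that raises the MICC threshold to 2 and excludes chromosomes M and Y. The input file name my-data.bedpe.gz and the prefix my-run are placeholders chosen for illustration, not files shipped with CID. The command is printed first and only executed if gem.jar and the input file are present in the current directory.

```shell
#!/bin/sh
# Placeholder inputs (hypothetical names; substitute your own files).
DATA="my-data.bedpe.gz"
GENOME="hg19.chrom.sizes"
PREFIX="my-run"

# Flags as documented in the help message above.
CID_CMD="java -Xmx32G -jar gem.jar CID --data $DATA --g $GENOME --micc 2 --ex M,Y --out $PREFIX"

# Print the command so it can be inspected before running.
echo "$CID_CMD"

# Run only when gem.jar and the input file are actually present.
if [ -f gem.jar ] && [ -f "$DATA" ]; then
    $CID_CMD
fi
```

Keeping the file names in variables makes it easy to rerun the same call with a different --micc value when a dataset turns out to be large.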
Example
- Install CID and MICC as described in the previous section
- Download the chromosome size file for genome hg19 (hg19.chrom.sizes).
Chromosome size files for other genomes can be downloaded from the GEM software page or from the UCSC website.
- Download and unzip the example BEDPE file from a POLR2A ChIA-PET data set in K562 cells: BEDPE file
- Call CID:
java -Xmx10G -jar gem.jar CID --data Ruan.K562.POLR2A.rep1.chr20.rmdup.bedpe.gz --g hg19.chrom.sizes --out test-run
- Note: if an "out-of-memory" error occurs, increase the value of the -Xmx option for java.
- Call MICC on CID output BEDPE file by running the following lines in R:
library(MICC)
cid.bedpe <- read.table("test-run.bedpe", sep = "\t", header = FALSE)
MICCoutput(cid.bedpe, "test-run-micc-out.txt")
- Genomic interactions with their significance measure (FDR) are written to test-run-micc-out.txt
CID output
CID generates a BEDPE file containing identified genomic interactions (*.bedpe).
The file is a header-less, tab-delimited text file that follows the
BEDPE format, with the following columns:
- genomic location of anchor A - chromosome
- genomic location of anchor A - start coordinate
- genomic location of anchor A - end coordinate
- genomic location of anchor B - chromosome
- genomic location of anchor B - start coordinate
- genomic location of anchor B - end coordinate
- PET count between two anchor regions
- total PET count in anchor A (left anchor region)
- total PET count in anchor B (right anchor region)
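Because the file has no header and a fixed column order, it can be filtered with standard command-line tools. The following is a minimal sketch that keeps interactions with a PET count (column 7) of at least 2. The rows and the file names example.bedpe and example.min2.bedpe are made up for illustration and are not real CID output.

```shell
#!/bin/sh
# Made-up example rows in the 9-column layout described above
# (illustrative values only, not real CID output).
printf 'chr20\t1000\t2000\tchr20\t50000\t51000\t5\t40\t32\n'    > example.bedpe
printf 'chr20\t3000\t3500\tchr20\t90000\t90800\t1\t12\t9\n'    >> example.bedpe
printf 'chr20\t7000\t7600\tchr20\t120000\t120900\t3\t25\t18\n' >> example.bedpe

# Keep interactions whose PET count (column 7) is at least MIN_PET.
MIN_PET=2
awk -F'\t' -v min="$MIN_PET" '$7 >= min' example.bedpe > example.min2.bedpe

# Report how many interactions passed the filter.
wc -l < example.min2.bedpe
```

With the three sample rows above, the first and third rows pass the filter. Applying the same command to a real CID output file only requires changing the input and output file names.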
MICC output
MICC creates one output file (in the example above,
test-run-micc-out.txt). It is a tab-delimited text file with the following columns:
- genomic location of anchor A - chromosome
- genomic location of anchor A - start coordinate
- genomic location of anchor A - end coordinate
- genomic location of anchor B - chromosome
- genomic location of anchor B - start coordinate
- genomic location of anchor B - end coordinate
- PET count between two anchor regions
- total PET count in anchor A (left anchor region)
- total PET count in anchor B (right anchor region)
- -log10(1 - posterior probability)
- False Discovery Rate (FDR) of interaction
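To select significant interactions, the FDR in column 11 can be thresholded. The sketch below uses a cutoff of 0.05, which is an arbitrary choice for illustration, and made-up rows in a file named example-micc-out.txt. Whether a real MICC output file starts with a header line is not stated here, so the filter only considers rows whose 11th column is numeric.

```shell
#!/bin/sh
# Made-up rows in the 11-column layout described above (illustrative values only).
printf 'chr20\t1000\t2000\tchr20\t50000\t51000\t5\t40\t32\t3.2\t0.001\n'    > example-micc-out.txt
printf 'chr20\t3000\t3500\tchr20\t90000\t90800\t1\t12\t9\t0.1\t0.6\n'      >> example-micc-out.txt
printf 'chr20\t7000\t7600\tchr20\t120000\t120900\t3\t25\t18\t1.9\t0.04\n'  >> example-micc-out.txt

# Arbitrary example cutoff; choose one appropriate for your analysis.
MAX_FDR=0.05

# Keep rows whose FDR (column 11) is numeric and below the cutoff.
# The numeric check skips a header line, should the file contain one.
awk -F'\t' -v max="$MAX_FDR" \
    '$11 ~ /^[0-9.eE+-]+$/ && $11 + 0 < max + 0' \
    example-micc-out.txt > example-micc-fdr05.txt

# Show the interactions that passed the filter.
cat example-micc-fdr05.txt
```

With the sample rows above, the first and third rows are kept. The same filter can be pointed at test-run-micc-out.txt from the example section.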
Citation
High resolution discovery of chromatin interactions
Yuchun Guo, Konstantin Krismer, Michael Closser, Hynek Wichterle, David K Gifford
bioRxiv 376194; doi: https://doi.org/10.1101/376194
Contact
Post your questions, problems, or suggestions on our
GEM3 GitHub page by creating a "New Issue".