CID - Chromatin Interaction Discovery for ChIA-PET and HiChIP data

ChIA-PET and HiChIP pipelines

Installation

CID: Test installation by running: java -Xmx32G -jar gem.jar CID --h

MICC: Test installation by running the following line in R: library(MICC)

CID usage

java -Xmx10G -jar gem.jar CID --h

CID - Chromatin Interaction Discovery, version 1.0

Usage: java -Xmx32G -jar gem.jar CID --data INPUT-BEDPE-FILE --g CHROMOSOME-SIZE-FILE [--micc NUM] [--ex CHR] [--out FILE-NAME-PREFIX]
Options:
        --data INPUT-BEDPE-FILE
                the file path to the aligned paired-end bedpe or bedpe.gz file

        --g CHROMOSOME-SIZE-FILE
                the file path to the chromosome size file

        [--micc NUM]
                the minimum PET count of candidate interactions for the MICC input BEDPE file. Default NUM = 1. For large datasets, try NUM = 2, 3, or 4.

        [--ex CHR]
                the list of chromosomes to exclude. Default CHR = M. Multiple chromosomes can be separated by commas, e.g., CHR = M,Y
        
        [--out FILE-NAME-PREFIX]
                the output file name prefix
                
        [--h]
                Show this help message and exit
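The options listed above can be assembled programmatically when CID is run from a script. The following is a minimal sketch, not part of CID itself: the helper name build_cid_command, its parameter names, and the file names in the usage lines are hypothetical. Only the flags shown in the help message (--data, --g, --micc, --ex, --out) are used.

```python
from typing import List, Optional, Sequence


def build_cid_command(
    data: str,
    genome_sizes: str,
    micc: Optional[int] = None,
    exclude: Optional[Sequence[str]] = None,
    out_prefix: Optional[str] = None,
    jar: str = "gem.jar",
    heap: str = "32G",
) -> List[str]:
    """Build the argument list for a CID run (hypothetical helper)."""
    # Required arguments, in the order shown in the usage line.
    cmd = ["java", f"-Xmx{heap}", "-jar", jar, "CID",
           "--data", data, "--g", genome_sizes]
    # Optional arguments are added only when given, so CID's own
    # defaults (NUM = 1, CHR = M) apply otherwise.
    if micc is not None:
        cmd += ["--micc", str(micc)]
    if exclude:
        # Multiple chromosomes are separated by commas, e.g. M,Y
        cmd += ["--ex", ",".join(exclude)]
    if out_prefix is not None:
        cmd += ["--out", out_prefix]
    return cmd


# Hypothetical input files and output prefix, for illustration only.
command = build_cid_command(
    "reads.bedpe.gz", "genome.sizes",
    micc=2, exclude=["M", "Y"], out_prefix="my-run",
)
print(" ".join(command))
```

The returned list can be passed to subprocess.run(command, check=True) to launch the job.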

Example

CID output

CID generates a BEDPE file containing identified genomic interactions (*.bedpe).

The output is a header-less, tab-delimited text file in BEDPE format with the following columns:
  1. genomic location of anchor A - chromosome
  2. genomic location of anchor A - start coordinate
  3. genomic location of anchor A - end coordinate
  4. genomic location of anchor B - chromosome
  5. genomic location of anchor B - start coordinate
  6. genomic location of anchor B - end coordinate
  7. PET count between two anchor regions
  8. total PET count in anchor A (left anchor region)
  9. total PET count in anchor B (right anchor region)
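The column layout above can be read with a few lines of Python. This is a minimal sketch: the Interaction type, the parse_cid_bedpe function, and the sample record are illustrative and not part of CID, and the sketch assumes that coordinates and PET counts are written as integers.

```python
from typing import Iterable, Iterator, NamedTuple


class Interaction(NamedTuple):
    """One row of the CID *.bedpe output (field names are illustrative)."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int
    pet_count: int      # column 7: PETs linking the two anchors
    anchor_a_pets: int  # column 8: total PETs in anchor A
    anchor_b_pets: int  # column 9: total PETs in anchor B


def parse_cid_bedpe(lines: Iterable[str]) -> Iterator[Interaction]:
    """Yield one Interaction per non-empty, tab-delimited line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        f = line.split("\t")
        # Assumption: columns 2-3 and 5-9 are integers.
        yield Interaction(f[0], int(f[1]), int(f[2]),
                          f[3], int(f[4]), int(f[5]),
                          int(f[6]), int(f[7]), int(f[8]))


# Made-up record for illustration; real input would come from open(path).
sample = ["chr1\t1000\t2000\tchr1\t50000\t51000\t5\t20\t30\n"]
interactions = list(parse_cid_bedpe(sample))
print(interactions[0].pet_count)
```

Because the file has no header, every non-empty line is treated as a data row.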

MICC output

MICC creates one output file (in the example above, test-run-micc-out.txt). It is a tab-delimited text file with the following columns:
  1. genomic location of anchor A - chromosome
  2. genomic location of anchor A - start coordinate
  3. genomic location of anchor A - end coordinate
  4. genomic location of anchor B - chromosome
  5. genomic location of anchor B - start coordinate
  6. genomic location of anchor B - end coordinate
  7. PET count between two anchor regions
  8. total PET count in anchor A (left anchor region)
  9. total PET count in anchor B (right anchor region)
  10. -log10(1 - posterior probability)
  11. False Discovery Rate (FDR) of interaction
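Columns 10 and 11 are the ones most often used for downstream filtering. The sketch below shows one way to keep interactions under an FDR cutoff and to convert column 10 back to a posterior probability. The function names, the 0.05 cutoff, and the sample rows are illustrative assumptions. Whether the file starts with a header line is not stated here, so rows whose FDR field is not numeric are skipped.

```python
from typing import Iterable, Iterator, List


def posterior_from_score(score: float) -> float:
    """Invert column 10: score = -log10(1 - posterior)."""
    return 1.0 - 10.0 ** (-score)


def filter_by_fdr(lines: Iterable[str],
                  max_fdr: float = 0.05) -> Iterator[List[str]]:
    """Yield the fields of rows whose FDR (column 11) is <= max_fdr."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # blank or malformed line
        try:
            fdr = float(fields[10])
        except ValueError:
            continue  # assumption: a possible header row, skipped
        if fdr <= max_fdr:
            yield fields


# Made-up rows for illustration only.
sample = [
    "chr1\t1000\t2000\tchr1\t50000\t51000\t5\t20\t30\t2.0\t0.01\n",
    "chr2\t3000\t4000\tchr2\t90000\t91000\t1\t15\t12\t0.1\t0.60\n",
]
kept = list(filter_by_fdr(sample))
print(len(kept), posterior_from_score(float(kept[0][9])))
```

A score of 2 in column 10 corresponds to a posterior probability of 0.99, since 1 - 10^(-2) = 0.99.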

Citation

High resolution discovery of chromatin interactions
Yuchun Guo, Konstantin Krismer, Michael Closser, Hynek Wichterle, David K Gifford
bioRxiv 376194; doi: https://doi.org/10.1101/376194

Contact

Post your questions, problems, or suggestions on our GEM3 GitHub page by creating a "New Issue".