GEM software downloads

Back to GEM manual page.

Sign up for the GPS mailing list to receive emails about GEM updates and releases. We are actively improving our software.

Latest version of GEM (version 3.4)

What's new

  • CID update: improved help message, loading bedpe.gz file; bug fixes

ChIP-Seq default read distribution file
ChIP-exo default read distribution file
Branch-seq default read distribution file
CLIP-seq default read distribution file

Genome chrom.sizes Files: hg38 hg19 hg18 mm10 mm9 mm8 saccer2

Test ChIP-Seq data from the Ng lab (PMID: 18555785): Bowtie alignments of mouse ES cell CTCF ChIP-seq and GFP control reads in BED files (size: 80 MB).
Human CTCF (chr1) multi-condition data (size: 35 MB)

Synthetic joint event data from the GPS paper (size: 353 MB). To obtain the GPS results, run GEM with the following command-line options:
java -Xmx3G -jar gem.jar --g fakeGenome.info --s 2100011010 --d Read_Distribution_default.txt --expt XXX_ipFile --ctrl ctrlFile --f BED --a 6 --icr 1 --out XXX
The --icr option sets the correct IP/CTRL ratio.
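When the command above is run on several synthetic datasets, a small wrapper can fill in the XXX placeholder. The following is a minimal sketch, not part of GEM: the function name gem_cmd is hypothetical, and it assumes that the same XXX prefix is used for both the IP file name and the --out name, as the command above suggests. All options are copied unchanged from that command.

```shell
#!/bin/sh
# gem_cmd (hypothetical helper): print the GEM command line for one dataset.
# Assumption: the XXX prefix of the IP file and the --out name are the same.
gem_cmd() {
    expt="$1"
    printf '%s' "java -Xmx3G -jar gem.jar --g fakeGenome.info --s 2100011010 --d Read_Distribution_default.txt --expt ${expt}_ipFile --ctrl ctrlFile --f BED --a 6 --icr 1 --out ${expt}"
}

# Dry run: this only prints the command for inspection; nothing is executed.
gem_cmd "XXX"
echo
```

Printing the command rather than running it makes it easy to check the file names before starting a long job.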

Version history:

  • Version 3.4: Released: 08-15-2018.
    • CID update: improved help message, loading bedpe.gz file
  • Version 3.3: Released: 06-24-2018.
    • CID, a new method for calling chromatin interactions from ChIA-PET and HiChIP data.
  • Version 3.2: Released: 05-06-2018.
    • The KSM paper has been published in Genome Research.
    • Fixed a bug in loading negative sequence for KMAC.
    • --simple option for KSM motif scanning to reduce output file size.
  • Version 3.1: Released: 12-19-2017.
    • Fine tuned some parameters for KSM/KMAC.
    • Fixed a minor bug in parsing certain special file names.
  • Version 3.0: Released: 04-27-2017.
    • A new k-mer based motif model and motif discovery/scanning method, KSM/KMAC.
    • The KSM paper has been accepted by RECOMB 2017. It is in bioRxiv now.
    • Starting with v3.0, GEM uses the new KMAC/KSM method for the motif discovery and motif scanning steps.
  • Version 2.7: Released: 02-26-2016.
    • A new method, RMD, to merge GEM binding calls for multiple TFs into co-binding regions. Probabilistic topic modeling can then be applied to learn the combinatorial binding patterns of TFs.
    • A new option, --local_control, to compute both Binomial and Poisson p-values of binding events using local neighborhood control data.
  • Version 2.6: Released: 5-5-2015.
    • GEM-BP: branch point event calling for Branch-seq data, with option --bp.
    • Single-strand binding event calling and motif discovery, with option --strand_type 1. The single-strand mode is more appropriate for RNA-based sequencing data (e.g. Branch-seq, CLIP-seq). For event calling, reads from each strand are analyzed separately. For motif discovery, GEM considers only the sequence on the event strand, not the sequence on the reverse complement strand.
    • Set motif-based positional prior using multiple top ranking motifs, with option --pp_nmotifs [n].
    • Use PWM motifs to set the motif-based positional prior, with option --pp_pwm.
  • Version 2.5: Released: 9-2-2014.
    • Options to output motif PFM in formats such as JASPAR, MEME or HOMER.
    • Output all results in one folder.
    • Fix file loading error when --g is not used.
  • Version 2.4.1: Released: 2-11-2014.
    • Fix the data loading bug in v2.4.
  • Version 2.4: Released: 11-8-2013. This version contains a bug in data loading.
    • Support paired-end SAM/BAM format (using the same --f SAM option). It treats each mate-pair as two single-end reads.
  • Version 2.3: Released: 8-30-2013.
    • Fix the "duplicate lines" bug.
  • Version 2.2: Released: 7-11-2013.
    • Relax the criterion in selecting candidate enriched regions.
    • Process only some of the candidate regions to estimate the read distribution in round 0.
    • Fix the --outBED bug.
    • Bug: some of the binding events were output in duplicate lines in the event result files. It has been fixed in the 2.3 release.
  • Version 2.1: Released: 6-10-2013.
    • Fix a bug in data loading when using GPS_ReadDistribution to generate the read distribution (it does not affect GEM/GPS).
    • A bug was found when using --outBED option. It has been fixed in the 2.2 release.
  • Version 2.0: Released: 5-17-2013.
    • Add a noise component in the mixture model for non-specific binding reads. This leads to significant improvement in spatial accuracy. This is the default option (--nd 1). To get the behavior of previous v1.x versions, use option (--nd 0).
    • Support narrowPeak output format (--outNP)
  • Version 1.3: Released: 1-28-2013.
    • Check chrom length when loading sequences
    • Update SAMTools library
  • Version 1.2: Released: 12-21-2012.
    • Option --k_neg_dinu_shuffle to generate negative sequences for motif discovery by di-nucleotide shuffling
    • Change the names of KSM and PFM files
    • Add some warning messages
  • Version 1.1: Released: 09-14-2012.
    • Fix some inconsistency with command-line options
    • Add some warning messages
  • Version 1.0: Released: 08-27-2012.
    • GEM paper is published
    • Improve on selecting correct seed k-mer to start motif finding
    • Option to use shuffled sequences as negative set in motif finding
    • Update KSM (k-mer set motif) file format
  • Version 0.9: Released: 03-19-2012.
    • Initial release of GEM.
    • Integrative ChIP-Seq binding event finding and motif discovery.



