JBD needs two types of input files: data files and influence function files. We keep all the files for an experiment in one directory and use the directory name as the base name for the files. For example, our sc_gcn4 directory contains the files
sc_gcn4.1.1.data
sc_gcn4.1.coeffs
sc_gcn4.10.1.data
sc_gcn4.11.1.data
sc_gcn4.12.1.data
sc_gcn4.13.1.data
sc_gcn4.14.1.data
sc_gcn4.15.1.data
sc_gcn4.16.1.data
and so forth.
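
As a rough illustration of this layout, the following Python sketch collects an experiment's input files by taking the base name from the directory name. The function name find_input_files is hypothetical and not part of JBD. The sketch assumes only that data files end in .data and influence function files end in .coeffs, as in the listing above, and does not interpret the numbers in between.

```python
from pathlib import Path


def find_input_files(experiment_dir):
    """Return (data_files, coeffs_files) found in an experiment directory.

    Hypothetical helper, not part of JBD. The base name is taken from the
    directory name, following the convention described above.
    """
    directory = Path(experiment_dir).resolve()
    base = directory.name  # e.g. "sc_gcn4"
    # Match on base name and extension only; the fields in between
    # are left uninterpreted.
    data_files = sorted(directory.glob(f"{base}.*.data"))
    coeffs_files = sorted(directory.glob(f"{base}.*.coeffs"))
    return data_files, coeffs_files
```

Calling find_input_files("sc_gcn4") would return the .data and .coeffs paths in that directory, each list sorted by filename.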

Data Files

The data files use a six column format:

  1. The chromosomal position of the observation (an integer, usually the center of the probe's position)
  2. The IP value (float)
  3. The WCE value (float)
  4. The IP/WCE ratio (float)
  5. The variance of the ratios at this position (float)
  6. The replicate number of this observation (integer)

We generally create one input file per chromosome and encode the chromosome in the filename. For example, the first few lines of the file sc_gcn4.1.1.data look like

90 10418.2148 1115.01746 9.34354401 2.8973599094445 3
90 15248.0146 1001.49152 15.2253056 2.8973599094445 2
90 8676.93066 981.000671 8.84497929 2.8973599094445 1
2292 1334.40527 1491.9679 .894392729 0.0543957084890036 2
2292 603.666809 611.041016 .987931728 0.0543957084890036 1
2292 687.67572 800.566467 .858986437 0.0543957084890036 3
2887 916.858093 954.801514 .960260391 0.0522224998772054 2
2887 318.739502 370.677338 .859883964 0.0522224998772054 1
2887 357.207123 364.964905 .978743732 0.0522224998772054 3
3039 221.608139 237.878128 .93160367 0.028031123766514 1
There are three replicates in this file (note the three observations at each position). JBD does not require the number of replicates to be the same at every position, so it can handle missing data correctly.
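
To make the six column format concrete, here is a minimal Python sketch that parses data file lines and reports which replicates are present at each position. The names Observation, parse_data_lines, and replicates_by_position are hypothetical and not part of JBD. The sketch assumes whitespace-separated columns, as in the excerpt above.

```python
from collections import defaultdict
from typing import NamedTuple


class Observation(NamedTuple):
    position: int     # column 1: chromosomal position
    ip: float         # column 2: IP value
    wce: float        # column 3: WCE value
    ratio: float      # column 4: IP/WCE ratio
    variance: float   # column 5: variance of the ratios at this position
    replicate: int    # column 6: replicate number


def parse_data_lines(lines):
    """Parse whitespace-separated six column lines into Observation tuples."""
    observations = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(
                f"line {line_number}: expected 6 columns, got {len(fields)}"
            )
        pos, ip, wce, ratio, variance, replicate = fields
        observations.append(
            Observation(int(pos), float(ip), float(wce),
                        float(ratio), float(variance), int(replicate))
        )
    return observations


def replicates_by_position(observations):
    """Map each position to the replicate numbers observed there, in file order."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.position].append(obs.replicate)
    return dict(grouped)


# The first four lines of sc_gcn4.1.1.data, copied from the excerpt above.
SAMPLE = """\
90 10418.2148 1115.01746 9.34354401 2.8973599094445 3
90 15248.0146 1001.49152 15.2253056 2.8973599094445 2
90 8676.93066 981.000671 8.84497929 2.8973599094445 1
2292 1334.40527 1491.9679 .894392729 0.0543957084890036 2
""".splitlines()
```

Running replicates_by_position(parse_data_lines(SAMPLE)) groups the three observations at position 90 together. A position with fewer replicates simply gets a shorter list, which mirrors how the format leaves room for missing data.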

Influence Functions

You need one influence function file for each replicate. The filename should include the base name, the replicate number, and .coeffs, e.g., sc_gcn4.1.coeffs. The replicate numbers in these filenames correspond to the replicate numbers in the last column of the data file. The influence function is specified in two columns:
  1. The distance from the binding site, starting with zero
  2. The relative ratio expected.

For example, the first few lines of sc_gcn4.1.coeffs look like

0       1
1       1
2       .999
3       .999
4       .999
5       .999
6       .999
7       .999
8       .999
9       .999
The last line of the file should have a relative ratio of zero.
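
The two constraints stated above (distances start at zero, and the last relative ratio is zero) can be checked before running JBD. The following Python sketch is one way to do that. The function name parse_influence_function is hypothetical and not part of JBD, and reading distances as integers is an assumption based on the excerpt.

```python
def parse_influence_function(lines):
    """Parse two column .coeffs lines into (distance, relative_ratio) pairs.

    Hypothetical helper, not part of JBD. Raises ValueError if the first
    distance is not zero or the last relative ratio is not zero.
    """
    rows = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(fields)}"
            )
        # Assumption: distances are integers, as in the excerpt above.
        rows.append((int(fields[0]), float(fields[1])))
    if not rows:
        raise ValueError("influence function is empty")
    if rows[0][0] != 0:
        raise ValueError("distances should start with zero")
    if rows[-1][1] != 0.0:
        raise ValueError("last line should have a relative ratio of zero")
    return rows
```

Passing the lines of a file such as sc_gcn4.1.coeffs to this function returns the parsed pairs, or raises an error if the file ends before the ratio reaches zero.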