CoNIFER Quick Start:

This document describes how to run a basic CoNIFER analysis using the available test data set of 26 exomes. A general guide to using CoNIFER with your own data can be found in the tutorial. This quick start guide assumes that you have Python (version 2.7 or greater) installed, as well as the NumPy, PyTables and matplotlib libraries.
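Before starting, it can help to confirm that the required libraries are importable. The following is a minimal sketch, not part of CoNIFER itself; the helper name check_modules is hypothetical, and the only fact it relies on is that PyTables is imported under the module name "tables".

```python
import importlib

# Import names for the libraries this guide lists.
# Note: PyTables is imported as "tables".
REQUIRED_MODULES = ["numpy", "tables", "matplotlib"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Print one line per library so missing ones are easy to spot.
    for name, ok in sorted(check_modules(REQUIRED_MODULES).items()):
        print("%-12s %s" % (name, "found" if ok else "MISSING"))
```

Any library reported as MISSING needs to be installed before running the steps below.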

Study          Exome Capture          N    Samples
HapMap         Nimblegen V1 (2009)    8    NA12878, NA15510, NA18507, NA18517, NA18555, NA18956, NA19129, NA19240
Autism Trios   Nimblegen V2 (2010)    18   6 Trios (Mother, Father, Proband)
(Also included is a probe definition file for RefSeq exons and additional Nimblegen targets)

1. Download the latest CoNIFER version from downloads and extract the contents of the CoNIFER package to a local directory:
$ tar -xzf conifer_v0.2.tar.gz

2. Download the sample RPKM data set from here, and extract the contents into a directory within the CoNIFER path:
$ tar -xzf sampledata.tar.gz -C conifer_v0.2/

3. Run the CoNIFER analysis step
From the shell, cd to the directory where you extracted CoNIFER.
$ cd conifer_v0.2
Then run:
$ python conifer.py analyze \
	--probes sampledata/probes.txt \
	--rpkm_dir sampledata/RPKM_data/ \
	--output analysis.hdf5 \
	--svd 6 \
	--write_svals singular_values.txt
This step runs the CoNIFER algorithm on the RPKM files in the directory given by --rpkm_dir. Two new files are created: analysis.hdf5, an HDF5 file containing the SVD-ZRPKM values for all samples in the experiment, and singular_values.txt, which contains the singular values from the SVD transformation (these can be plotted as a "scree plot").
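As a rough illustration of how the singular values might be inspected before plotting, the sketch below parses a text file of singular values and computes each value's share of the total. The file layout is an assumption, not something this guide specifies: one line per chromosome, optionally starting with a text label such as "chr1", followed by whitespace-separated numbers. Both function names are hypothetical.

```python
def parse_singular_values(lines):
    """Parse lines of whitespace-separated values into lists of floats.

    Assumed layout (not confirmed by this guide): one line per chromosome,
    optionally starting with a text label, followed by the singular values.
    """
    rows = []
    for line in lines:
        values = []
        for field in line.split():
            try:
                values.append(float(field))
            except ValueError:
                continue  # skip non-numeric labels such as "chr1"
        if values:
            rows.append(values)
    return rows


def relative_contribution(values):
    """Return each singular value as a fraction of the sum of all values."""
    total = sum(values)
    return [v / total for v in values]
```

A possible use, assuming the layout above holds: open singular_values.txt, pass the file object to parse_singular_values, and plot each returned row against its component index with matplotlib to obtain the scree plot. A steep drop after the first few components is what such a plot is usually read for.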

4. Make and plot calls
$ python conifer.py call \
	--input analysis.hdf5 \
	--output calls.txt

This step uses a basic thresholding algorithm to find rare CNVs; their genomic coordinates are written to calls.txt.
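To give a feel for what a thresholding caller does, here is a minimal sketch that scans a list of per-probe values and reports runs that stay beyond a positive or negative cutoff. It is an illustration only and not CoNIFER's actual implementation: the function name, the cutoff of 1.5, the min_probes parameter and the "dup"/"del" labels are all assumptions made for this example.

```python
def threshold_calls(values, threshold=1.5, min_probes=1):
    """Return (start_index, end_index, state) for runs beyond +/- threshold.

    state is "dup" for runs at or above +threshold and "del" for runs at or
    below -threshold. end_index is inclusive. Runs shorter than min_probes
    are dropped.
    """
    calls = []
    run_start = None
    run_state = None
    for i, v in enumerate(values):
        if v >= threshold:
            state = "dup"
        elif v <= -threshold:
            state = "del"
        else:
            state = None
        if state != run_state:
            # The previous run ends at i - 1; keep it if it is long enough.
            if run_state is not None and i - run_start >= min_probes:
                calls.append((run_start, i - 1, run_state))
            run_start = i
            run_state = state
    # Close a run that extends to the last probe.
    if run_state is not None and len(values) - run_start >= min_probes:
        calls.append((run_start, len(values) - 1, run_state))
    return calls


# Hypothetical per-probe values for one sample.
example = [0.1, 1.6, 2.0, 1.7, 0.2, -1.8, -2.1, 0.0, 1.9]
print(threshold_calls(example, threshold=1.5, min_probes=2))
# prints [(1, 3, 'dup'), (5, 6, 'del')]
```

Requiring a minimum number of consecutive probes is one simple way to suppress single-probe noise, which is why the lone value at the end of the example is not reported.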
Next, to visualize these calls and the underlying data, we will use the plotcalls command to create snapshot views of each region. Make a new directory for the images:
$ mkdir call_images

Then run conifer.py plotcalls:
$ python conifer.py plotcalls \
	--input analysis.hdf5 \
	--calls calls.txt \
	--outputdir ./call_images/
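
Once the calls are made, a quick per-sample tally can show whether any sample carries an unusually large number of calls. The sketch below assumes that calls.txt is tab-delimited with a header row and a sample column named "sampleID"; this guide does not describe the file layout, so check the header of your own file and adjust the sample_column argument. The function name is hypothetical.

```python
import csv
import io
from collections import Counter


def count_calls_per_sample(handle, sample_column="sampleID"):
    """Count rows per sample in a tab-delimited file with a header row.

    The column name "sampleID" is an assumption about the calls file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[sample_column] for row in reader)


# Made-up rows in the assumed layout, used only to demonstrate the function.
EXAMPLE_CALLS = (
    "sampleID\tchromosome\tstart\tstop\tstate\n"
    "S1\tchr1\t100\t200\tdup\n"
    "S2\tchr2\t5\t50\tdel\n"
    "S1\tchr3\t10\t20\tdel\n"
)

counts = count_calls_per_sample(io.StringIO(EXAMPLE_CALLS))
for sample, n in sorted(counts.items()):
    print("%s\t%d" % (sample, n))
```

For a real run, replace the in-memory example with an open file handle for calls.txt.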