I found a bug!
Please email the maintainer of CoNIFER at nkrummNO_%_SPAMgmail.com (replace the NO_%_SPAM with @).
How should the number of SVD components to remove be selected?
This is an important parameter for CoNIFER analysis. At present, there is no "automated" way to select the number of SVD components removed. However, the number of components to remove can easily be discerned from the scree plot (--plot_scree during the analysis step).
Users should run an "initial" analysis with a ballpark --svd value (between 5 and 15), and then examine the scree plot to better gauge the amount of systematic noise present in the data.
In general, a greater number of SVD components should be removed when (a) multiple capture types, sequencing runs or DNA sources are combined and (b) a larger number of samples is included.
Please note that you can remove at most n-1 components, where n is the number of samples in the analysis! Furthermore, removing the maximum number of components will remove virtually all signal. For small datasets, we urge close examination of the scree plot and trying multiple different --svd values.
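As a rough aid when reading a scree plot, the inflection point can be approximated numerically. The sketch below is only an illustration, not part of CoNIFER: the function name suggest_svd_components and the min_drop cutoff are hypothetical, and it assumes you already have the singular values as a list of numbers sorted from largest to smallest. It counts leading components while the curve is still falling steeply and caps the answer at n-1.

```python
def suggest_svd_components(singular_values, n_samples, min_drop=0.05):
    """Suggest how many leading SVD components to remove.

    Heuristic (an assumption, not CoNIFER's own logic): keep counting
    components while the relative drop to the next singular value is at
    least `min_drop`, i.e. while the scree curve is still steep.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    k = 0
    for current, following in zip(singular_values, singular_values[1:]):
        if current <= 0:
            break  # nothing left to compare against
        relative_drop = (current - following) / current
        if relative_drop < min_drop:
            break  # curve has flattened: the inflection point is reached
        k += 1
    # At most n-1 components can be removed from n samples.
    return min(k, n_samples - 1)


# Made-up singular values for illustration only.
example = [50.0, 30.0, 18.0, 10.0, 9.8, 9.6, 9.5, 9.4]
print(suggest_svd_components(example, n_samples=8))
```

Treat the result as a starting point and confirm it by eye on the scree plot, especially for small datasets.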
How many exomes does CoNIFER analysis require?
We recommend a minimum of eight exomes, but more exomes will nearly always improve the quality of the results. Depending on the amount of systematic noise and bias, including 20 exomes can significantly improve the results of the analysis.
Is it possible to "mix" exomes from different sequencing runs and different capture platforms?
Yes! CoNIFER can robustly discover latent patterns of systematic noise present between different sequencing and capture reactions. However, care should be taken that at least 8 exomes from each "batch" are included, such that this systematic noise becomes statistically apparent.
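Before combining batches, it can help to check that each one meets the 8-exome guideline. This is a minimal sketch, not a CoNIFER feature: the function undersized_batches and the sample and batch names are hypothetical, and it assumes you keep your own mapping of sample name to batch label.

```python
from collections import Counter

MIN_EXOMES_PER_BATCH = 8  # guideline from the answer above


def undersized_batches(sample_to_batch, minimum=MIN_EXOMES_PER_BATCH):
    """Return {batch: sample_count} for batches with fewer than `minimum` exomes."""
    counts = Counter(sample_to_batch.values())
    return {batch: n for batch, n in counts.items() if n < minimum}


# Hypothetical sample sheet: 8 exomes in one batch, 3 in another.
samples = {f"sampleA{i}": "capture_v2_run1" for i in range(8)}
samples.update({f"sampleB{i}": "capture_v3_run2" for i in range(3)})
print(undersized_batches(samples))
```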
How was the probes.txt file created?
We downloaded the Nimblegen SeqCap EZ Exome v2.0 target file and intersected it with the RefSeq exon list from the UCSC Genome Browser. For targets overlapping exons, we used the exon boundaries, unless the resulting probe was shorter than 10 bp, in which case we expanded the probe to include at least 10 bp of sequence. For non-exonic targets, no modifications were made.
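The boundary rule described above can be sketched for a single target as follows. This is an illustration only and not the script used to build probes.txt: the function adjust_probe is hypothetical, coordinates are assumed to be half-open (start, end) pairs on the same chromosome, and padding short exons roughly evenly on both sides is an assumption, since the answer does not say how the expansion was done.

```python
MIN_PROBE_BP = 10  # minimum probe length stated in the answer above


def adjust_probe(target, exon=None, min_bp=MIN_PROBE_BP):
    """Return the (start, end) probe interval for one capture target.

    target, exon: half-open (start, end) tuples; exon may be None.
    """
    if exon is None:
        return target  # non-exonic target: left unmodified
    t_start, t_end = target
    e_start, e_end = exon
    if t_end <= e_start or e_end <= t_start:
        return target  # no overlap, so treat as non-exonic
    start, end = e_start, e_end  # use the exon boundaries
    length = end - start
    if length < min_bp:
        # Assumption: pad about evenly on both sides, never below position 0.
        pad = min_bp - length
        start = max(0, start - pad // 2)
        end = start + min_bp
    return (start, end)


print(adjust_probe((100, 200), (120, 180)))  # exon boundaries used as-is
print(adjust_probe((100, 200), (150, 156)))  # 6 bp exon expanded to 10 bp
print(adjust_probe((100, 200)))              # non-exonic target unchanged
```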
I ran CoNIFER and all I got was a couple of lousy calls!
Although this could be due to a number of reasons, a few things to check include:
- Ensure you are selecting the right number of SVD components to remove. Examine the scree plot (--plot_scree) and select the number of components based on the inflection point in the graph. See the FAQ item "How should the number of SVD components to remove be selected?"
- If using CoNIFER's built-in caller, decrease the --threshold option to increase sensitivity. Note that this will come at a loss of specificity.
- Use sequence alignments that were created using alignment software supporting the mapping of multiple locations per read, such as mrsFAST.
- Export the SVD-ZRPKM data using the "export" command and use a more sensitive segmentation algorithm, such as DNAcopy.
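To see why lowering the threshold trades specificity for sensitivity, consider a toy threshold-based caller. This is a simplified sketch and not CoNIFER's calling algorithm: the function call_segments, the min_probes parameter, and the example values are all hypothetical. It reports runs of consecutive probes whose values all sit at or beyond a positive threshold in the same direction.

```python
def call_segments(values, threshold, min_probes=3):
    """Return (start, end, kind) runs where values stay beyond +/-threshold.

    start/end are half-open probe indices; kind is "dup" or "del".
    `threshold` must be positive.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    calls = []
    run_start, run_sign = 0, 0
    # A trailing 0.0 sentinel closes any run that reaches the last probe.
    for i, v in enumerate(list(values) + [0.0]):
        sign = 1 if v >= threshold else -1 if v <= -threshold else 0
        if sign != run_sign:
            if run_sign != 0 and i - run_start >= min_probes:
                calls.append((run_start, i, "dup" if run_sign > 0 else "del"))
            run_start, run_sign = i, sign
    return calls


# Made-up per-probe values for one sample.
probe_values = [0.1, 1.6, 1.7, 1.8, 0.2, -1.2, -1.3, -1.1, 0.0]
print(call_segments(probe_values, threshold=1.5))  # stricter: fewer calls
print(call_segments(probe_values, threshold=1.0))  # looser: more calls
```

With the looser threshold the weaker deletion-like run is also reported, which is the gain in sensitivity; on real data the same change will also admit more noise.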