[GATK 4.0.1.2] No non-zero singular values were found in creating a panel of normals for somatic CNV

Hello,
I got the exception below while creating a PoN for somatic CNV from about 80 WGS samples (~30 males and ~50 females). I was able to create a PoN successfully for each sex separately, but not for all samples together. Sex chromosomes were excluded from PoN creation in both cases.
The exception message suggests setting minimum-interval-median-percentile to a higher value. How high should it be, and what does it mean? Would you also help me understand the argument --number-of-eigensamples?
Thanks!
12:14:03.269 WARN HDF5SVDReadCountPanelOfNormals - Exception encountered during creation of panel of normals (org.broadinstitute.hellbender.exceptions.UserException: No non-zero singular values were found. It may be necessary to use stricter parameters for filtering. For example, use a larger value of minimum-interval-median-percentile.). Attempting to delete partial output in cromwell-executions/CNVSomaticPanelWorkflow/0c26e635-0641-4791-9769-adaa5cee0e87/call-CreateReadCountPanelOfNormals/execution/gatk_somatic_wgs.pon.hdf5...
18/02/18 12:14:03 INFO SparkUI: Stopped Spark web UI at http://192.168.1.12:4040
18/02/18 12:14:03 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/02/18 12:14:03 INFO MemoryStore: MemoryStore cleared
18/02/18 12:14:03 INFO BlockManager: BlockManager stopped
18/02/18 12:14:03 INFO BlockManagerMaster: BlockManagerMaster stopped
18/02/18 12:14:03 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/02/18 12:14:03 INFO SparkContext: Successfully stopped SparkContext
12:14:03.373 INFO CreateReadCountPanelOfNormals - Shutting down engine
[February 18, 2018 12:14:03 PM CST] org.broadinstitute.hellbender.tools.copynumber.CreateReadCountPanelOfNormals done. Elapsed time: 3.64 minutes.
Runtime.totalMemory()=16482041856
org.broadinstitute.hellbender.exceptions.GATKException: Could not create panel of normals. It may be necessary to use stricter parameters for filtering. For example, use a larger value of minimum-interval-median-percentile.
    at org.broadinstitute.hellbender.tools.copynumber.denoising.HDF5SVDReadCountPanelOfNormals.create(HDF5SVDReadCountPanelOfNormals.java:341)
    at org.broadinstitute.hellbender.tools.copynumber.CreateReadCountPanelOfNormals.runPipeline(CreateReadCountPanelOfNormals.java:269)
    at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:30)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:153)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
    at org.broadinstitute.hellbender.Main.main(Main.java:277)
Caused by: org.broadinstitute.hellbender.exceptions.UserException: No non-zero singular values were found. It may be necessary to use stricter parameters for filtering. For example, use a larger value of minimum-interval-median-percentile.
    at org.broadinstitute.hellbender.tools.copynumber.denoising.HDF5SVDReadCountPanelOfNormals.create(HDF5SVDReadCountPanelOfNormals.java:317)
    ... 8 more
Best Answer
slee
Hi @dayzcool,
That message indicates that the SVD did not converge for some reason. Most often, this is because your standardized counts data (samples x bins) does not look sufficiently like a single, nice point cloud: there may be outliers or many bins with zero/low counts that need to be filtered. You may need to adjust the filtering parameters accordingly, and a good one to start with is minimum-interval-median-percentile (see the tool documentation for a detailed description of this parameter).

The parameter number-of-eigensamples determines the number of principal components you want to retain in the panel when performing the SVD. For example, even if I have 1000 normal samples in my PoN, it's unlikely that I will want to retain all 1000 principal components: most of these will simply arise from statistical noise, rather than systematic noise. So I might want to keep and store, say, only 20, which makes the SVD much cheaper to compute and the PoN much smaller.

Later on, when using the PoN as input to the DenoiseReadCounts tool, I can again specify number-of-eigensamples to decide how many of these 20 principal components I actually want to use to denoise a case sample. Looking at the singular values (which are stored in the PoN and can be viewed using, e.g., HDFView) and identifying the "elbow" in the usual manner may be a good way to decide this number. Using more components to denoise can improve your segmentation, but may come at the expense of sensitivity: you may inadvertently denoise away signal! So a sensitivity analysis should ideally be performed to choose this parameter for your particular study.

It's curious that your SVD only converged when you increased number-of-eigensamples. I suspect that you may still have issues in your data that need to be addressed with filtering. As always, it's a good idea to plot your data (which can be done with PlotDenoisedCopyRatios) to see if it looks reasonable.
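For concreteness, here is a minimal command-line sketch of where these two parameters go, assuming the GATK 4.0.x argument names for CreateReadCountPanelOfNormals and DenoiseReadCounts (file names and values are placeholders to adapt to your data):

    # Build the PoN with stricter interval filtering and a modest number of eigensamples.
    gatk CreateReadCountPanelOfNormals \
        -I normal1.counts.hdf5 \
        -I normal2.counts.hdf5 \
        --minimum-interval-median-percentile 25.0 \
        --number-of-eigensamples 20 \
        -O cnv.pon.hdf5

    # Denoise a case sample, optionally using fewer eigensamples than are stored in the PoN.
    gatk DenoiseReadCounts \
        -I case.counts.hdf5 \
        --count-panel-of-normals cnv.pon.hdf5 \
        --number-of-eigensamples 10 \
        --standardized-copy-ratios case.standardizedCR.tsv \
        --denoised-copy-ratios case.denoisedCR.tsv

    # Inspect the stored singular values to look for the "elbow"; the dataset path below
    # is an assumption, so browse the file with HDFView or h5ls to confirm it first.
    h5dump -d /panel/singular_values cnv.pon.hdf5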
Answers
Just an update: the exception doesn't occur if number-of-eigensamples is set to 40 (the default is 20). It would be great to know how a good value for number-of-eigensamples could be chosen. (It may be the wrong resolution to the original issue, though.)

@dayzcool
Hi,
I will ask someone from the team to get back to you.
-Sheila
Thanks a lot for the guidance, @slee! I'll follow your advice.
As for minimum-interval-median-percentile, forgive my ignorance, but I couldn't clearly understand the description. How is fractional coverage computed?

Fractional coverage is simply the coverage in each bin normalized by the sum of coverage in all bins. (This was referred to as "proportional coverage" in older documentation.)
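For example, here is a small awk sketch that computes fractional coverage from a counts file written as TSV, assuming the usual layout (SAM-style '@' header lines, a CONTIG/START/END/COUNT header row, counts in the fourth column; adjust the column index if your files differ):

    # Pass 1 totals the bin counts; pass 2 divides each bin's count by that total.
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/ || $1 == "CONTIG" { next }
         NR == FNR { total += $4; next }
         { print $1, $2, $3, (total > 0 ? $4 / total : 0) }
        ' sample.counts.tsv sample.counts.tsv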
@slee, I'd like to ask for your help again. Although all your suggestions, including HDFView, were super helpful, I unfortunately couldn't figure out what problems my normals have. minimum-interval-median-percentile didn't help up to 50 (default 10). And I could not recognize a significant problem differentiating the normals in the failed run from the normals in a successful run, though PlotDenoisedCopyRatios was very helpful in finding normals with CNVs in the panel (e.g., Down syndrome).

Also, I misled you about the PoN creation with number-of-eigensamples of 40. I realized that it was not successful, either. I thought it succeeded because the return code of the process was 0, but the PoN doesn't look like a valid hdf5 file. Is there anything else I can do to troubleshoot?
Below are a couple of issues I had that might be helpful for you:
1) PoN creation seems successful, since the process returns 0 and there is no error in stderr. However, the output is not a valid hdf5 file.
2) PlotDenoisedCopyRatios fails sometimes.
@dayzcool Hmm, it's disturbing that you ended up with an invalid PoN and a 0 exit code: the tools should fail and delete the PoN being made if an error is encountered. If you can put together a minimal working example that demonstrates the bug and are willing to share your data, that would be very helpful.
Just to step back a bit: you have successfully made a PoN with some subset of your normals, correct? Have you ever been able to include a sample that fails at a later step (e.g., PlotDenoisedCopyRatios) in a successfully built PoN?
I suspect that some fraction of your samples may be poorly covered, which is leading to errors and difficulties in the SVD convergence. Can you inspect the counts files produced by CollectFragmentCounts to see whether this is the case?
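For example, if you wrote the counts as TSV, here is a quick sketch that totals the counts per sample and sorts them so poorly covered outliers stand out (same layout assumption as above: counts in the fourth column of the non-header rows):

    # Print total bin counts per sample, smallest first; unusually low totals suggest poor coverage.
    for f in *.counts.tsv; do
        awk -v name="$f" 'BEGIN { FS = "\t" }
                          /^@/ || $1 == "CONTIG" { next }
                          { total += $4 }
                          END { printf "%s\t%d\n", name, total }' "$f"
    done | sort -t $'\t' -k2,2n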
@slee, thanks to your help I was able to identify a sample that causes the PoN creation failure.

As for the issue where GATK generates an invalid PoN with a 0 exit code, it seemingly can happen when netlib-java is configured not to use a native library. Most machines I use (CentOS 7, Docker unavailable) don't have BLAS installed, so I initially ran GATK with the JVM flag -Dcom.github.fommil.netlib.BLAS=com.github.fommil.netlib.F2jBLAS to turn off the use of the native BLAS library (ref. https://github.com/fommil/netlib-java). I guess some exceptions are mishandled in this configuration. PoN creation fails properly if I run GATK without the JVM flag.

Thanks again for your kind explanations!
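For reference, this is roughly how the flag was passed through the gatk wrapper, using the standard --java-options mechanism (tool arguments elided):

    # Forces the pure-Java BLAS fallback; per the diagnosis above, this can mask SVD failures.
    gatk --java-options "-Dcom.github.fommil.netlib.BLAS=com.github.fommil.netlib.F2jBLAS" \
        CreateReadCountPanelOfNormals [arguments...]

    # Without the flag, netlib-java tries the native BLAS first, and failures surface properly.
    gatk CreateReadCountPanelOfNormals [arguments...]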
@dayzcool, great to hear that you were able to resolve the issue! And thanks for your detailed diagnosis of the exit code issue; I'm tagging @LouisB, who may be interested.