Sample Naming Scheme for PostprocessGermlineCNVCalls

GATK Version - 4.1.0.0 on GATK4 Docker

I am running the PostprocessGermlineCNVCalls utility using 17 interval shards (~25k intervals each) and 8 samples.

I keep getting a key error for the sample_name. Is it an issue with the sample-index I'm using? Is my naming scheme for the sample incorrect?

Here is my log:

```
(gatk) [email protected]:/gatk# gatk PostprocessGermlineCNVCalls --calls-shard-path cnv_caller_out_rerun/201to209.shard1-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard2-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard3-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard4-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard5-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard6-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard7-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard8-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard9-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard10-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard11-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard12-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard13-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard14-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard15-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard16-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard17-calls --model-shard-path cnv_caller_out_rerun/201to209.shard1-model --model-shard-path cnv_caller_out_rerun/201to209.shard2-model/ --model-shard-path cnv_caller_out_rerun/201to209.shard3-model --model-shard-path cnv_caller_out_rerun/201to209.shard4-model --model-shard-path cnv_caller_out_rerun/201to209.shard5-model --model-shard-path cnv_caller_out_rerun/201to209.shard6-model --model-shard-path cnv_caller_out_rerun/201to209.shard7-model --model-shard-path cnv_caller_out_rerun/201to209.shard8-model --model-shard-path cnv_caller_out_rerun/201to209.shard9-model --model-shard-path cnv_caller_out_rerun/201to209.shard10-model --model-shard-path cnv_caller_out_rerun/201to209.shard11-model --model-shard-path cnv_caller_out_rerun/201to209.shard12-model --model-shard-path cnv_caller_out_rerun/201to209.shard13-model --model-shard-path 
cnv_caller_out_rerun/201to209.shard14-model --model-shard-path cnv_caller_out_rerun/201to209.shard15-model --model-shard-path cnv_caller_out_rerun/201to209.shard16-model --model-shard-path cnv_caller_out_rerun/201to209.shard17-model --sample-index 0 --autosomal-ref-copy-number 2 --allosomal-contig X --allosomal-contig Y --output-genotyped-intervals postprocessed_rerun/201_genotyped_intervals.vcf --output-genotyped-segments postprocessed_rerun/201_genotyped_segments.vcf --contig-ploidy-calls contig_ploidy_out_rerun/201to209-calls/SAMPLE_0
Using GATK jar /gatk/gatk-package-4.1.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.0.0-local.jar PostprocessGermlineCNVCalls --calls-shard-path cnv_caller_out_rerun/201to209.shard1-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard2-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard3-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard4-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard5-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard6-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard7-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard8-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard9-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard10-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard11-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard12-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard13-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard14-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard15-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard16-calls --calls-shard-path cnv_caller_out_rerun/201to209.shard17-calls --model-shard-path cnv_caller_out_rerun/201to209.shard1-model --model-shard-path cnv_caller_out_rerun/201to209.shard2-model/ --model-shard-path cnv_caller_out_rerun/201to209.shard3-model --model-shard-path cnv_caller_out_rerun/201to209.shard4-model --model-shard-path cnv_caller_out_rerun/201to209.shard5-model --model-shard-path cnv_caller_out_rerun/201to209.shard6-model --model-shard-path cnv_caller_out_rerun/201to209.shard7-model --model-shard-path cnv_caller_out_rerun/201to209.shard8-model --model-shard-path cnv_caller_out_rerun/201to209.shard9-model --model-shard-path cnv_caller_out_rerun/201to209.shard10-model --model-shard-path cnv_caller_out_rerun/201to209.shard11-model --model-shard-path 
cnv_caller_out_rerun/201to209.shard12-model --model-shard-path cnv_caller_out_rerun/201to209.shard13-model --model-shard-path cnv_caller_out_rerun/201to209.shard14-model --model-shard-path cnv_caller_out_rerun/201to209.shard15-model --model-shard-path cnv_caller_out_rerun/201to209.shard16-model --model-shard-path cnv_caller_out_rerun/201to209.shard17-model --sample-index 0 --autosomal-ref-copy-number 2 --allosomal-contig X --allosomal-contig Y --output-genotyped-intervals postprocessed_rerun/201_genotyped_intervals.vcf --output-genotyped-segments postprocessed_rerun/201_genotyped_segments.vcf --contig-ploidy-calls contig_ploidy_out_rerun/201to209-calls/SAMPLE_0
15:24:55.011 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jun 10, 2019 3:24:56 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
WARNING: Failed to detect whether we are running on Google Compute Engine.
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
at shaded.cloud_nio.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
at shaded.cloud_nio.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials.runningOnComputeEngine(ComputeEngineCredentials.java:210)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.tryGetComputeCredentials(DefaultCredentialsProvider.java:290)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentialsUnsynchronized(DefaultCredentialsProvider.java:207)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentials(DefaultCredentialsProvider.java:124)
at shaded.cloud_nio.com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:127)
at shaded.cloud_nio.com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:100)
at com.google.cloud.ServiceOptions.defaultCredentials(ServiceOptions.java:304)
at com.google.cloud.ServiceOptions.<init>(ServiceOptions.java:278)
at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:83)
at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:31)
at com.google.cloud.storage.StorageOptions$Builder.build(StorageOptions.java:78)
at org.broadinstitute.hellbender.utils.gcs.BucketUtils.setGlobalNIODefaultOptions(BucketUtils.java:353)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:182)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)

15:24:56.335 INFO PostprocessGermlineCNVCalls - ------------------------------------------------------------
15:24:56.337 INFO PostprocessGermlineCNVCalls - The Genome Analysis Toolkit (GATK) v4.1.0.0
15:24:56.339 INFO PostprocessGermlineCNVCalls - Executing as [email protected] on Linux v4.9.125-linuxkit amd64
15:24:56.339 INFO PostprocessGermlineCNVCalls - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
15:24:56.340 INFO PostprocessGermlineCNVCalls - Start Date/Time: June 10, 2019 3:24:54 PM UTC
15:24:56.340 INFO PostprocessGermlineCNVCalls - ------------------------------------------------------------
15:24:56.340 INFO PostprocessGermlineCNVCalls - ------------------------------------------------------------
15:24:56.341 INFO PostprocessGermlineCNVCalls - HTSJDK Version: 2.18.2
15:24:56.341 INFO PostprocessGermlineCNVCalls - Picard Version: 2.18.25
15:24:56.341 INFO PostprocessGermlineCNVCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:24:56.342 INFO PostprocessGermlineCNVCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:24:56.342 INFO PostprocessGermlineCNVCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:24:56.342 INFO PostprocessGermlineCNVCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:24:56.342 INFO PostprocessGermlineCNVCalls - Deflater: IntelDeflater
15:24:56.343 INFO PostprocessGermlineCNVCalls - Inflater: IntelInflater
15:24:56.343 INFO PostprocessGermlineCNVCalls - GCS max retries/reopens: 20
15:24:56.343 INFO PostprocessGermlineCNVCalls - Requester pays: disabled
15:24:56.343 INFO PostprocessGermlineCNVCalls - Initializing engine
15:24:59.329 INFO PostprocessGermlineCNVCalls - Done initializing engine
15:25:00.758 INFO ProgressMeter - Starting traversal
15:25:00.758 INFO ProgressMeter - Current Locus Elapsed Minutes Records Processed Records/Minute
15:25:00.758 INFO PostprocessGermlineCNVCalls - Generating intervals VCF file...
15:25:00.759 WARN PostprocessGermlineCNVCalls - An variant index will not be created - a sequence dictionary is required to create an output index
15:25:00.808 INFO PostprocessGermlineCNVCalls - Analyzing shard 0...
15:25:01.863 INFO PostprocessGermlineCNVCalls - Analyzing shard 1...
15:25:02.494 INFO PostprocessGermlineCNVCalls - Analyzing shard 2...
15:25:02.938 INFO PostprocessGermlineCNVCalls - Analyzing shard 3...
15:25:03.323 INFO PostprocessGermlineCNVCalls - Analyzing shard 4...
15:25:03.687 INFO PostprocessGermlineCNVCalls - Analyzing shard 5...
15:25:04.035 INFO PostprocessGermlineCNVCalls - Analyzing shard 6...
15:25:04.392 INFO PostprocessGermlineCNVCalls - Analyzing shard 7...
15:25:04.756 INFO PostprocessGermlineCNVCalls - Analyzing shard 8...
15:25:05.103 INFO PostprocessGermlineCNVCalls - Analyzing shard 9...
15:25:05.463 INFO PostprocessGermlineCNVCalls - Analyzing shard 10...
15:25:05.839 INFO PostprocessGermlineCNVCalls - Analyzing shard 11...
15:25:06.190 INFO PostprocessGermlineCNVCalls - Analyzing shard 12...
15:25:06.541 INFO PostprocessGermlineCNVCalls - Analyzing shard 13...
15:25:06.886 INFO PostprocessGermlineCNVCalls - Analyzing shard 14...
15:25:07.238 INFO PostprocessGermlineCNVCalls - Analyzing shard 15...
15:25:07.582 INFO PostprocessGermlineCNVCalls - Analyzing shard 16...
15:25:07.972 INFO PostprocessGermlineCNVCalls - Generating segments VCF file...
15:25:25.599 INFO PostprocessGermlineCNVCalls - Shutting down engine
[June 10, 2019 3:25:25 PM UTC] org.broadinstitute.hellbender.tools.copynumber.PostprocessGermlineCNVCalls done. Elapsed time: 0.51 minutes.
Runtime.totalMemory()=262144000
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 1
Command Line: python /tmp/segment_gcnv_calls.7436399952764784144.py --ploidy_calls_path /gatk/contig_ploidy_out_rerun/201to209-calls/SAMPLE_0 --model_shards /gatk/cnv_caller_out_rerun/201to209.shard1-model /gatk/cnv_caller_out_rerun/201to209.shard2-model /gatk/cnv_caller_out_rerun/201to209.shard3-model /gatk/cnv_caller_out_rerun/201to209.shard4-model /gatk/cnv_caller_out_rerun/201to209.shard5-model /gatk/cnv_caller_out_rerun/201to209.shard6-model /gatk/cnv_caller_out_rerun/201to209.shard7-model /gatk/cnv_caller_out_rerun/201to209.shard8-model /gatk/cnv_caller_out_rerun/201to209.shard9-model /gatk/cnv_caller_out_rerun/201to209.shard10-model /gatk/cnv_caller_out_rerun/201to209.shard11-model /gatk/cnv_caller_out_rerun/201to209.shard12-model /gatk/cnv_caller_out_rerun/201to209.shard13-model /gatk/cnv_caller_out_rerun/201to209.shard14-model /gatk/cnv_caller_out_rerun/201to209.shard15-model /gatk/cnv_caller_out_rerun/201to209.shard16-model /gatk/cnv_caller_out_rerun/201to209.shard17-model --calls_shards /gatk/cnv_caller_out_rerun/201to209.shard1-calls /gatk/cnv_caller_out_rerun/201to209.shard2-calls /gatk/cnv_caller_out_rerun/201to209.shard3-calls /gatk/cnv_caller_out_rerun/201to209.shard4-calls /gatk/cnv_caller_out_rerun/201to209.shard5-calls /gatk/cnv_caller_out_rerun/201to209.shard6-calls /gatk/cnv_caller_out_rerun/201to209.shard7-calls /gatk/cnv_caller_out_rerun/201to209.shard8-calls /gatk/cnv_caller_out_rerun/201to209.shard9-calls /gatk/cnv_caller_out_rerun/201to209.shard10-calls /gatk/cnv_caller_out_rerun/201to209.shard11-calls /gatk/cnv_caller_out_rerun/201to209.shard12-calls /gatk/cnv_caller_out_rerun/201to209.shard13-calls /gatk/cnv_caller_out_rerun/201to209.shard14-calls /gatk/cnv_caller_out_rerun/201to209.shard15-calls /gatk/cnv_caller_out_rerun/201to209.shard16-calls /gatk/cnv_caller_out_rerun/201to209.shard17-calls --output_path /tmp/gcnv-segmented-calls3855848897768528826 --sample_index 0
Stdout: 15:25:10.508 INFO segment_gcnv_calls - Loading ploidy calls...
15:25:10.509 INFO gcnvkernel.io.io_metadata - Loading germline contig ploidy and global read depth metadata...
15:25:10.509 INFO segment_gcnv_calls - Instantiating the Viterbi segmentation engine...
15:25:10.524 INFO gcnvkernel.postprocess.viterbi_segmentation - Assembling interval list and copy-number class posterior from model shards...
15:25:15.496 INFO gcnvkernel.structs.metadata - Generating intervals metadata...
15:25:17.745 INFO gcnvkernel.postprocess.viterbi_segmentation - Compiling theano forward-backward function...
15:25:20.848 INFO gcnvkernel.postprocess.viterbi_segmentation - Compiling theano Viterbi function...
15:25:23.831 INFO gcnvkernel.postprocess.viterbi_segmentation - Compiling theano variational HHMM...
15:25:24.222 INFO gcnvkernel.postprocess.viterbi_segmentation - Processing sample index: 0, sample name: 201-Exp29_S97...
15:25:24.828 INFO gcnvkernel.postprocess.viterbi_segmentation - Segmenting contig (1/23) (contig name: 1)...

Stderr: Traceback (most recent call last):
File "/tmp/segment_gcnv_calls.7436399952764784144.py", line 73, in <module>
viterbi_engine.write_copy_number_segments()
File "/opt/miniconda/envs/gatk/lib/python3.6/site-packages/gcnvkernel/postprocess/viterbi_segmentation.py", line 234, in write_copy_number_segments
for segment in self._viterbi_segments_generator():
File "/opt/miniconda/envs/gatk/lib/python3.6/site-packages/gcnvkernel/postprocess/viterbi_segmentation.py", line 132, in _viterbi_segments_generator
.get_sample_ploidy_metadata(sample_name)\
File "/opt/miniconda/envs/gatk/lib/python3.6/site-packages/gcnvkernel/structs/metadata.py", line 278, in get_sample_ploidy_metadata
return self.sample_ploidy_metadata_dict[sample_name]
KeyError: '201-Exp29_S97'

at org.broadinstitute.hellbender.utils.python.PythonExecutorBase.getScriptException(PythonExecutorBase.java:75)
at org.broadinstitute.hellbender.utils.runtime.ScriptExecutor.executeCuratedArgs(ScriptExecutor.java:126)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeArgs(PythonScriptExecutor.java:170)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeScript(PythonScriptExecutor.java:151)
at org.broadinstitute.hellbender.utils.python.PythonScriptExecutor.executeScript(PythonScriptExecutor.java:121)
at org.broadinstitute.hellbender.tools.copynumber.PostprocessGermlineCNVCalls.executeSegmentGermlineCNVCallsPythonScript(PostprocessGermlineCNVCalls.java:499)
at org.broadinstitute.hellbender.tools.copynumber.PostprocessGermlineCNVCalls.generateSegmentsVCFFileFromAllShards(PostprocessGermlineCNVCalls.java:435)
at org.broadinstitute.hellbender.tools.copynumber.PostprocessGermlineCNVCalls.traverse(PostprocessGermlineCNVCalls.java:296)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
```

Answers

  • slee Member, Broadie, Dev ✭✭✭
    Accepted Answer

    Hi @ngerald,

    I think if you specify --contig-ploidy-calls contig_ploidy_out_rerun/201to209-calls/ rather than --contig-ploidy-calls contig_ploidy_out_rerun/201to209-calls/SAMPLE_0, you should be in business.

    I realize this can be a little confusing if you are following the WDL, in which we perform some manipulations of the sample-by-shards matrix of directories. These manipulations avoid unnecessary file transfers on the cloud but aren't needed when running locally. Both the tool documentation and the gCNV tutorial should give examples of correct usage that work on the cloud and locally.

    Thanks,
    Samuel
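
For anyone landing here with the same KeyError, here is a minimal sketch of how the corrected command could be assembled. All flags and paths are copied from the command in the question; the only change is that --contig-ploidy-calls points at the calls directory instead of the SAMPLE_0 subdirectory. The variable names (prefix, n_shards) are illustrative, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: flags and paths are taken from the question above.
# The single change is the value passed to --contig-ploidy-calls.
prefix="cnv_caller_out_rerun/201to209"   # shard directory prefix (from the question)
n_shards=17                              # number of interval shards (from the question)

set --                                   # start from an empty argument list

# One --calls-shard-path per shard, in shard order.
i=1
while [ "$i" -le "$n_shards" ]; do
  set -- "$@" --calls-shard-path "${prefix}.shard${i}-calls"
  i=$((i + 1))
done

# One --model-shard-path per shard, in the same order.
i=1
while [ "$i" -le "$n_shards" ]; do
  set -- "$@" --model-shard-path "${prefix}.shard${i}-model"
  i=$((i + 1))
done

# Remaining arguments; note the ploidy path has no SAMPLE_0 suffix.
set -- "$@" \
  --sample-index 0 \
  --autosomal-ref-copy-number 2 \
  --allosomal-contig X \
  --allosomal-contig Y \
  --output-genotyped-intervals postprocessed_rerun/201_genotyped_intervals.vcf \
  --output-genotyped-segments postprocessed_rerun/201_genotyped_segments.vcf \
  --contig-ploidy-calls contig_ploidy_out_rerun/201to209-calls/

# Print the assembled command for inspection; remove `echo` to run it.
echo gatk PostprocessGermlineCNVCalls "$@"
```

To process the other samples, the same sketch would presumably be repeated with a different --sample-index and different output file names, leaving --contig-ploidy-calls unchanged.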

  • ngerald Member
    Worked like a charm!
    Thanks!