Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

Problem with CrosscheckFingerprints when using NIO

ruslanafrazerruslanafrazer Member, Broadie
edited May 2018 in Ask the GATK team

I'm trying to run the CrosscheckFingerprints tool in the gatk-4.0.4.0 package without copying my BAM files into the VM, but trying to access the file directly on gcs.
The task starts running and I even get this initial output (on stderr):

/usr/local/jre1.8.0_73/bin/java -Xmx14g -jar /gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar CrosscheckFingerprints \
-I gs://fc-66b68450-9d98-490f-934f-a9d824aac4be/REBC-AC8T-TTP1-A-1-1-D-A553-36/REBC-AC8T-TTP1-A-1-1-D-A553-36.bam \
-I gs://fc-66b68450-9d98-490f-934f-a9d824aac4be/REBC-AC8T-NB1-A-1-0-D-A525-36/REBC-AC8T-NB1-A-1-0-D-A525-36.bam \
-H /cromwell_root/firecloud-tcga-open-access/tutorial/reference/Homo_sapiens_assembly19.haplotype_database.txt \
--QUIET false --EXIT_CODE_WHEN_MISMATCH 0 \
--OUTPUT crosscheck.stats.txt \
--VALIDATION_STRINGENCY LENIENT
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/fc-4aacc2d6-4017-4fc2-b95a-fb892a3562b9/70dc8591-32c2-4de1-b2f8-2af887f2a8e3/Clinical_Workflow/6794a873-5bbb-46ef-ae2c-a10e77dab94e/call-CrossCheckLaneFingerprints_Task/attempt-2/tmp.197aeb50
17:12:16.387 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Thu May 17 17:12:16 UTC 2018] CrosscheckFingerprints  --INPUT gs://fc-66b68450-9d98-490f-934f-a9d824aac4be/REBC-AC8T-TTP1-A-1-1-D-A553-36/REBC-AC8T-TTP1-A-1-1-D-A553-36.bam --INPUT gs://fc-66b68450-9d98-490f-934f-a9d824aac4be/REBC-AC8T-NB1-A-1-0-D-A525-36/REBC-AC8T-NB1-A-1-0-D-A525-36.bam --OUTPUT crosscheck.stats.txt --HAPLOTYPE_MAP /cromwell_root/firecloud-tcga-open-access/tutorial/reference/Homo_sapiens_assembly19.haplotype_database.txt --EXIT_CODE_WHEN_MISMATCH 0 --QUIET false --VALIDATION_STRINGENCY LENIENT  --CROSSCHECK_MODE CHECK_SAME_SAMPLE --LOD_THRESHOLD 0.0 --CROSSCHECK_BY READGROUP --NUM_THREADS 1 --CALCULATE_TUMOR_AWARE_RESULTS true --ALLOW_DUPLICATE_READS false --GENOTYPING_ERROR_RATE 0.01 --OUTPUT_ERRORS_ONLY false --LOSS_OF_HET_RATE 0.5 --EXPECT_ALL_GROUPS_TO_MATCH false --VERBOSITY INFO --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
[Thu May 17 17:12:16 UTC 2018] Executing as [email protected] on Linux 4.9.0-0.bpo.6-amd64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.0.4.0
INFO    2018-05-17 17:12:18 CrosscheckFingerprints  Fingerprinting 2 INPUT files.

But then the task just gets stuck without printing out anything else.
Yesterday I ran the task for 20 hours and had to abort it because it never finished. If I copy the file into the VM the whole process takes a couple of hours, including the time it takes to copy the files.

What could be the reason for this?

Thanks!
Ruslana

Post edited by shlee on

Issue · Github
by Sheila

Issue Number
3096
State
closed
Last Updated
Assignee
Array
Closed By
sooheelee

Answers

Sign In or Register to comment.