
"GetPileupSummaries" waits indefinitely.

KKND Member
edited November 2019 in Ask the GATK team

`
(base) [email protected]:data# gatk GetPileupSummaries -I normal_recal.bam -O normal_pileups.table -V af-only-gnomad.hg38_reduced.vcf.gz -L af-only-gnomad.hg38_reduced.vcf.gz

Using GATK jar /data/prosium/01.program/07.GATK/4-1-2-0/gatk-package-4.1.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data/prosium/01.program/07.GATK/4-1-2-0/gatk-package-4.1.2.0-local.jar GetPileupSummaries --tmp-dir /data/prosium/02.BENCHMARK/ -I /data/prosium/02.BENCHMARK/01.MGI_ILLUMINA/01.MGI/02.result/03.BQSR/NCC_151N_recal.bam -O NCC_151N_pileups.table -V /data/prosium/00.REFERECE/50.Known_indels/af-only-gnomad.hg38_reduced.vcf.gz -L /data/prosium/00.REFERECE/50.Known_indels/af-only-gnomad.hg38_reduced.vcf.gz
13:49:17.454 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data/prosium/01.program/07.GATK/4-1-2-0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Nov 15, 2019 1:49:19 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:49:19.227 INFO GetPileupSummaries - ------------------------------------------------------------
13:49:19.228 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.1.2.0
13:49:19.228 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
13:49:19.228 INFO GetPileupSummaries - Executing as [email protected] on Linux v4.10.0-28-generic amd64
13:49:19.229 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_222-8u222-b10-1ubuntu1~16.04.1-b10
13:49:19.229 INFO GetPileupSummaries - Start Date/Time: November 15, 2019 1:49:17 PM KST
13:49:19.229 INFO GetPileupSummaries - ------------------------------------------------------------
13:49:19.229 INFO GetPileupSummaries - ------------------------------------------------------------
13:49:19.230 INFO GetPileupSummaries - HTSJDK Version: 2.19.0
13:49:19.230 INFO GetPileupSummaries - Picard Version: 2.19.0
13:49:19.230 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:49:19.230 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:49:19.230 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:49:19.231 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:49:19.231 INFO GetPileupSummaries - Deflater: IntelDeflater
13:49:19.231 INFO GetPileupSummaries - Inflater: IntelInflater
13:49:19.231 INFO GetPileupSummaries - GCS max retries/reopens: 20
13:49:19.231 INFO GetPileupSummaries - Requester pays: disabled
13:49:19.231 WARN GetPileupSummaries -

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Warning: GetPileupSummaries is a BETA tool and is not yet ready for use in production

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

13:49:19.232 INFO GetPileupSummaries - Initializing engine
13:49:19.685 INFO FeatureManager - Using codec VCFCodec to read file file:///data/prosium/00.REFERECE/50.Known_indels/af-only-gnomad.hg38_reduced.vcf.gz
13:49:19.857 INFO FeatureManager - Using codec VCFCodec to read file file:///data/prosium/00.REFERECE/50.Known_indels/af-only-gnomad.hg38_reduced.vcf.gz
14:31:24.679 INFO IntervalArgumentCollection - Processing 326449810 bp from intervals
14:36:47.705 INFO GetPileupSummaries - Done initializing engine
14:36:47.706 INFO ProgressMeter - Starting traversal
14:36:47.707 INFO ProgressMeter - Current Locus Elapsed Minutes Loci Processed Loci/Minute

`

After waiting 4 hours at this stage, the run does not seem to go any further.
Is there a problem with the input file?
No other error message has appeared so far.

"Reduced VCF" means that all contigs other than chr1-22, X, Y, and M were excluded.


Answers

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin
    edited November 2019

    Hi @KKND

    1. What are the specs of the machine you are using?
    2. Try to specify memory allocation and see if that resolves the issue. Take a look at this doc: https://software.broadinstitute.org/gatk/documentation/article?id=11050
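A minimal sketch of point 2, assuming the heap limit is passed through the `gatk` wrapper's `--java-options` argument. The `16g` value and the `pileup_cmd` helper are illustrative assumptions, not settings recommended in this thread. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Heap size is an assumed placeholder, not a recommendation from the thread.
JAVA_HEAP="16g"

# Print a GetPileupSummaries command with an explicit JVM heap limit.
# Arguments: <input BAM> <variants VCF, also used for -L> <output table>
pileup_cmd() {
    printf 'gatk --java-options "-Xmx%s" GetPileupSummaries -I %s -V %s -L %s -O %s\n' \
        "$JAVA_HEAP" "$1" "$2" "$2" "$3"
}

# File names are the ones from the question.
pileup_cmd normal_recal.bam af-only-gnomad.hg38_reduced.vcf.gz normal_pileups.table
```

The printed line can be pasted into the shell once the heap value suits the machine.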
  • KKND Member

    Below are my server specs:
    2 × Intel Xeon (80 cores)
    755 GB RAM
    800 TB HDD

    GetPileupSummaries is using 20 GB of RAM and 4000% CPU.

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Hi @KKND

    The gnomAD interval list is very large, which is why it is taking this long. There is no log output because the tool is stuck parsing the intervals.
    Here are some things you can try:
    1. First run without intervals as a sanity check. If this works, then you know the large interval list was the reason it was stuck.
    2. If the interval list was in fact the problem, the solution is to give it a different interval list. Another way to speed up the process is to split the run by chromosome across different machines.
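The per-chromosome split in point 2 could be sketched as follows. It assumes the hg38 `chr`-prefixed contig names implied by the reduced VCF (chr1-22, X, Y, M) and passes a single contig to `-L` in place of the VCF. The output file naming and the `per_contig_cmds` helper are hypothetical. The script only prints the commands, so each one can be dispatched to a different machine or scheduler job. Whether this is actually faster than the original run is not confirmed in this thread.

```shell
#!/bin/sh
# File names are copied from the question; the contig list is an assumption.
BAM="normal_recal.bam"
VCF="af-only-gnomad.hg38_reduced.vcf.gz"

# Print one GetPileupSummaries command per contig, restricting -L to that contig.
per_contig_cmds() {
    for n in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y M; do
        printf 'gatk GetPileupSummaries -I %s -V %s -L chr%s -O normal_pileups.chr%s.table\n' \
            "$BAM" "$VCF" "$n" "$n"
    done
}

per_contig_cmds
```

Each run writes its own table, so the per-contig outputs would still need to be combined before any downstream step that expects a single pileup summary.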
