GenotypeGVCFs --useNewAFCalculator -G Standard -G AS_Standard crashing with java.lang.NullPointerExc

mmokrejs (Czech Republic; Member)

Hi,
when running GenomeAnalysisTK.jar -T GenotypeGVCFs -nt 16 -R hs38DH.fa --dbsnp 00-All.vcf.gz --useNewAFCalculator -G Standard -G AS_Standard -o mysamples.AS.raw.vcf --variant mysample.HaplotypeCaller.AS.g.vcf --variant ... I get a crash.

INFO  11:25:39,749 HelpFormatter - Executing as [email protected] on Linux 2.6.32-642.15.1.el6.Bull.110.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13. 
INFO  11:25:39,750 HelpFormatter - Date/Time: 2017/05/04 11:25:39 
INFO  11:25:39,750 HelpFormatter - -------------------------------------------------------------------------------------------- 
INFO  11:25:39,750 HelpFormatter - -------------------------------------------------------------------------------------------- 
INFO  11:25:40,470 GenomeAnalysisEngine - Strictness is SILENT 
INFO  11:25:42,052 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
WARN  11:25:49,958 IndexDictionaryUtils - Track dbsnp doesn't have a sequence dictionary built in, skipping dictionary validation 
INFO  11:25:49,968 MicroScheduler - Running the GATK in parallel mode with 16 total threads, 1 CPU thread(s) for each of 16 data thread(s), of 16 processors available on this machine 
INFO  11:25:51,537 GenomeAnalysisEngine - Preparing for traversal 
INFO  11:25:51,544 GenomeAnalysisEngine - Done preparing for traversal 
INFO  11:25:51,545 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
INFO  11:25:51,545 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining 
INFO  11:25:51,546 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime 
WARN  11:25:52,073 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail. 
WARN  11:25:52,074 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail. 
INFO  11:25:52,074 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files 
WARN  11:25:55,622 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs 
##### ERROR --
##### ERROR stack trace 
java.lang.NullPointerException
        at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.doGenotypeCalculations(HeterozygosityUtils.java:203)
        at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.getHetCount(HeterozygosityUtils.java:223)
        at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.calculateIC(AS_InbreedingCoeff.java:158)
        at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.makeCoeffAnnotation(AS_InbreedingCoeff.java:147)
        at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.annotate(AS_InbreedingCoeff.java:139)
        at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:230)
        at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:212)
        at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:345)
        at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:304)
        at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:135)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
        at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
        at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
        at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Code exception (see stack trace for error itself)
##### ERROR ------------------------------------------------------------------------------------------

The input *.AS.g.vcf files were also created by HaplotypeCaller with --useNewAFCalculator -G Standard -G AS_Standard; see http://gatkforums.broadinstitute.org/gatk/discussion/8317/genotypegvcfs-warning-of-info-fields-not-parsing

Answers

  • Sheila (Broad Institute; Member, Broadie admin)
  • mmokrejs (Czech Republic; Member)

    I think this was a bug that should be fixed in the latest nightly build.

    Not in nightly-2017-05-09-geafc7cc:

    GenotypeGVCFs --useNewAFCalculator -G Standard -G AS_Standard -o mysamples.AS.raw.vcf --variant mysample.g.vcf --variant ...
    
    INFO  13:23:46,354 GenomeAnalysisEngine - Deflater: IntelDeflater 
    INFO  13:23:46,354 GenomeAnalysisEngine - Inflater: IntelInflater 
    INFO  13:23:46,355 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  13:23:47,845 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
    WARN  13:23:55,699 IndexDictionaryUtils - Track dbsnp doesn't have a sequence dictionary built in, skipping dictionary validation 
    INFO  13:23:55,708 MicroScheduler - Running the GATK in parallel mode with 16 total threads, 1 CPU thread(s) for each of 16 data thread(s), of 16 processors available on this machine 
    INFO  13:23:57,195 GenomeAnalysisEngine - Preparing for traversal 
    INFO  13:23:57,202 GenomeAnalysisEngine - Done preparing for traversal 
    INFO  13:23:57,202 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
    INFO  13:23:57,203 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining 
    INFO  13:23:57,203 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime 
    WARN  13:23:57,715 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail. 
    WARN  13:23:57,716 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail. 
    INFO  13:23:57,716 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files 
    WARN  13:24:00,652 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not GenotypeGVCFs 
    INFO  13:24:27,206 ProgressMeter -   chr1:16206518    258053.0    30.0 s     116.0 s        0.5%    99.3 m      98.8 m 
    ##### ERROR --
    ##### ERROR stack trace 
    java.lang.NullPointerException
            at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.doGenotypeCalculations(HeterozygosityUtils.java:198)
            at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.getHetCount(HeterozygosityUtils.java:223)
            at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.calculateIC(AS_InbreedingCoeff.java:158)
            at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.makeCoeffAnnotation(AS_InbreedingCoeff.java:147)
            at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.annotate(AS_InbreedingCoeff.java:139)
            at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:230)
            at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:212)
            at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:345)
            at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:304)
            at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:135)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
            at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
            at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
            at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version nightly-2017-05-09-geafc7cc):
    
  • Sheila (Broad Institute; Member, Broadie admin)
    edited May 2017
  • mmokrejs (Czech Republic; Member)
    edited May 2017

    The nightly build keeps requiring me to pass in --variant $file even though I used --validateGVCF $file. Please relax this requirement.

    ##### ERROR MESSAGE: Argument with name '--variant' (-V) is missing.

    Doh, please improve the help text for --variant and change it to "Input VCF or GVCF file". Also, --validateGVCF does not take $file as an argument, as I had assumed (the help text states "Validate this file as a GVCF"). Wrong!

    I have started the validation tests, but given that I have re-created the input GVCF files many times (initially with the 3.7 release, and a few days ago even with the latest nightly snapshot build), I doubt I will find an issue. The crash at

    org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.doGenotypeCalculations(HeterozygosityUtils.java:198)

    is still happening.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    @mmokrejs Please relax your attitude.

    The --variant argument is how you provide the variant callset input, regardless of other arguments used. A GVCF is a type of VCF so the text is correct.

    The --validateGVCF argument is a flag that takes a Boolean value (true by default when used), as indicated in the tool doc.

    Two more things to try: check whether the issue still occurs when you do not use multithreading, and check whether the AS_InbreedingCoefficient annotation is present in your input GVCFs.
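
The second check suggested above can be scripted. What follows is a minimal sketch, not part of GATK: the function names `open_text` and `annotation_usage` are made up for illustration, and the annotation ID is passed in as a parameter. The stack trace names the class `AS_InbreedingCoeff` while the post above writes `AS_InbreedingCoefficient`, so it may be worth checking both spellings.

```python
import gzip
import re


def open_text(path):
    """Open a plain-text or gzip-compressed VCF/GVCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def annotation_usage(path, key):
    """Return (declared_in_header, record_count) for one INFO key.

    declared_in_header: True if a ##INFO=<ID=key,...> header line exists.
    record_count: number of records whose INFO column contains the key.
    """
    declared = False
    used = 0
    header_pat = re.compile(r"^##INFO=<ID=" + re.escape(key) + r"[,>]")
    with open_text(path) as fh:
        for line in fh:
            if line.startswith("##"):
                if header_pat.match(line):
                    declared = True
                continue
            if line.startswith("#"):
                continue  # the #CHROM column header line
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            # INFO is the 8th column; entries are KEY=VALUE or bare flags.
            info_keys = {item.split("=", 1)[0] for item in fields[7].split(";")}
            if key in info_keys:
                used += 1
    return declared, used
```

Calling `annotation_usage("mysample.HaplotypeCaller.AS.g.vcf", "AS_InbreedingCoeff")` on each input file shows whether the key is declared in the header and how many records carry it.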
  • mmokrejs (Czech Republic; Member)

    Still, the help text for --variant could specify that it is for both VCF and GVCF file types.

    I have been running single-threaded for the last few weeks to rule out possible issues with that.

    If the analysis that produced the file was restricted to a subset of genomic regions (for example using the -L or -XL arguments), the same intervals must be provided for validation. Otherwise, the validation tool will find positions that are not covered by records and will fail.

    Sadly I overlooked this text in the manual, and now I have to re-run the ValidateVariants command, because after processing all lines in a GVCF file GATK complained that I had to use -L or -XL to specify the input ranges (as I did when generating the GVCF itself). Please, could GATK check the input parameters before doing lots of CPU processing, instead of barfing on me at the end that I did not include -L $file.bed? Again, this could have been parsed from the GVCF header (recognizing that I used a BED file in previous steps). Thank you.
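
A rough sketch of the kind of upfront check requested here, written as a standalone script rather than as anything GATK does. It assumes that the GVCF header contains a `##GATKCommandLine...` line whose option string includes an `intervals=[...]` entry; that layout is an assumption and should be confirmed against the actual header. The function names are hypothetical.

```python
import re


def recorded_intervals(header_lines):
    """Collect values of intervals=[...] from ##GATKCommandLine header lines.

    Assumption: the generating command line is recorded in the header with
    the interval arguments written as intervals=[a, b, ...].
    """
    found = []
    for line in header_lines:
        if not line.startswith("##GATKCommandLine"):
            continue
        match = re.search(r"\bintervals=\[([^\]]*)\]", line)
        if match:
            found.extend(v.strip() for v in match.group(1).split(",") if v.strip())
    return found


def missing_intervals(header_lines, supplied):
    """Return recorded interval arguments that were not supplied again."""
    return [iv for iv in recorded_intervals(header_lines) if iv not in supplied]
```

Running `missing_intervals` on the header lines before launching ValidateVariants would flag a forgotten `-L` file within seconds instead of at the end of the traversal.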

    check whether the AS_InbreedingCoefficient annotation is present in your input GVCFs.

    Not a single instance of it. Header attached.

  • mmokrejs (Czech Republic; Member)

    Anyway, the validation tool does not support this format yet ("GVCF format is currently incompatible with allele validation. Not validating Alleles.").

    java -Djavaio.tmpdir=. -Xmx89g -jar /scratch/work/project/bio/GATK/GenomeAnalysisTK-nightly-2017-05-09-geafc7cc/GenomeAnalysisTK.jar -T ValidateVariants -R /scratch/work/project/bio/db/ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/hs38DH.fa --dbsnp /scratch/work/project/bio/db/ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/GATK/00-All.vcf.gz --validateGVCF --variant ./mysample-PB/realignedBAM/mysample-PB.bwa.gatk.HaplotypeCaller.AS.g.vcf -L /scratch/work/project/bio/open-8-31/S04380110__Agilent_SureSelect_Human_All_Exon_V5/hg38/S04380110_Covered.bed --interval_padding 100
    INFO  22:26:22,362 HelpFormatter - ---------------------------------------------------------------------------------------------
    INFO  22:26:22,364 HelpFormatter - The Genome Analysis Toolkit (GATK) vnightly-2017-05-09-geafc7cc, Compiled 2017/05/09 00:01:16
    INFO  22:26:22,365 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    INFO  22:26:22,365 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
    INFO  22:26:22,365 HelpFormatter - [Sun May 14 22:26:22 CEST 2017] Executing on Linux 2.6.32-642.15.1.el6.Bull.110.x86_64 amd64
    INFO  22:26:22,365 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13
    INFO  22:26:22,369 HelpFormatter - Program Args: -T ValidateVariants -R /scratch/work/project/bio/db/ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/hs38DH.fa --dbsnp /scratch/work/project/bio/db/ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/GATK/00-All.vcf.gz --validateGVCF --variant ./mysample-PB/realignedBAM/mysample-PB.bwa.gatk.HaplotypeCaller.AS.g.vcf -L /scratch/work/project/bio/open-8-31/S04380110__Agilent_SureSelect_Human_All_Exon_V5/hg38/S04380110_Covered.bed --interval_padding 100
    INFO  22:26:22,372 HelpFormatter - Executing as [email protected] on Linux 2.6.32-642.15.1.el6.Bull.110.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13.
    INFO  22:26:22,372 HelpFormatter - Date/Time: 2017/05/14 22:26:22
    INFO  22:26:22,373 HelpFormatter - ---------------------------------------------------------------------------------------------
    INFO  22:26:22,373 HelpFormatter - ---------------------------------------------------------------------------------------------
    ERROR StatusLogger Unable to create class org.apache.logging.log4j.core.impl.Log4jContextFactory specified in jar:file:/scratch/work/project/bio/GATK/GenomeAnalysisTK-nightly-2017-05-09-geafc7cc/GenomeAnalysisTK.jar!/META-INF/log4j-provider.properties
    ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
    INFO  22:26:22,516 GenomeAnalysisEngine - Deflater: IntelDeflater
    INFO  22:26:22,516 GenomeAnalysisEngine - Inflater: IntelInflater
    INFO  22:26:22,517 GenomeAnalysisEngine - Strictness is SILENT
    INFO  22:26:24,211 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO  22:26:31,575 IntervalUtils - Processing 89394100 bp from intervals
    WARN  22:26:31,610 IndexDictionaryUtils - Track dbsnp doesn't have a sequence dictionary built in, skipping dictionary validation
    INFO  22:26:31,682 GenomeAnalysisEngine - Preparing for traversal
    INFO  22:26:31,716 GenomeAnalysisEngine - Done preparing for traversal
    INFO  22:26:31,716 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  22:26:31,717 ProgressMeter -                 | processed |    time |    per 1M |           |   total | remaining
    INFO  22:26:31,717 ProgressMeter -        Location |     sites | elapsed |     sites | completed | runtime |   runtime
    WARN  22:26:31,718 ValidateVariants - GVCF format is currently incompatible with allele validation. Not validating Alleles.
    INFO  22:27:01,722 ProgressMeter -   chr1:37794805   2231759.0    30.0 s      13.0 s        2.5%    20.0 m      19.5 m
    INFO  22:27:31,725 ProgressMeter -  chr1:153668663   5279556.0    60.0 s      11.0 s        5.9%    16.9 m      15.9 m
    INFO  22:28:01,729 ProgressMeter -  chr1:223803516   8022801.0    90.0 s      11.0 s        9.0%    16.7 m      15.2 m
    INFO  22:28:31,736 ProgressMeter -   chr2:84694558   1.1063979E7   120.0 s      10.0 s       12.4%    16.2 m      14.2 m
    INFO  22:29:01,740 ProgressMeter -  chr2:209982448   1.4337158E7     2.5 m      10.0 s       16.0%    15.6 m      13.1 m
    INFO  22:29:31,743 ProgressMeter -   chr3:48421464   1.7039478E7     3.0 m      10.0 s       19.1%    15.7 m      12.7 m
    INFO  22:30:01,744 ProgressMeter -  chr3:161345963   2.0015692E7     3.5 m      10.0 s       22.4%    15.6 m      12.1 m
    INFO  22:30:31,746 ProgressMeter -  chr4:112584232   2.3292947E7     4.0 m      10.0 s       26.1%    15.4 m      11.4 m
    INFO  22:31:01,748 ProgressMeter -  chr5:134145867   2.6887949E7     4.5 m      10.0 s       30.1%    15.0 m      10.5 m
    INFO  22:31:31,749 ProgressMeter -   chr6:33444685   2.9839002E7     5.0 m      10.0 s       33.4%    15.0 m      10.0 m
    INFO  22:32:01,756 ProgressMeter -  chr6:167915585   3.3071599E7     5.5 m       9.0 s       37.0%    14.9 m       9.4 m
    INFO  22:32:31,758 ProgressMeter -  chr7:116257546   3.6124142E7     6.0 m       9.0 s       40.4%    14.8 m       8.8 m
    INFO  22:33:01,759 ProgressMeter -   chr8:88074803   3.9261605E7     6.5 m       9.0 s       43.9%    14.8 m       8.3 m
    INFO  22:33:31,761 ProgressMeter -  chr9:111383431   4.2529441E7     7.0 m       9.0 s       47.6%    14.7 m       7.7 m
    INFO  22:34:01,762 ProgressMeter -  chr10:46014561   4.5106866E7     7.5 m       9.0 s       50.5%    14.9 m       7.4 m
    INFO  22:34:31,780 ProgressMeter -   chr11:1450844   4.7995824E7     8.0 m      10.0 s       53.7%    14.9 m       6.9 m
    INFO  22:35:01,781 ProgressMeter -  chr11:67442685   5.0723954E7     8.5 m      10.0 s       56.7%    15.0 m       6.5 m
    INFO  22:35:31,783 ProgressMeter -  chr12:12721251   5.3630851E7     9.0 m      10.0 s       60.0%    15.0 m       6.0 m
    INFO  22:36:01,785 ProgressMeter - chr12:102480485   5.6517839E7     9.5 m      10.0 s       63.2%    15.0 m       5.5 m
    INFO  22:36:31,786 ProgressMeter - chr13:113161078   5.9476531E7    10.0 m      10.0 s       66.5%    15.0 m       5.0 m
    INFO  22:37:01,788 ProgressMeter - chr14:104883537   6.2414515E7    10.5 m      10.0 s       69.8%    15.0 m       4.5 m
    INFO  22:37:31,790 ProgressMeter -  chr15:85736252   6.5305575E7    11.0 m      10.0 s       73.1%    15.1 m       4.1 m
    INFO  22:38:01,792 ProgressMeter -  chr16:50292546   6.7772387E7    11.5 m      10.0 s       75.8%    15.2 m       3.7 m
    INFO  22:38:31,794 ProgressMeter -   chr17:7926081   7.0169931E7    12.0 m      10.0 s       78.5%    15.3 m       3.3 m
    INFO  22:39:01,796 ProgressMeter -  chr17:50660875   7.2686277E7    12.5 m      10.0 s       81.3%    15.4 m       2.9 m
    INFO  22:39:31,798 ProgressMeter -  chr18:48041232   7.531443E7    13.0 m      10.0 s       84.3%    15.4 m       2.4 m
    INFO  22:39:35,699 ValidateVariants - Reference allele is too long (155) at position chr18:79110319; skipping that record. Set --reference_window_stop >= 155
    INFO  22:40:01,805 ProgressMeter -  chr19:15650208   7.760993E7    13.5 m      10.0 s       86.8%    15.5 m       2.0 m
    INFO  22:40:31,807 ProgressMeter -  chr19:49442510   7.9926597E7    14.0 m      10.0 s       89.4%    15.7 m      99.0 s
    INFO  22:41:01,809 ProgressMeter -  chr20:47290319   8.2511445E7    14.5 m      10.0 s       92.3%    15.7 m      72.0 s
    INFO  22:41:31,811 ProgressMeter -  chr22:37313550   8.5186273E7    15.0 m      10.0 s       95.3%    15.7 m      44.0 s
    INFO  22:42:01,814 ProgressMeter -  chrX:129751854   8.8512125E7    15.5 m      10.0 s       99.0%    15.7 m       9.0 s
    Successfully validated the input file.  Checked 183045 records with no failures.
    INFO  22:42:08,037 ProgressMeter -            done   8.93941E7    15.6 m      10.0 s      100.0%    15.6 m       0.0 s
    INFO  22:42:08,038 ProgressMeter - Total runtime 936.32 secs, 15.61 min, 0.26 hours
    ------------------------------------------------------------------------------------------
    Done. There were 2 WARN messages, the first 2 are repeated below.
    WARN  22:26:31,610 IndexDictionaryUtils - Track dbsnp doesn't have a sequence dictionary built in, skipping dictionary validation
    WARN  22:26:31,718 ValidateVariants - GVCF format is currently incompatible with allele validation. Not validating Alleles.
    ------------------------------------------------------------------------------------------
    
    
  • mmokrejs (Czech Republic; Member)
    edited May 2017

    So in the end, ValidateVariants found no errors in the normal g.vcf files or in my AS.g.vcf files, although in the latter case it probably did not check the contents properly.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    Ok, well this at least tells us there's nothing obviously wrong with your GVCF files, which is part of our basic troubleshooting procedure.

    (We'll see what we can do re: checking for intervals upfront)

    Next step is we'll check with the dev team whether there's a requirement that's somehow missing. We don't yet have much experience with the AS functionality on the support side as it's not a standard part of the workflows.
  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Alright, there's nothing obvious that's coming up so it's probably a bug. Would you be able to share a snippet of your data that reproduces the error? Instructions are here: https://software.broadinstitute.org/gatk/documentation/article?id=1894
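
One possible way to cut such a snippet from a plain-text GVCF is sketched below; the linked instructions remain the authoritative procedure. The function name `extract_snippet` is hypothetical, and the script assumes an uncompressed file in which reference blocks carry an `END=` entry in the INFO column.

```python
import re

# GVCF reference blocks record their last position as END=<pos> in INFO.
END_RE = re.compile(r"(?:^|;)END=(\d+)")


def extract_snippet(src_path, dst_path, contig, start, end):
    """Copy all header lines plus records on `contig` overlapping [start, end].

    Returns the number of records written.
    """
    kept = 0
    with open(src_path, "rt") as src, open(dst_path, "wt") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep the full header so the snippet is valid
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8 or fields[0] != contig:
                continue
            pos = int(fields[1])
            match = END_RE.search(fields[7])
            # Without END=, the record spans the length of its REF allele.
            rec_end = int(match.group(1)) if match else pos + len(fields[3]) - 1
            if pos <= end and rec_end >= start:
                dst.write(line)
                kept += 1
    return kept
```

For example, `extract_snippet("mysample.HaplotypeCaller.AS.g.vcf", "snippet.g.vcf", "chr1", 16100000, 16300000)` would keep a window around chr1:16206518, the last location reported before the crash. The window size here is arbitrary, and the snippet should be re-run through GenotypeGVCFs to confirm that it still reproduces the error before sharing it.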
