Best strategy to "fix" the HaplotypeCaller / GenotypeGVCFs "missing DP field" bug?

Hi,

I've run into the already-reported bug ( http://gatkforums.broadinstitute.org/dsde/discussion/5598/missing-depth-dp-after-haplotypecaller ) in which the DP FORMAT field is missing from my calls.

I've run the following (relevant) commands:

HaplotypeCaller, to generate the GVCFs:

    java -Xmx${xmx} ${gct} -Djava.io.tmpdir=${NEWTMPDIR} -jar ${gatkpath}/GenomeAnalysisTK.jar \
       -T HaplotypeCaller \
       -R ${ref} \
       -I ${NEWTMPDIR}/${prefix}.realigned.fixed.recal.bam \
       -L ${reg} \
       -ERC GVCF \
       -nct ${nct} \
       --genotyping_mode DISCOVERY \
       -stand_emit_conf 10 \
       -stand_call_conf 30  \
       -o ${prefix}.raw_variants.annotated.g.vcf \
       -A QualByDepth -A RMSMappingQuality -A MappingQualityRankSumTest -A ReadPosRankSumTest -A FisherStrand -A StrandOddsRatio -A Coverage

This generates GVCF files that DO HAVE the DP FORMAT field at all reference positions, but DO NOT HAVE it for any called variant (although DP is still present in the INFO field):

18      11255   .       T       <NON_REF>       .       .       END=11256       GT:DP:GQ:MIN_DP:PL      0/0:18:48:18:0,48,720
18      11257   .       C       G,<NON_REF>     229.77  .       BaseQRankSum=1.999;DP=20;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.00;MQRankSum=-1.377;ReadPosRankSum=0.489      GT:AD:GQ:PL:SB  0/1:10,8,0:99:258,0,308,288
18      11258   .       G       <NON_REF>       .       .       END=11260       GT:DP:GQ:MIN_DP:PL      0/0:17:48:16:0,48,530
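To confirm which records are affected before deciding on a fix, a small check along these lines could be run over each GVCF. This is only a sketch: the function names are made up for illustration, it assumes tab-separated single-sample records (the listing above shows them with spaces), and it reads plain text rather than going through a VCF library.

```python
def parse_info(info):
    """Turn 'A=1;B=2;FLAG' into a dict; flags map to True."""
    out = {}
    for item in info.split(";"):
        if item in ("", "."):
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def describe_dp(record):
    """Report where DP is available for one single-sample (G)VCF record."""
    fields = record.rstrip("\n").split("\t")
    info = parse_info(fields[7])
    # zip() tolerates a sample column with fewer values than FORMAT keys.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    return {
        "pos": int(fields[1]),
        "format_dp": sample.get("DP"),  # None when FORMAT has no DP key
        "info_dp": info.get("DP"),      # None when INFO has no DP key
    }


# Two of the records quoted above, re-joined with tabs.
records = [
    "18\t11255\t.\tT\t<NON_REF>\t.\t.\tEND=11256\t"
    "GT:DP:GQ:MIN_DP:PL\t0/0:18:48:18:0,48,720",
    "18\t11257\t.\tC\tG,<NON_REF>\t229.77\t.\t"
    "BaseQRankSum=1.999;DP=20;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.00;"
    "MQRankSum=-1.377;ReadPosRankSum=0.489\t"
    "GT:AD:GQ:PL:SB\t0/1:10,8,0:99:258,0,308,288",
]
results = [describe_dp(r) for r in records]
for r in results:
    print(r)
```

On these two records the reference block reports DP only in FORMAT, and the variant record reports it only in INFO, which matches the description above.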

Later, I ran GenotypeGVCFs to jointly genotype all the samples, with the following command:

    java -Xmx${xmx} ${gct} -Djava.io.tmpdir=${NEWTMPDIR} -jar ${gatkpath}/GenomeAnalysisTK.jar \
       -T GenotypeGVCFs \
       -R ${ref} \
       -L ${pos} \
       -o ${prefix}.raw_variants.annotated.vcf \
       --variant ${variant} [...]

This generated VCF files where the DP field is listed in the FORMAT column and IS present for homozygous REF samples, but IS MISSING for every heterozygous or homozygous ALT sample.

22  17280388    .   T   C   18459.8 PASS    AC=34;AF=0.340;AN=100;BaseQRankSum=-2.179e+00;DP=1593;FS=2.526;InbreedingCoeff=0.0196;MLEAC=34;MLEAF=0.340;MQ=60.00;MQRankSum=0.196;QD=19.76;ReadPosRankSum=-9.400e-02;SOR=0.523    GT:AD:DP:GQ:PL  0/0:29,0:29:81:0,81,1118    0/1:20,22:.:99:688,0,682    1/1:0,27:.:81:1018,81,0 0/0:22,0:22:60:0,60,869 0/1:20,10:.:99:286,0,664    0/1:11,17:.:99:532,0,330    0/1:14,14:.:99:431,0,458    0/0:28,0:28:81:0,81,1092    0/0:35,0:35:99:0,99,1326    0/1:14,20:.:99:631,0,453    0/1:13,16:.:99:511,0,423    0/1:38,29:.:99:845,0,1231   0/1:20,10:.:99:282,0,671    0/0:22,0:22:63:0,63,837 0/1:8,15:.:99:497,0,248 0/0:32,0:32:90:0,90,1350    0/1:12,12:.:99:378,0,391    0/1:14,26:.:99:865,0,433    0/0:37,0:37:99:0,105,1406   0/0:44,0:44:99:0,120,1800   0/0:24,0:24:72:0,72,877 0/0:30,0:30:84:0,84,1250    0/0:31,0:31:90:0,90,1350    0/1:15,25:.:99:827,0,462    0/0:35,0:35:99:0,99,1445    0/0:29,0:29:72:0,72,1089    1/1:0,32:.:96:1164,96,0 0/0:21,0:21:63:0,63,809 0/1:21,15:.:99:450,0,718    1/1:0,40:.:99:1539,120,0    0/0:20,0:20:60:0,60,765 0/1:11,9:.:99:293,0,381 1/1:0,35:.:99:1306,105,0    0/1:18,14:.:99:428,0,606    0/0:32,0:32:90:0,90,1158    0/1:24,22:.:99:652,0,816    0/0:20,0:20:60:0,60,740 1/1:0,30:.:90:1120,90,0 0/1:15,13:.:99:415,0,501    0/0:31,0:31:90:0,90,1350    0/1:15,18:.:99:570,0,480    0/1:22,13:.:99:384,0,742    0/1:19,11:.:99:318,0,632    0/0:28,0:28:75:0,75,1125    0/0:20,0:20:60:0,60,785 1/1:0,27:.:81:1030,81,0 0/0:30,0:30:90:0,90,1108    0/1:16,16:.:99:479,0,493    0/1:14,22:.:99:745,0,439    0/0:31,0:31:90:0,90,1252
22  17280822    .   G   A   5491.56 PASS    AC=8;AF=0.080;AN=100;BaseQRankSum=1.21;DP=1651;FS=0.000;InbreedingCoeff=-0.0870;MLEAC=8;MLEAF=0.080;MQ=60.00;MQRankSum=0.453;QD=17.89;ReadPosRankSum=-1.380e-01;SOR=0.695   GT:AD:DP:GQ:PL  0/0:27,0:27:72:0,72,1080    0/0:34,0:34:90:0,90,1350    0/1:15,16:.:99:528,0,491    0/0:27,0:27:60:0,60,900 0/1:15,22:.:99:699,0,453    0/0:32,0:32:90:0,90,1350    0/0:37,0:37:99:0,99,1485    0/0:31,0:31:87:0,87,1305    0/0:40,0:40:99:0,108,1620   0/1:20,9:.:99:258,0,652 0/0:26,0:26:72:0,72,954 0/1:16,29:.:99:943,0,476    0/0:27,0:27:69:0,69,1035    0/0:19,0:19:48:0,48,720 0/0:32,0:32:81:0,81,1215    0/0:36,0:36:99:0,99,1435    0/0:34,0:34:99:0,99,1299    0/0:35,0:35:99:0,102,1339   0/0:38,0:38:99:0,102,1520   0/0:36,0:36:99:0,99,1476    0/0:31,0:31:81:0,81,1215    0/0:31,0:31:75:0,75,1125    0/0:35,0:35:99:0,99,1485    0/0:37,0:37:99:0,99,1485    0/0:35,0:35:90:0,90,1350    0/0:20,0:20:28:0,28,708 0/1:16,22:.:99:733,0,474    0/0:32,0:32:90:0,90,1350    0/0:35,0:35:99:0,99,1467    0/1:27,36:.:99:1169,0,831   0/0:28,0:28:75:0,75,1125    0/0:36,0:36:81:0,81,1215    0/0:35,0:35:90:0,90,1350    0/0:28,0:28:72:0,72,1080    0/0:31,0:31:81:0,81,1215    0/0:37,0:37:99:0,99,1485    0/0:31,0:31:84:0,84,1260    0/0:39,0:39:99:0,101,1575   0/0:37,0:37:96:0,96,1440    0/0:34,0:34:99:0,99,1269    0/0:30,0:30:81:0,81,1215    0/0:36,0:36:99:0,99,1485    0/1:17,17:.:99:567,0,530    0/0:26,0:26:72:0,72,1008    0/0:18,0:18:45:0,45,675 0/0:33,0:33:84:0,84,1260    0/0:25,0:25:61:0,61,877 0/1:9,21:.:99:706,0,243 0/0:35,0:35:81:0,81,1215    0/0:35,0:35:99:0,99,1485
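The pattern in these records (DP present for 0/0, "." otherwise) can be checked across a whole file with a per-genotype tally. The following is a rough sketch with hypothetical function names; it assumes tab-separated records and simple GT strings, and the example record keeps only the first five samples of the 22:17280388 line with a shortened INFO column.

```python
from collections import Counter


def genotype_class(gt):
    """Classify a GT string as hom_ref, het, hom_alt or no_call."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "no_call"
    if all(a == "0" for a in alleles):
        return "hom_ref"
    if len(set(alleles)) == 1:
        return "hom_alt"
    return "het"


def tally_missing_dp(record):
    """Count (genotype class, DP present?) pairs over the samples of one record."""
    fields = record.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    counts = Counter()
    for sample in fields[9:]:
        values = dict(zip(keys, sample.split(":")))
        has_dp = values.get("DP", ".") != "."
        counts[(genotype_class(values.get("GT", ".")), has_dp)] += 1
    return counts


# First five samples of the 22:17280388 record, INFO shortened for readability.
record = "\t".join([
    "22", "17280388", ".", "T", "C", "18459.8", "PASS", "AC=34;AN=100;DP=1593",
    "GT:AD:DP:GQ:PL",
    "0/0:29,0:29:81:0,81,1118",
    "0/1:20,22:.:99:688,0,682",
    "1/1:0,27:.:81:1018,81,0",
    "0/0:22,0:22:60:0,60,869",
    "0/1:20,10:.:99:286,0,664",
])
counts = tally_missing_dp(record)
for (gclass, has_dp), n in sorted(counts.items()):
    print(gclass, "DP present" if has_dp else "DP missing", n)
```

Summing these counters over all records would show whether any hom-ref genotype lacks DP or any non-ref genotype has it, which would point to a different cause.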

I've only just discovered this issue, and I need to run an analysis of the differential depth of coverage in different regions, and of whether there is a DP bias between called and not-called samples.

I have thousands of files and I've spent almost a year generating all these calls, so redoing the calling is not an option.

What would be the best/fastest strategy to either fix my final VCFs using the DP data present in all the intermediate GVCF files (preferably) or, at least, extract this data for all SNPs and samples?

Thanks in advance,

Txema

PS: Re-calling the individual samples from the BAM files is not an option. Fixing the individual GVCFs and redoing the joint GenotypeGVCFs step could be.

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    Hi Txema,

    As I see it you have two options: use the sum of ADs as a proxy for "usable depth", or run a DepthOfCoverage analysis on your samples. Both have strengths and weaknesses, and address slightly different aspects of the coverage question.

    1. Use the sum of ADs
      The advantage here is that you already have the data in your VCF. The downside is that the AD doesn't include all of the reads that were seen, just the ones corresponding to the alleles that were called. But this is a pretty good proxy if you're just looking to see whether there is a significant discrepancy in the amount of usable data available between samples.

    2. Run a DepthOfCoverage analysis
      This will collect coverage statistics per-sample and allow you to evaluate whether there is any substantial bias in the amount of sequence produced between samples. The caveat is that it will take some time to run on thousands of samples (though you can parallelize massively, and optimize by just running on the sites you're looking for), and you will be collecting coverage from the original alignments, as opposed to the alignments that result from the graph assembly process performed internally by HaplotypeCaller. The latter is the bigger potential problem: there will be some sites where the coverage stats you get are inconsistent with the AD stats of the call. But the number of sites showing a big difference should be reasonably small and mostly affect indels. And you can mitigate the problem by averaging the coverage stats over some region around each site of interest. This will be less precise per-site but should enable you to determine whether there is a substantial difference in the total amount of data.
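Option 1 could be sketched roughly as follows. This is an illustration and not a GATK feature: `fill_dp_from_ad` is a hypothetical helper, it assumes tab-separated records whose FORMAT column already lists both AD and DP (as in the joint VCF above), and the value it writes is the sum of ADs, which is a proxy for "usable depth" and not the DP that GATK itself would have emitted.

```python
def fill_dp_from_ad(record):
    """Return the record with each missing FORMAT DP ('.') set to that sample's sum of AD."""
    fields = record.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    if "AD" not in keys or "DP" not in keys:
        return "\t".join(fields)  # out of scope for this sketch
    ad_i, dp_i = keys.index("AD"), keys.index("DP")
    for col in range(9, len(fields)):
        values = fields[col].split(":")
        if len(values) <= max(ad_i, dp_i) or values[dp_i] != ".":
            continue  # truncated sample column, or DP already present
        depths = values[ad_i].split(",")
        if all(d.isdigit() for d in depths):  # skips AD values such as '.'
            values[dp_i] = str(sum(int(d) for d in depths))
            fields[col] = ":".join(values)
    return "\t".join(fields)


# Three samples from the 22:17280388 record in the question, INFO shortened.
record = "\t".join([
    "22", "17280388", ".", "T", "C", "18459.8", "PASS", "AC=34;AN=100;DP=1593",
    "GT:AD:DP:GQ:PL",
    "0/0:29,0:29:81:0,81,1118",
    "0/1:20,22:.:99:688,0,682",
    "1/1:0,27:.:81:1018,81,0",
])
fixed = fill_dp_from_ad(record).split("\t")
print(fixed[9:])
```

Existing DP values are left untouched, so hom-ref samples keep what GATK wrote. If a file rewritten this way is shared, it is probably worth recording somewhere that the filled-in values are AD sums.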

  • DepthOfCoverage is completely out of the question: it would mean re-downloading ~120 TB of BAM files.

    The sum of ADs seems promising, but it would be good to be able to see how many non-REF, non-ALT alleles were seen, especially sample by sample, to check whether any sample has an excess of "error" calls.

    Do you have any recommendation on what tool to use for rewriting the vcfs without messing them up?

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    You could grab the ADs from the GVCFs to get the full picture if that's what you want. VariantAnnotator can annotate from an external resource, but off the top of my head I don't remember it doing sample annotations, only site-level. I could be wrong though, feel free to check the tech doc.
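As a rough idea of what grabbing the ADs from a GVCF could look like, the sketch below pairs each allele of a single-sample record (including `<NON_REF>`) with its AD count and reports the INFO DP alongside. The function name is hypothetical and tab-separated input is assumed. Note that INFO DP and AD are not computed from exactly the same set of reads, so the gap between them is only a loose indicator of reads not assigned to any allele.

```python
def gvcf_depth_summary(record):
    """Summarise per-allele AD, the AD sum and INFO DP for one single-sample GVCF record."""
    fields = record.rstrip("\n").split("\t")
    alleles = [fields[3]] + fields[4].split(",")
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    ad = sample.get("AD", ".").split(",")
    if not all(x.isdigit() for x in ad):
        return None  # e.g. reference blocks, which carry DP but no AD
    per_allele = dict(zip(alleles, map(int, ad)))
    info_dp = None
    for item in fields[7].split(";"):
        if item.startswith("DP="):
            info_dp = int(item[3:])
    return {
        "per_allele": per_allele,
        "ad_sum": sum(per_allele.values()),
        "non_ref": per_allele.get("<NON_REF>", 0),
        "info_dp": info_dp,
    }


# The variant record and one reference block from the original question.
variant = (
    "18\t11257\t.\tC\tG,<NON_REF>\t229.77\t.\t"
    "BaseQRankSum=1.999;DP=20;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.00;"
    "MQRankSum=-1.377;ReadPosRankSum=0.489\t"
    "GT:AD:GQ:PL:SB\t0/1:10,8,0:99:258,0,308,288"
)
ref_block = (
    "18\t11255\t.\tT\t<NON_REF>\t.\t.\tEND=11256\t"
    "GT:DP:GQ:MIN_DP:PL\t0/0:18:48:18:0,48,720"
)
summary = gvcf_depth_summary(variant)
print(summary)
print(gvcf_depth_summary(ref_block))
```

For the record at 18:11257 this gives an AD sum of 18 against an INFO DP of 20, with no reads counted for `<NON_REF>`.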

  • shawpa (Member)

    I am running into a similar issue to the above user. I just noticed that I am only getting DP output at homozygous reference sites; I get a "." at heterozygous or homozygous alternate sites. The problem is that none of my downstream analysis programs can filter by depth because of this "missing" output. When I ran HaplotypeCaller with default annotations, I didn't have this problem. I am now running HaplotypeCaller with an additional annotation for allele balance by sample. I saw in some other threads that this missing DP is a known issue and that the bug has possibly been fixed. The DP annotation is in my VCF header and also appears in the FORMAT field. Is there some way to force it to be recalculated from the AD field?

  • shawpa (Member)

    I don't know if anyone from GATK staff has seen my recent post. The only solution I have come up with is to 1) run HaplotypeCaller with only the default annotations, 2) filter genotypes for minimum depth, and 3) apply the additional annotation (AlleleBalanceBySample).

    I really don't want to have to do this as I calculate that it will take about 3 weeks to rerun haplotype caller on all of my samples due to computing limitations. If there is no "quicker fix" then please let me know.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @shawpa
    Hi,

    Is this happening in GATK 4.0.0.0? Can you post some example sites with and without the DP field (using default and non-default annotations)? I may need you to submit a bug report.

    Thanks,
    Sheila

  • shawpa (Member)

    Because this is an ongoing study, I am still using version 3.4. I know the bug was reported before. Is there a version where this is fixed?

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @shawpa
    Hi,

    It looks like the fix should be in version 3.8. Can you test that version and let us know if it is fixed?

    Thanks,
    Sheila

  • shawpa (Member)

    It will take me some time to test. If v3.8 works, do I still need to rerun HaplotypeCaller with version 3.8, or can I use the existing g.vcfs from HaplotypeCaller (v3.4) and just run GenotypeGVCFs?

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @shawpa
    Hi,

    Unfortunately, you will need to re-run with version 3.8 as there were some changes to GenotypeGVCFs in the later versions that may cause issues with old GVCFs.

    -Sheila

  • shawpa (Member)

    Would you recommend upgrading to v3.8 or v4.0? I would like to still be able to use the bam files that I generated previously or else it is going to take a very long time to start from scratch.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @shawpa

    Hi,

    If you are going to upgrade, it would be best to upgrade to 4.0 :smiley:

    -Sheila

  • shawpa (Member)

    I don't think I can upgrade to 4.0 at this time because it is missing an annotation that I need (AlleleBalanceBySample). I cannot find the download link for version 3.8. The link on the main page takes you to the 4.0 download and 3.8 is not under the archived versions.

    Issue · Github (filed by Sheila): Issue Number 2832, State: closed, Closed By: vdauwera

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @shawpa
    Hi,

    Sorry about that. The 3.8 version should be available soon.

    -Sheila

  • Yatros, Seattle, WA, USA (Member)
    edited March 29

    I know that this is an old issue, but I would like to know the current status of HaplotypeCaller not outputting the DP value in the FORMAT field.

    I was running HaplotypeCaller and I observed the same problem as the person that started this post.

    HaplotypeCaller generates GVCF files where the DP field is listed in the FORMAT column and IS present at homozygous REF sites:
    1 14816 . T <NON_REF> . . END=14906 GT:DP:GQ:MIN_DP:PL 0/0:349:99:257:0,120,1800

    However, the DP field IS MISSING at any heterozygous or homozygous ALT site.
    1 14907 . A G,<NON_REF> 7863.77 . AS_RAW_BaseQRankSum=|-1.6,1|NaN;AS_RAW_MQ=321278.00|334603.00|0.00;AS_RAW_MQRankSum=|2.3,1|NaN;AS_RAW_ReadPosRankSum=|2.5,1|NaN;AS_SB_TABLE=101,137|128,91|0,0;BaseQRankSum=-1.552;DP=457;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=2.311;RAW_MQ=655881.00;ReadPosRankSum=2.593 GT:AD:GQ:PGT:PID:PL:SB 0/1:238,219,0:99:0|1:14907_A_G:7892,0,9224,8613,9895,18508:101,137,128,91

    I can see that DP is present elsewhere in the record for this variant (DP=457 in the INFO field), but I wanted to know whether it is going to be added back to the FORMAT field in future GATK releases, since some of my downstream analyses take this value from the FORMAT field. If not, I will have to rewrite several scripts.

    By the way, I'm using the latest broadinstitute/gatk docker container version of GATK (Using GATK jar /gatk/build/libs/gatk-package-4.0.3.0-local.jar) and cromwell-30.2.jar versions on a Linux Ubuntu machine.

    Thanks in advance,

    Yatros

  • Yatros, Seattle, WA, USA (Member)

    Hello again. An additional note,

    As somebody pointed out earlier, this behavior only happens when one or more specific additional annotations are requested from HaplotypeCaller:

    command {
        ${gatk_path} --java-options "${java_opt}" \
        HaplotypeCaller \
          -R ${human_g1k_v37_decoy_fa} \
          -O ${gvcf_basename}.vcf.gz \
          -I ${input_bam} \
          -L ${interval_list} \
          --interval-padding 500 \
          --max-alternate-alleles 3 \
          --emit-ref-confidence GVCF \
          --annotation-group StandardAnnotation \
          --annotation-group AS_StandardAnnotation
    }
    

    1 14464 . A T,<NON_REF> 679.77 . AS_RAW_BaseQRankSum=|-1.8,1|NaN;AS_RAW_MQ=5734.00|24538.00|0.00;AS_RAW_MQRankSum=|0.9,1|NaN;AS_RAW_ReadPosRankSum=|3.0,1|NaN;AS_SB_TABLE=0,9|12,16|0,0;BaseQRankSum=-1.762;DP=38;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.906;RAW_MQ=30848.00;ReadPosRankSum=3.050 GT:AD:GQ:PL:SB 0/1:9,28,0:99:708,0,157,735,241,976:0,9,12,16

    If I remove the annotation-group options as follows,

    command {
        ${gatk_path} --java-options "${java_opt}" \
        HaplotypeCaller \
          -R ${human_g1k_v37_decoy_fa} \
          -O ${gvcf_basename}.vcf.gz \
          -I ${input_bam} \
          -L ${interval_list} \
          --interval-padding 500 \
          --max-alternate-alleles 3 \
          --emit-ref-confidence GVCF
          # --annotation-group StandardAnnotation
          # --annotation-group AS_StandardAnnotation
    }
    

    I get the DP value in the FORMAT field again:
    1 14464 . A T,<NON_REF> 679.77 . BaseQRankSum=-1.762;ClippingRankSum=0.000;DP=38;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.906;RAW_MQ=30848.00;ReadPosRankSum=3.050 GT:AD:DP:GQ:PL:SB 0/1:9,28,0:37:99:708,0,157,735,241,976:0,9,12,16

    Is there any way to avoid this issue other than removing the AS_StandardAnnotation group? Can this be considered a bug that needs to be fixed?

    I can send you small fastq files and my code to test this issue if you need them. At the end of the post I have included the headers for both HaplotypeCaller gvcf files so that you can compare them and check the steps I have done before.
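One quick way to compare the two headers is to pull the `--annotation-group` values out of each `##GATKCommandLine` line. The sketch below does that with a regular expression; the two strings are heavily abbreviated copies of the command lines in the headers at the end of this post, and `annotation_groups` is a made-up helper name.

```python
import re


def annotation_groups(header_line):
    """Return the --annotation-group values recorded in a ##GATKCommandLine header line."""
    return re.findall(r"--annotation-group\s+(\S+)", header_line)


# Abbreviated from the two headers in this post (most arguments omitted).
with_opts = (
    '##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller  '
    "--annotation-group StandardAnnotation --annotation-group AS_StandardAnnotation "
    '--emit-ref-confidence GVCF --max-alternate-alleles 3">'
)
without_opts = (
    '##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller  '
    "--emit-ref-confidence GVCF --max-alternate-alleles 3 "
    "--annotation-group StandardAnnotation --annotation-group StandardHCAnnotation "
    '--disable-tool-default-annotations false">'
)

groups_with = set(annotation_groups(with_opts))
groups_without = set(annotation_groups(without_opts))
only_with = sorted(groups_with - groups_without)
only_without = sorted(groups_without - groups_with)
print("only in the run with explicit options:", only_with)
print("only in the run without them:", only_without)
```

The run without explicit options records StandardAnnotation plus StandardHCAnnotation, whereas the run with them records StandardAnnotation plus AS_StandardAnnotation. So passing the groups explicitly appears to have dropped StandardHCAnnotation. Whether adding `--annotation-group StandardHCAnnotation` back alongside the other two restores the FORMAT DP is an assumption I have not tested.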

    Thank you very much,

    Best,

    Yatros

    Header of the gvcf file with annotation options:

    ##fileformat=VCFv4.2
    ##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
    ##FILTER=<ID=LowQual,Description="Low quality">
    ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
    ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
    ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
    ##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
    ##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller  --annotation-group StandardAnnotation --annotation-group AS_StandardAnnotation --emit-ref-confidence GVCF --max-alternate-alleles 3 --output Sample_01.vcf.gz --intervals /cromwell-executions/Mapping_stampy/04dc4d3f-ef1c-4e1e-be44-b50d4438a69a/call-HaplotypeCaller/inputs/mnt/user/Test_reference/Agilent_SureSelect_V5_UTRs.interval_list --interval-padding 500 --input /cromwell-executions/Mapping_stampy/04dc4d3f-ef1c-4e1e-be44-b50d4438a69a/call-HaplotypeCaller/inputs/mnt/user/Test_family/BAM/Test/cromwell-executions/Mapping_stampy/04dc4d3f-ef1c-4e1e-be44-b50d4438a69a/call-apply_BQSR/execution/Sample_01_post_recal.bam --reference /cromwell-executions/Mapping_stampy/04dc4d3f-ef1c-4e1e-be44-b50d4438a69a/call-HaplotypeCaller/inputs/mnt/user/Test_reference/human_g1k_v37_decoy.fasta  --disable-tool-default-annotations false --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 --gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 
--gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --indel-size-to-eliminate-in-ref-model 10 --use-alleles-trigger false --disable-optimizations false --just-determine-active-regions false --dont-genotype false --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --recover-dangling-heads false --do-not-recover-dangling-branches false --min-dangling-branch-length 4 --consensus false --max-num-haplotypes-in-population 128 --error-correct-kmers false --min-pruning 2 --debug-graph-transformations false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --debug false --use-filtered-reads-for-annotations false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --capture-assembly-failure-bam false --error-correct-reads false --do-not-run-physical-phasing false --min-base-quality-score 10 --smith-waterman JAVA --use-new-qual-calculator false --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 10.0 --max-genotype-count 1024 --sample-ploidy 2 --genotyping-mode DISCOVERY --contamination-fraction-to-filter 0.0 --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 
0.002 --max-prob-propagation-distance 50 --interval-set-rule UNION --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --disable-tool-default-read-filters false --minimum-mapping-quality 20",Version=4.0.3.0,Date="March 29, 2018 5:15:50 PM UTC">
    ##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
    ##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
    ##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
    ##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
    ##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
    ##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
    ##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
    ##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
    ##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
    ##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
    ##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
    ##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
    ##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
    ##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
    ##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
    ##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
    ##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
    ##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
    ##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
    ##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
    ##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
    ##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
    ##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
    ##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
    ##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
    ##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
    ##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
    ##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
    ##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
    ##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
    ##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
    ##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
    ##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
    ##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
    ##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
    ##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
    ##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
    ##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
    ##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
    ##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
    ##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
    ##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
    ##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
    ##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
    ##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
    ##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
    ##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
    ##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
    ##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
    ##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
    ##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
    ##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
    ##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
    ##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
    ##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
    ##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
    ##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
    ##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
    ##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
    ##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
    ##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
    ##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
    ##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
    ##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
    ##GVCFBlock99-100=minGQ=99(inclusive),maxGQ=100(exclusive)
    ##INFO=<ID=AS_InbreedingCoeff,Number=A,Type=Float,Description="allele specific heterozygosity as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation; relate to inbreeding coefficient">
    ##INFO=<ID=AS_QD,Number=A,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">
    ##INFO=<ID=AS_RAW_BaseQRankSum,Number=1,Type=String,Description="raw data for allele specific rank sum test of base qualities">
    ##INFO=<ID=AS_RAW_MQ,Number=1,Type=String,Description="Allele-specfic raw data for RMS Mapping Quality">
    ##INFO=<ID=AS_RAW_MQRankSum,Number=1,Type=String,Description="Allele-specfic raw data for Mapping Quality Rank Sum">
    ##INFO=<ID=AS_RAW_ReadPosRankSum,Number=1,Type=String,Description="allele specific raw data for rank sum test of read position bias">
    ##INFO=<ID=AS_SB_TABLE,Number=1,Type=String,Description="Allele-specific forward/reverse read counts for strand bias tests">
    ##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
    ##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=RAW_MQ,Number=1,Type=Float,Description="Raw data for RMS Mapping Quality">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##contig=<ID=1,length=249250621>
    ##contig=<ID=2,length=243199373>
    ##contig=<ID=3,length=198022430>
    ##contig=<ID=4,length=191154276>
    ##contig=<ID=5,length=180915260>
    ##contig=<ID=6,length=171115067>
    ##contig=<ID=7,length=159138663>
    ##contig=<ID=8,length=146364022>
    ##contig=<ID=9,length=141213431>
    ##contig=<ID=10,length=135534747>
    ##contig=<ID=11,length=135006516>
    ##contig=<ID=12,length=133851895>
    ##contig=<ID=13,length=115169878>
    ##contig=<ID=14,length=107349540>
    ##contig=<ID=15,length=102531392>
    ##contig=<ID=16,length=90354753>
    ##contig=<ID=17,length=81195210>
    ##contig=<ID=18,length=78077248>
    ##contig=<ID=19,length=59128983>
    ##contig=<ID=20,length=63025520>
    ##contig=<ID=21,length=48129895>
    ##contig=<ID=22,length=51304566>
    ##contig=<ID=X,length=155270560>
    ##contig=<ID=Y,length=59373566>
    ##contig=<ID=MT,length=16569>
    ##contig=<ID=GL000207.1,length=4262>
    ##contig=<ID=GL000226.1,length=15008>
    ##contig=<ID=GL000229.1,length=19913>
    ##contig=<ID=GL000231.1,length=27386>
    ##contig=<ID=GL000210.1,length=27682>
    ##contig=<ID=GL000239.1,length=33824>
    ##contig=<ID=GL000235.1,length=34474>
    ##contig=<ID=GL000201.1,length=36148>
    ##contig=<ID=GL000247.1,length=36422>
    ##contig=<ID=GL000245.1,length=36651>
    ##contig=<ID=GL000197.1,length=37175>
    ##contig=<ID=GL000203.1,length=37498>
    ##contig=<ID=GL000246.1,length=38154>
    ##contig=<ID=GL000249.1,length=38502>
    ##contig=<ID=GL000196.1,length=38914>
    ##contig=<ID=GL000248.1,length=39786>
    ##contig=<ID=GL000244.1,length=39929>
    ##contig=<ID=GL000238.1,length=39939>
    ##contig=<ID=GL000202.1,length=40103>
    ##contig=<ID=GL000234.1,length=40531>
    ##contig=<ID=GL000232.1,length=40652>
    ##contig=<ID=GL000206.1,length=41001>
    ##contig=<ID=GL000240.1,length=41933>
    ##contig=<ID=GL000236.1,length=41934>
    ##contig=<ID=GL000241.1,length=42152>
    ##contig=<ID=GL000243.1,length=43341>
    ##contig=<ID=GL000242.1,length=43523>
    ##contig=<ID=GL000230.1,length=43691>
    ##contig=<ID=GL000237.1,length=45867>
    ##contig=<ID=GL000233.1,length=45941>
    ##contig=<ID=GL000204.1,length=81310>
    ##contig=<ID=GL000198.1,length=90085>
    ##contig=<ID=GL000208.1,length=92689>
    ##contig=<ID=GL000191.1,length=106433>
    ##contig=<ID=GL000227.1,length=128374>
    ##contig=<ID=GL000228.1,length=129120>
    ##contig=<ID=GL000214.1,length=137718>
    ##contig=<ID=GL000221.1,length=155397>
    ##contig=<ID=GL000209.1,length=159169>
    ##contig=<ID=GL000218.1,length=161147>
    ##contig=<ID=GL000220.1,length=161802>
    ##contig=<ID=GL000213.1,length=164239>
    ##contig=<ID=GL000211.1,length=166566>
    ##contig=<ID=GL000199.1,length=169874>
    ##contig=<ID=GL000217.1,length=172149>
    ##contig=<ID=GL000216.1,length=172294>
    ##contig=<ID=GL000215.1,length=172545>
    ##contig=<ID=GL000205.1,length=174588>
    ##contig=<ID=GL000219.1,length=179198>
    ##contig=<ID=GL000224.1,length=179693>
    ##contig=<ID=GL000223.1,length=180455>
    ##contig=<ID=GL000195.1,length=182896>
    ##contig=<ID=GL000212.1,length=186858>
    ##contig=<ID=GL000222.1,length=186861>
    ##contig=<ID=GL000200.1,length=187035>
    ##contig=<ID=GL000193.1,length=189789>
    ##contig=<ID=GL000194.1,length=191469>
    ##contig=<ID=GL000225.1,length=211173>
    ##contig=<ID=GL000192.1,length=547496>
    ##contig=<ID=NC_007605,length=171823>
    ##contig=<ID=hs37d5,length=35477943>
    ##source=HaplotypeCaller
    #CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample_01
    

    Header of the gvcf file without additional annotation options:

    ##fileformat=VCFv4.2
    ##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
    ##FILTER=<ID=LowQual,Description="Low quality">
    ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
    ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
    ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
    ##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
    ##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller  --emit-ref-confidence GVCF --max-alternate-alleles 3 --output Sample_01.vcf.gz --intervals /cromwell-executions/Mapping_stampy/ad962101-3294-4042-b620-faa0993b5858/call-HaplotypeCaller/inputs/mnt/user/Test_reference/Agilent_SureSelect_V5_UTRs.interval_list --interval-padding 500 --input /cromwell-executions/Mapping_stampy/ad962101-3294-4042-b620-faa0993b5858/call-HaplotypeCaller/inputs/mnt/user/Test_family/BAM/Test/cromwell-executions/Mapping_stampy/ad962101-3294-4042-b620-faa0993b5858/call-apply_BQSR/execution/Sample_01_post_recal.bam --reference /cromwell-executions/Mapping_stampy/ad962101-3294-4042-b620-faa0993b5858/call-HaplotypeCaller/inputs/mnt/user/Test_reference/human_g1k_v37_decoy.fasta  --annotation-group StandardAnnotation --annotation-group StandardHCAnnotation --disable-tool-default-annotations false --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 --gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --indel-size-to-eliminate-in-ref-model 10 --use-alleles-trigger false --disable-optimizations false --just-determine-active-regions false --dont-genotype false --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --recover-dangling-heads false --do-not-recover-dangling-branches false --min-dangling-branch-length 4 --consensus false --max-num-haplotypes-in-population 128 --error-correct-kmers false --min-pruning 2 --debug-graph-transformations false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --debug false --use-filtered-reads-for-annotations false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --capture-assembly-failure-bam false --error-correct-reads false --do-not-run-physical-phasing false --min-base-quality-score 10 --smith-waterman JAVA --use-new-qual-calculator false --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 10.0 --max-genotype-count 1024 --sample-ploidy 2 --genotyping-mode DISCOVERY --contamination-fraction-to-filter 0.0 --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --interval-set-rule UNION --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --disable-tool-default-read-filters false --minimum-mapping-quality 20",Version=4.0.3.0,Date="March 30, 2018 4:00:08 PM UTC">
    ##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
    ##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
    ##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
    ##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
    ##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
    ##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
    ##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
    ##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
    ##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
    ##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
    ##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
    ##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
    ##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
    ##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
    ##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
    ##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
    ##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
    ##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
    ##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
    ##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
    ##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
    ##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
    ##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
    ##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
    ##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
    ##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
    ##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
    ##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
    ##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
    ##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
    ##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
    ##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
    ##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
    ##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
    ##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
    ##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
    ##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
    ##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
    ##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
    ##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
    ##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
    ##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
    ##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
    ##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
    ##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
    ##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
    ##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
    ##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
    ##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
    ##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
    ##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
    ##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
    ##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
    ##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
    ##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
    ##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
    ##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
    ##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
    ##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
    ##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
    ##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
    ##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
    ##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
    ##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
    ##GVCFBlock99-100=minGQ=99(inclusive),maxGQ=100(exclusive)
    ##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
    ##INFO=<ID=ClippingRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref number of hard clipped bases">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
    ##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=RAW_MQ,Number=1,Type=Float,Description="Raw data for RMS Mapping Quality">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##contig=<ID=1,length=249250621>
    ##contig=<ID=2,length=243199373>
    ##contig=<ID=3,length=198022430>
    ##contig=<ID=4,length=191154276>
    ##contig=<ID=5,length=180915260>
    ##contig=<ID=6,length=171115067>
    ##contig=<ID=7,length=159138663>
    ##contig=<ID=8,length=146364022>
    ##contig=<ID=9,length=141213431>
    ##contig=<ID=10,length=135534747>
    ##contig=<ID=11,length=135006516>
    ##contig=<ID=12,length=133851895>
    ##contig=<ID=13,length=115169878>
    ##contig=<ID=14,length=107349540>
    ##contig=<ID=15,length=102531392>
    ##contig=<ID=16,length=90354753>
    ##contig=<ID=17,length=81195210>
    ##contig=<ID=18,length=78077248>
    ##contig=<ID=19,length=59128983>
    ##contig=<ID=20,length=63025520>
    ##contig=<ID=21,length=48129895>
    ##contig=<ID=22,length=51304566>
    ##contig=<ID=X,length=155270560>
    ##contig=<ID=Y,length=59373566>
    ##contig=<ID=MT,length=16569>
    ##contig=<ID=GL000207.1,length=4262>
    ##contig=<ID=GL000226.1,length=15008>
    ##contig=<ID=GL000229.1,length=19913>
    ##contig=<ID=GL000231.1,length=27386>
    ##contig=<ID=GL000210.1,length=27682>
    ##contig=<ID=GL000239.1,length=33824>
    ##contig=<ID=GL000235.1,length=34474>
    ##contig=<ID=GL000201.1,length=36148>
    ##contig=<ID=GL000247.1,length=36422>
    ##contig=<ID=GL000245.1,length=36651>
    ##contig=<ID=GL000197.1,length=37175>
    ##contig=<ID=GL000203.1,length=37498>
    ##contig=<ID=GL000246.1,length=38154>
    ##contig=<ID=GL000249.1,length=38502>
    ##contig=<ID=GL000196.1,length=38914>
    ##contig=<ID=GL000248.1,length=39786>
    ##contig=<ID=GL000244.1,length=39929>
    ##contig=<ID=GL000238.1,length=39939>
    ##contig=<ID=GL000202.1,length=40103>
    ##contig=<ID=GL000234.1,length=40531>
    ##contig=<ID=GL000232.1,length=40652>
    ##contig=<ID=GL000206.1,length=41001>
    ##contig=<ID=GL000240.1,length=41933>
    ##contig=<ID=GL000236.1,length=41934>
    ##contig=<ID=GL000241.1,length=42152>
    ##contig=<ID=GL000243.1,length=43341>
    ##contig=<ID=GL000242.1,length=43523>
    ##contig=<ID=GL000230.1,length=43691>
    ##contig=<ID=GL000237.1,length=45867>
    ##contig=<ID=GL000233.1,length=45941>
    ##contig=<ID=GL000204.1,length=81310>
    ##contig=<ID=GL000198.1,length=90085>
    ##contig=<ID=GL000208.1,length=92689>
    ##contig=<ID=GL000191.1,length=106433>
    ##contig=<ID=GL000227.1,length=128374>
    ##contig=<ID=GL000228.1,length=129120>
    ##contig=<ID=GL000214.1,length=137718>
    ##contig=<ID=GL000221.1,length=155397>
    ##contig=<ID=GL000209.1,length=159169>
    ##contig=<ID=GL000218.1,length=161147>
    ##contig=<ID=GL000220.1,length=161802>
    ##contig=<ID=GL000213.1,length=164239>
    ##contig=<ID=GL000211.1,length=166566>
    ##contig=<ID=GL000199.1,length=169874>
    ##contig=<ID=GL000217.1,length=172149>
    ##contig=<ID=GL000216.1,length=172294>
    ##contig=<ID=GL000215.1,length=172545>
    ##contig=<ID=GL000205.1,length=174588>
    ##contig=<ID=GL000219.1,length=179198>
    ##contig=<ID=GL000224.1,length=179693>
    ##contig=<ID=GL000223.1,length=180455>
    ##contig=<ID=GL000195.1,length=182896>
    ##contig=<ID=GL000212.1,length=186858>
    ##contig=<ID=GL000222.1,length=186861>
    ##contig=<ID=GL000200.1,length=187035>
    ##contig=<ID=GL000193.1,length=189789>
    ##contig=<ID=GL000194.1,length=191469>
    ##contig=<ID=GL000225.1,length=211173>
    ##contig=<ID=GL000192.1,length=547496>
    ##contig=<ID=NC_007605,length=171823>
    ##contig=<ID=hs37d5,length=35477943>
    ##source=HaplotypeCaller
    #CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample_01
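
The header above declares a FORMAT DP field, but a declaration does not guarantee that every record carries it. As a minimal sketch (not part of GATK), assuming plain-text, tab-separated VCF lines, the hypothetical helper `records_missing_format_key` below lists the records whose FORMAT column lacks a given key. The example records are taken from the opening post of this thread, with the INFO column of the variant record abbreviated.

```python
def records_missing_format_key(vcf_lines, key="DP"):
    """Return (chrom, pos, format) for data records whose FORMAT column lacks `key`."""
    missing = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header lines and blank lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # record has no FORMAT/sample columns
        format_keys = fields[8].split(":")
        if key not in format_keys:
            missing.append((fields[0], int(fields[1]), fields[8]))
    return missing


# Records from the opening post (INFO of the variant record abbreviated).
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample_01",
    "18\t11255\t.\tT\t<NON_REF>\t.\t.\tEND=11256\tGT:DP:GQ:MIN_DP:PL\t0/0:18:48:18:0,48,720",
    "18\t11257\t.\tC\tG,<NON_REF>\t229.77\t.\tDP=20\tGT:AD:GQ:PL:SB\t0/1:10,8,0:99:258,0,308,288",
    "18\t11258\t.\tG\t<NON_REF>\t.\t.\tEND=11260\tGT:DP:GQ:MIN_DP:PL\t0/0:17:48:16:0,48,530",
]

for chrom, pos, fmt in records_missing_format_key(example):
    print(f"{chrom}:{pos} has FORMAT {fmt} (no DP)")
```

For a real gVCF, the lines could be read from an uncompressed file or through `gzip.open(path, "rt")`; a block-compressed `.vcf.gz` is gzip-compatible, so that should work as well.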
    
  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Yatros
    Hi Yatros,

    Thanks for narrowing the issue down. If you could submit a bug report, I can file a new issue ticket. Instructions are here.

    -Sheila

  • Yatros, Seattle, WA, USA (Member)
    edited April 3

    Hi Sheila,

    I just uploaded the zipped file "Yatros_HaplotypeCaller_DP_missing.zip" to your server. It includes the following files:

    • Small BAM file to reproduce problem.
    • Corresponding BAI file.
    • Script file.
    • Log file.
    • Interval_list file I used for this analysis.
    • RTF file with the problem description.

    The reference file I used for this analysis was the "human_g1k_v37_decoy.fasta" file.

    Let me know if you need something else.

    Thanks,

    Yatros

    Issue · Github (filed by Sheila): Issue Number 3038, State: open

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @Yatros
    Hi Yatros,

    Thank you. I will look into this ASAP.

    -Sheila
