
Picard LiftoverVcf Duplicate allele added to VariantContext

james_lawlor (Member, Huntsville, AL)

Hi,
I've run into a frustrating problem: when lifting over certain VCFs from b37 to hg38, LiftoverVcf exits with errors like Exception in thread "main" java.lang.IllegalArgumentException: Duplicate allele added to VariantContext

I've isolated an example failing variant, but I'm at a loss for how to fix or prevent this error, since such variants seem to be scattered among the VCFs I've generated with GATK 3.8-1-0-gf15c1c3ef GenotypeGVCFs.
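
For anyone who wants to track down a failing record in a large VCF, one option is to bisect the records rather than test them one by one. The Python sketch below is not how the example in this post was produced; it assumes that at least one record triggers the failure on its own, so that any subset containing it also fails. The names `find_failing_record` and `liftover_fails` are hypothetical, and the default file names in `liftover_fails` are placeholders. It relies on Picard exiting with a non-zero status when the exception is thrown.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Sequence


def find_failing_record(records: Sequence[str],
                        fails: Callable[[Sequence[str]], bool]) -> str:
    """Bisect `records` down to a single record for which `fails` is true.

    Assumption: a subset fails exactly when it contains a bad record.
    """
    if not fails(records):
        raise ValueError("the full record set does not fail")
    lo, hi = 0, len(records)      # invariant: records[lo:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(records[lo:mid]):
            hi = mid              # a bad record is in the lower half
        else:
            lo = mid              # otherwise it must be in the upper half
    return records[lo]


def liftover_fails(header: str, records: Sequence[str],
                   picard_jar: str = "picard.jar",              # placeholder path
                   chain: str = "b37ToHg38.over.chain.gz",      # placeholder path
                   reference: str = "hg38.fa") -> bool:         # placeholder path
    """Run LiftoverVcf on a subset of records; True if it exits non-zero.

    `header` is the full VCF header text, ending with a newline.
    """
    with tempfile.TemporaryDirectory() as tmp:
        subset = Path(tmp) / "subset.vcf"
        subset.write_text(header + "".join(r + "\n" for r in records))
        result = subprocess.run(
            ["java", "-jar", picard_jar, "LiftoverVcf",
             f"C={chain}", f"I={subset}", f"O={Path(tmp) / 'out.vcf'}",
             f"R={reference}", f"REJECT={Path(tmp) / 'reject.vcf'}"],
            capture_output=True, text=True)
        return result.returncode != 0
```

With the header text and data lines already read into `header` and `records`, a call would look like `find_failing_record(records, lambda rs: liftover_fails(header, rs))`. Each Picard run reloads the target reference (about ten seconds in the logs in this post), so the logarithmic number of runs matters for a file with millions of records.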

Picard version 2.20.2

$ java -jar picard.jar LiftoverVcf --version
2.20.2-SNAPSHOT

Failing variant example:

$ java -jar picard.jar LiftoverVcf C=/gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz I=bad_simple.vcf O=test.vcf R=hg38.fa REJECT=test_b37.reject.vcf
INFO    2019-06-22 00:35:10 LiftoverVcf 

********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    LiftoverVcf -C /gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz -I bad_simple.vcf -O test.vcf -R hg38.fa -REJECT test_b37.reject.vcf
**********


00:35:10.862 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/gpfs1/home/jlawlor/test_liftover/round_4/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Sat Jun 22 00:35:10 CDT 2019] LiftoverVcf INPUT=bad_simple.vcf OUTPUT=test.vcf CHAIN=/gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz REJECT=test_b37.reject.vcf REFERENCE_SEQUENCE=hg38.fa    WARN_ON_MISSING_CONTIG=false LOG_FAILED_INTERVALS=true WRITE_ORIGINAL_POSITION=false WRITE_ORIGINAL_ALLELES=false LIFTOVER_MIN_MATCH=1.0 ALLOW_MISSING_FIELDS_IN_HEADER=false RECOVER_SWAPPED_REF_ALT=false TAGS_TO_REVERSE=[AF] TAGS_TO_DROP=[MAX_AF] DISABLE_SORT=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sat Jun 22 00:35:10 CDT 2019] Executing as [email protected] on Linux 3.10.0-327.3.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_102-b14; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.2-SNAPSHOT
INFO    2019-06-22 00:35:11 LiftoverVcf Loading up the target reference genome.
INFO    2019-06-22 00:35:21 LiftoverVcf Lifting variants over and sorting (not yet writing the output file.)
[Sat Jun 22 00:35:21 CDT 2019] picard.vcf.LiftoverVcf done. Elapsed time: 0.18 minutes.
Runtime.totalMemory()=6996623360
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: T
    at htsjdk.variant.variantcontext.VariantContext.makeAlleles(VariantContext.java:1493)
    at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:379)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:579)
    at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:573)
    at picard.util.LiftoverUtils.liftVariant(LiftoverUtils.java:117)
    at picard.vcf.LiftoverVcf.doWork(LiftoverVcf.java:426)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)

with VCF bad_simple.vcf

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
##DRAGENCommandLine=<ID=dragen,Version="SW: 01.011.269.3.2.8, HW: 01.011.269",Date="Tue May 21 12:16:34 CDT 2019",CommandLineOptions="-f -r /staging/reference/GRCh37/GRCh37.fa.k_21.f_16.m_149 --fastq-list /staging/fastq/SL385519_fastqs/SL385519_list.csv --output-directory /staging/bam/ --output-file-prefix SL385519 --enable-duplicate-marking true --enable-map-align-output true --enable-variant-caller true --vc-sample-name SL385519 --vc-emit-ref-confidence GVCF --dbsnp /staging/reference/GRCh37/dbsnp_135.b37.vcf">
##FILTER=<ID=DRAGENHardQUAL,Description="Set if true:QUAL < 10.4139">
##FILTER=<ID=LowDepth,Description="Set if true:DP < 1">
##FILTER=<ID=LowGQ,Description="Set if true:GQ = 0">
##FILTER=<ID=PloidyConflict,Description="Genotype call from variant caller not consistent with chromosome ploidy">
##FILTER=<ID=VQSRTrancheINDEL99.00to99.90,Description="Truth sensitivity tranche level for INDEL model at VQS Lod: -12.1756 <= x < -1.3496">
##FILTER=<ID=VQSRTrancheINDEL99.90to100.00+,Description="Truth sensitivity tranche level for INDEL model at VQS Lod < -1409.7427">
##FILTER=<ID=VQSRTrancheINDEL99.90to100.00,Description="Truth sensitivity tranche level for INDEL model at VQS Lod: -1409.7427 <= x < -12.1756">
##FILTER=<ID=VQSRTrancheSNP99.00to99.90,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -4.6589 <= x < 0.343">
##FILTER=<ID=VQSRTrancheSNP99.90to100.00+,Description="Truth sensitivity tranche level for SNP model at VQS Lod < -39592.3492">
##FILTER=<ID=VQSRTrancheSNP99.90to100.00,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -39592.3492 <= x < -4.6589">
##FILTER=<ID=lod_fstar,Description="Variant does not meet likelihood threshold (default threshold is 6.3)">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=AF,Number=A,Type=Float,Description="Allelic frequency for alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=F1R2,Number=R,Type=Integer,Description="Count of reads in F1R2 pair orientation supporting each allele">
##FORMAT=<ID=F2R1,Number=R,Type=Integer,Description="Count of reads in F2R1 pair orientation supporting each allele">
##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype-level filter">
##FORMAT=<ID=GL,Number=G,Type=Float,Description="Normalized likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=GP,Number=G,Type=Float,Description="Phred-scaled posterior probabilities for genotypes as defined in the VCF specification">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=ICNT,Number=2,Type=Integer,Description="Counts of INDEL informative reads based on the reference confidence model">
##FORMAT=<ID=LOD,Number=1,Type=Float,Description="Per-sample variant LOD score">
##FORMAT=<ID=MB,Number=4,Type=Integer,Description="Per-sample component statistics to detect mate bias">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=PP,Number=G,Type=Integer,Description="Phred-scaled posterior genotype probabilities">
##FORMAT=<ID=PRI,Number=G,Type=Float,Description="Phred-scaled prior probabilities for genotypes">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias">
##FORMAT=<ID=SPL,Number=.,Type=Integer,Description="Normalized, Phred-scaled likelihoods for SNPs based on the reference confidence model">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP Membership">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
##INFO=<ID=FGT,Number=0,Type=Flag,Description="ForceGT variant call">
##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
##INFO=<ID=FractionInformativeReads,Number=1,Type=Float,Description="The fraction of informative reads out of the total reads">
##INFO=<ID=HaplotypeScore,Number=1,Type=Float,Description="Consistency of the site with at most two segregating haplotypes">
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
##INFO=<ID=LOD,Number=1,Type=Float,Description="Variant LOD score">
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
##INFO=<ID=NEGATIVE_TRAIN_SITE,Number=0,Type=Flag,Description="This variant was used to build the negative training set of bad variants">
##INFO=<ID=NML,Number=0,Type=Flag,Description="Normal (non-ForceGT) variant call">
##INFO=<ID=POSITIVE_TRAIN_SITE,Number=0,Type=Flag,Description="This variant was used to build the positive training set of good variants">
##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
##INFO=<ID=R2_5P_bias,Number=1,Type=Float,Description="Score based on mate bias and distance from 5 prime end">
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias">
##INFO=<ID=VQSLOD,Number=1,Type=Float,Description="Log odds of being a true variant versus being false under the trained gaussian mixture model">
##INFO=<ID=culprit,Number=1,Type=String,Description="The annotation which was the worst performing in the Gaussian mixture model, likely the reason why the variant was filtered out">
##contig=<ID=1,length=249250621>
##contig=<ID=2,length=243199373>
##contig=<ID=3,length=198022430>
##contig=<ID=4,length=191154276>
##contig=<ID=5,length=180915260>
##contig=<ID=6,length=171115067>
##contig=<ID=7,length=159138663>
##contig=<ID=8,length=146364022>
##contig=<ID=9,length=141213431>
##contig=<ID=10,length=135534747>
##contig=<ID=11,length=135006516>
##contig=<ID=12,length=133851895>
##contig=<ID=13,length=115169878>
##contig=<ID=14,length=107349540>
##contig=<ID=15,length=102531392>
##contig=<ID=16,length=90354753>
##contig=<ID=17,length=81195210>
##contig=<ID=18,length=78077248>
##contig=<ID=19,length=59128983>
##contig=<ID=20,length=63025520>
##contig=<ID=21,length=48129895>
##contig=<ID=22,length=51304566>
##contig=<ID=X,length=155270560>
##contig=<ID=Y,length=59373566>
##contig=<ID=MT,length=16569>
##contig=<ID=GL000207.1,length=4262>
##contig=<ID=GL000226.1,length=15008>
##contig=<ID=GL000229.1,length=19913>
##contig=<ID=GL000231.1,length=27386>
##contig=<ID=GL000210.1,length=27682>
##contig=<ID=GL000239.1,length=33824>
##contig=<ID=GL000235.1,length=34474>
##contig=<ID=GL000201.1,length=36148>
##contig=<ID=GL000247.1,length=36422>
##contig=<ID=GL000245.1,length=36651>
##contig=<ID=GL000197.1,length=37175>
##contig=<ID=GL000203.1,length=37498>
##contig=<ID=GL000246.1,length=38154>
##contig=<ID=GL000249.1,length=38502>
##contig=<ID=GL000196.1,length=38914>
##contig=<ID=GL000248.1,length=39786>
##contig=<ID=GL000244.1,length=39929>
##contig=<ID=GL000238.1,length=39939>
##contig=<ID=GL000202.1,length=40103>
##contig=<ID=GL000234.1,length=40531>
##contig=<ID=GL000232.1,length=40652>
##contig=<ID=GL000206.1,length=41001>
##contig=<ID=GL000240.1,length=41933>
##contig=<ID=GL000236.1,length=41934>
##contig=<ID=GL000241.1,length=42152>
##contig=<ID=GL000243.1,length=43341>
##contig=<ID=GL000242.1,length=43523>
##contig=<ID=GL000230.1,length=43691>
##contig=<ID=GL000237.1,length=45867>
##contig=<ID=GL000233.1,length=45941>
##contig=<ID=GL000204.1,length=81310>
##contig=<ID=GL000198.1,length=90085>
##contig=<ID=GL000208.1,length=92689>
##contig=<ID=GL000191.1,length=106433>
##contig=<ID=GL000227.1,length=128374>
##contig=<ID=GL000228.1,length=129120>
##contig=<ID=GL000214.1,length=137718>
##contig=<ID=GL000221.1,length=155397>
##contig=<ID=GL000209.1,length=159169>
##contig=<ID=GL000218.1,length=161147>
##contig=<ID=GL000220.1,length=161802>
##contig=<ID=GL000213.1,length=164239>
##contig=<ID=GL000211.1,length=166566>
##contig=<ID=GL000199.1,length=169874>
##contig=<ID=GL000217.1,length=172149>
##contig=<ID=GL000216.1,length=172294>
##contig=<ID=GL000215.1,length=172545>
##contig=<ID=GL000205.1,length=174588>
##contig=<ID=GL000219.1,length=179198>
##contig=<ID=GL000224.1,length=179693>
##contig=<ID=GL000223.1,length=180455>
##contig=<ID=GL000195.1,length=182896>
##contig=<ID=GL000212.1,length=186858>
##contig=<ID=GL000222.1,length=186861>
##contig=<ID=GL000200.1,length=187035>
##contig=<ID=GL000193.1,length=189789>
##contig=<ID=GL000194.1,length=191469>
##contig=<ID=GL000225.1,length=211173>
##contig=<ID=GL000192.1,length=547496>
##reference=file:///gpfs/gpfs1/myerslab/reference/genomes/bwa-0.7.8/GRCh37.fa
##bcftools_viewVersion=1.7+htslib-1.4.1
##bcftools_viewCommand=view -h batch_65.vcf.gz; Date=Fri Jun 21 23:56:28 2019
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
1   143283417   .   ACCG    A,* 512.03  VQSRTrancheINDEL99.00to99.90    .

Successful variant example
It doesn't seem to be solely a problem with indels or * alternates, because this variant (from nearby) has no problems:

java -jar picard.jar LiftoverVcf C=/gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz I=good_simple.vcf O=test.vcf R=hg38.fa REJECT=test_b37.reject.vcf
INFO    2019-06-22 00:34:30 LiftoverVcf 

********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    LiftoverVcf -C /gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz -I good_simple.vcf -O test.vcf -R hg38.fa -REJECT test_b37.reject.vcf
**********


00:34:30.734 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/gpfs1/home/jlawlor/test_liftover/round_4/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Sat Jun 22 00:34:30 CDT 2019] LiftoverVcf INPUT=good_simple.vcf OUTPUT=test.vcf CHAIN=/gpfs/gpfs2/cooperlab/resources/liftover_chain_files/b37ToHg38.over.chain.gz REJECT=test_b37.reject.vcf REFERENCE_SEQUENCE=hg38.fa    WARN_ON_MISSING_CONTIG=false LOG_FAILED_INTERVALS=true WRITE_ORIGINAL_POSITION=false WRITE_ORIGINAL_ALLELES=false LIFTOVER_MIN_MATCH=1.0 ALLOW_MISSING_FIELDS_IN_HEADER=false RECOVER_SWAPPED_REF_ALT=false TAGS_TO_REVERSE=[AF] TAGS_TO_DROP=[MAX_AF] DISABLE_SORT=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sat Jun 22 00:34:30 CDT 2019] Executing as [email protected] on Linux 3.10.0-327.3.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_102-b14; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.2-SNAPSHOT
INFO    2019-06-22 00:34:30 LiftoverVcf Loading up the target reference genome.
INFO    2019-06-22 00:34:41 LiftoverVcf Lifting variants over and sorting (not yet writing the output file.)
INFO    2019-06-22 00:34:41 LiftoverVcf Processed 1 variants.
INFO    2019-06-22 00:34:41 LiftoverVcf 0 variants failed to liftover.
INFO    2019-06-22 00:34:41 LiftoverVcf 0 variants lifted over but had mismatching reference alleles after lift over.
INFO    2019-06-22 00:34:41 LiftoverVcf 0.0000% of variants were not successfully lifted over and written to the output.
INFO    2019-06-22 00:34:41 LiftoverVcf liftover success by source contig:
INFO    2019-06-22 00:34:41 LiftoverVcf 1: 1 / 1 (100.0000%)
INFO    2019-06-22 00:34:41 LiftoverVcf lifted variants by target contig:
INFO    2019-06-22 00:34:41 LiftoverVcf chr21: 1
WARNING 2019-06-22 00:34:41 LiftoverVcf 0 variants with a swapped REF/ALT were identified, but were not recovered.  See RECOVER_SWAPPED_REF_ALT and associated caveats.
INFO    2019-06-22 00:34:41 LiftoverVcf Writing out sorted records to final VCF.
[Sat Jun 22 00:34:41 CDT 2019] picard.vcf.LiftoverVcf done. Elapsed time: 0.18 minutes.

with VCF good_simple.vcf (same header as previous example)

##bcftools_viewCommand=view -h 65_1.vcf; Date=Sat Jun 22 00:14:22 2019
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
1   143283452   .   A   ACACG,* 275.1   VQSRTrancheINDEL99.00to99.90    .
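
To make the comparison between the failing and the passing record concrete, here is a small Python sketch that labels each ALT allele by comparing its length with REF. The function name `describe_alleles` is hypothetical, it only reads the first five whitespace-separated columns, and it is a diagnostic aid rather than an explanation of the exception.

```python
def describe_alleles(vcf_line: str) -> dict:
    """Classify the ALT alleles of one VCF data line (first five columns only)."""
    fields = vcf_line.split()
    chrom, pos, ref = fields[0], int(fields[1]), fields[3]
    alts = fields[4].split(",")
    kinds = set()
    for alt in alts:
        if alt == "*":
            kinds.add("spanning_deletion")   # allele removed by an upstream deletion
        elif alt.startswith("<"):
            kinds.add("symbolic")            # e.g. <NON_REF>
        elif len(alt) < len(ref):
            kinds.add("deletion")
        elif len(alt) > len(ref):
            kinds.add("insertion")
        else:
            kinds.add("snv_or_mnv")
    return {"chrom": chrom, "pos": pos, "ref": ref,
            "alts": alts, "kinds": sorted(kinds)}


# The two records quoted in this post.
BAD = "1 143283417 . ACCG A,* 512.03 VQSRTrancheINDEL99.00to99.90 ."
GOOD = "1 143283452 . A ACACG,* 275.1 VQSRTrancheINDEL99.00to99.90 ."

for line in (BAD, GOOD):
    info = describe_alleles(line)
    print(info["chrom"], info["pos"], info["ref"], info["alts"], info["kinds"])
```

Both records carry the spanning-deletion allele `*`, but the failing one pairs it with a deletion (ACCG to A) while the passing one pairs it with an insertion (A to ACACG). Nothing in this thread establishes whether that difference is what triggers the exception.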

Resources I'm using:
1. chain file from https://raw.githubusercontent.com/broadinstitute/gatk/master/scripts/funcotator/data_sources/gnomAD/b37ToHg38.over.chain
2. reference from ftp://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
3. Sequence dictionaries from picard CreateSequenceDictionary

File provenance: GRCh37 GVCFs generated by DRAGEN variant caller in --vc-emit-ref-confidence GVCF mode, joint-genotyped with other samples with GATK 3.8-1-0-gf15c1c3ef

I've also tried:
1. Lifting over from b37 -> hg19 (successful) and then hg19 -> hg38 (same failure) using the chain files and hg19 reference from UCSC. I also repeated all of the above using the reference from the GATK Resource Bundles.
2. Adjusting LIFTOVER_MIN_MATCH, which results in no variants successfully mapping (and so avoids the Java error)
3. Adjusting RECOVER_SWAPPED_REF_ALT, which has no effect on this error
4. CrossMap (v. 3.4 runs into python errors; v. 3.3 has mapping problems with b37 when "chr" isn't used in chromosome names)

Any advice would be appreciated!
Thanks. :)

GitHub issue #1353, filed by bhanuGandham. State: closed (closed by fleharty).

Answers

  • bhanuGandham (Administrator, Broadie, Moderator; Cambridge, MA)

    Hi @james_lawlor

    Can you please run ValidateVariants on your VCF and post the results here?

  • james_lawlor (Member, Huntsville, AL)

    For my failing variant example above (1 143283417 . ACCG A,*)

    [[email protected] round_4]$ ~/gatk4/gatk-4.1.2.0/gatk ValidateVariants -V bad_simple.vcf -R /gpfs/gpfs1/myerslab/reference/genomes/bwa-0.7.8/GRCh37.fa -D /gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    Using GATK jar /gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar ValidateVariants -V bad_simple.vcf -R /gpfs/gpfs1/myerslab/reference/genomes/bwa-0.7.8/GRCh37.fa -D /gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    10:13:32.371 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jun 27, 2019 10:13:34 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    10:13:34.075 INFO  ValidateVariants - ------------------------------------------------------------
    10:13:34.075 INFO  ValidateVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
    10:13:34.075 INFO  ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    10:13:34.076 INFO  ValidateVariants - Executing as [email protected] on Linux v3.10.0-327.3.1.el7.x86_64 amd64
    10:13:34.076 INFO  ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_102-b14
    10:13:34.076 INFO  ValidateVariants - Start Date/Time: June 27, 2019 10:13:32 AM CDT
    10:13:34.076 INFO  ValidateVariants - ------------------------------------------------------------
    10:13:34.076 INFO  ValidateVariants - ------------------------------------------------------------
    10:13:34.076 INFO  ValidateVariants - HTSJDK Version: 2.19.0
    10:13:34.077 INFO  ValidateVariants - Picard Version: 2.19.0
    10:13:34.077 INFO  ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    10:13:34.077 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    10:13:34.077 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    10:13:34.077 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    10:13:34.077 INFO  ValidateVariants - Deflater: IntelDeflater
    10:13:34.077 INFO  ValidateVariants - Inflater: IntelInflater
    10:13:34.077 INFO  ValidateVariants - GCS max retries/reopens: 20
    10:13:34.077 INFO  ValidateVariants - Requester pays: disabled
    10:13:34.077 INFO  ValidateVariants - Initializing engine
    10:13:34.424 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    10:13:34.542 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/gpfs1/home/jlawlor/test_liftover/round_4/bad_simple.vcf
    10:13:34.559 INFO  ValidateVariants - Done initializing engine
    10:13:34.559 INFO  ProgressMeter - Starting traversal
    10:13:34.559 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
    10:13:34.637 INFO  ProgressMeter -             unmapped              0.0                     1            779.2
    10:13:34.637 INFO  ProgressMeter - Traversal complete. Processed 1 total variants in 0.0 minutes.
    10:13:34.637 INFO  ValidateVariants - Shutting down engine
    [June 27, 2019 10:13:34 AM CDT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=2356674560
    
  • james_lawlor (Member, Huntsville, AL)

    @bhanuGandham
    And here's the result from the full joint-genotyped VCF where I pulled that example from.

    [[email protected] round_4]$ ~/gatk4/gatk-4.1.2.0/gatk ValidateVariants -V batch_65.vcf.gz -R /gpfs/gpfs1/myerslab/reference/genomes/bwa-0.7.8/GRCh37.fa -D /gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    Using GATK jar /gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar ValidateVariants -V batch_65.vcf.gz -R /gpfs/gpfs1/myerslab/reference/genomes/bwa-0.7.8/GRCh37.fa -D /gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    10:15:44.511 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/gpfs1/home/jlawlor/gatk4/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jun 27, 2019 10:15:46 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    10:15:46.220 INFO  ValidateVariants - ------------------------------------------------------------
    10:15:46.220 INFO  ValidateVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
    10:15:46.220 INFO  ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    10:15:46.221 INFO  ValidateVariants - Executing as [email protected] on Linux v3.10.0-327.3.1.el7.x86_64 amd64
    10:15:46.221 INFO  ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_102-b14
    10:15:46.222 INFO  ValidateVariants - Start Date/Time: June 27, 2019 10:15:44 AM CDT
    10:15:46.222 INFO  ValidateVariants - ------------------------------------------------------------
    10:15:46.222 INFO  ValidateVariants - ------------------------------------------------------------
    10:15:46.222 INFO  ValidateVariants - HTSJDK Version: 2.19.0
    10:15:46.223 INFO  ValidateVariants - Picard Version: 2.19.0
    10:15:46.223 INFO  ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    10:15:46.223 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    10:15:46.223 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    10:15:46.223 INFO  ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    10:15:46.223 INFO  ValidateVariants - Deflater: IntelDeflater
    10:15:46.224 INFO  ValidateVariants - Inflater: IntelInflater
    10:15:46.224 INFO  ValidateVariants - GCS max retries/reopens: 20
    10:15:46.224 INFO  ValidateVariants - Requester pays: disabled
    10:15:46.224 INFO  ValidateVariants - Initializing engine
    10:15:46.563 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/gpfs2/cooperlab/resources/broad_gatk_bundles/b37/dbsnp_138.b37.vcf.gz
    10:15:46.730 INFO  FeatureManager - Using codec VCFCodec to read file file:///gpfs/gpfs1/home/jlawlor/test_liftover/round_4/batch_65.vcf.gz
    10:15:46.899 INFO  ValidateVariants - Done initializing engine
    10:15:46.899 INFO  ProgressMeter - Starting traversal
    10:15:46.899 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
    10:15:56.945 INFO  ProgressMeter -           1:17269714              0.2                123000         734913.4
    10:16:06.975 INFO  ProgressMeter -           1:39530623              0.3                239000         714321.3
    10:16:17.002 INFO  ProgressMeter -           1:63476360              0.5                355000         707594.2
    10:16:27.065 INFO  ProgressMeter -           1:87295124              0.7                482000         720012.0
    10:16:37.090 INFO  ProgressMeter -          1:110872045              0.8                610000         729228.9
    10:16:47.169 INFO  ProgressMeter -          1:156160822              1.0                751000         747635.6
    10:16:57.170 INFO  ProgressMeter -          1:179903119              1.2                878000         749679.8
    10:17:07.240 INFO  ProgressMeter -          1:203931093              1.3               1006000         751297.6
    10:17:17.274 INFO  ProgressMeter -          1:227224291              1.5               1133000         752199.2
    10:17:27.314 INFO  ProgressMeter -          1:247359065              1.7               1262000         754070.6
    10:17:37.403 INFO  ProgressMeter -           2:18161193              1.8               1395000         757438.6
    10:17:47.469 INFO  ProgressMeter -           2:41201923              2.0               1523000         757900.0
    10:17:57.511 INFO  ProgressMeter -           2:63005866              2.2               1652000         758894.7
    10:18:07.556 INFO  ProgressMeter -           2:86303551              2.3               1781000         759720.5
    10:18:17.596 INFO  ProgressMeter -          2:116797081              2.5               1919000         764049.7
    10:18:27.652 INFO  ProgressMeter -          2:139766583              2.7               2048000         764402.5
    10:18:37.696 INFO  ProgressMeter -          2:165606489              2.8               2176000         764416.2
    10:18:47.752 INFO  ProgressMeter -          2:189884748              3.0               2302000         763714.2
    10:18:57.758 INFO  ProgressMeter -          2:216422277              3.2               2428000         763290.0
    10:19:07.777 INFO  ProgressMeter -          2:238452796              3.3               2557000         763751.0
    10:19:17.835 INFO  ProgressMeter -           3:13651433              3.5               2689000         764876.6
    10:19:27.897 INFO  ProgressMeter -           3:35807885              3.7               2820000         765617.8
    10:19:37.956 INFO  ProgressMeter -           3:61525137              3.9               2946000         765006.0
    10:19:48.003 INFO  ProgressMeter -           3:84408843              4.0               3072000         764483.4
    10:19:58.075 INFO  ProgressMeter -          3:111450465              4.2               3198000         763929.5
    10:20:08.099 INFO  ProgressMeter -          3:134855120              4.4               3324000         763555.8
    10:20:18.147 INFO  ProgressMeter -          3:159147079              4.5               3453000         763802.9
    10:20:28.148 INFO  ProgressMeter -          3:182707324              4.7               3582000         764165.4
    10:20:38.184 INFO  ProgressMeter -            4:4095460              4.9               3713000         764818.0
    10:20:48.255 INFO  ProgressMeter -           4:22536479              5.0               3842000         764945.0
    10:20:58.271 INFO  ProgressMeter -           4:43843694              5.2               3973000         765584.4
    10:21:08.284 INFO  ProgressMeter -           4:67363806              5.4               4107000         766743.9
    10:21:18.337 INFO  ProgressMeter -           4:90919767              5.5               4239000         767383.3
    10:21:28.351 INFO  ProgressMeter -          4:116405423              5.7               4370000         767899.3
    10:21:38.380 INFO  ProgressMeter -          4:139248634              5.9               4503000         768690.2
    10:21:48.458 INFO  ProgressMeter -          4:162518019              6.0               4625000         767509.6
    10:21:58.500 INFO  ProgressMeter -          4:182192995              6.2               4748000         766630.8
    10:22:08.525 INFO  ProgressMeter -            5:6322093              6.4               4872000         765985.5
    10:22:18.547 INFO  ProgressMeter -           5:26658397              6.5               4992000         764770.3
    10:22:28.567 INFO  ProgressMeter -           5:51571223              6.7               5119000         764661.4
    10:22:38.648 INFO  ProgressMeter -           5:77672618              6.9               5254000         765612.1
    10:22:48.690 INFO  ProgressMeter -          5:102868864              7.0               5381000         765450.2
    10:22:58.767 INFO  ProgressMeter -          5:123797645              7.2               5508000         765235.6
    10:23:08.795 INFO  ProgressMeter -          5:148619833              7.4               5635000         765113.9
    10:23:18.869 INFO  ProgressMeter -          5:171764120              7.5               5764000         765183.5
    10:23:28.918 INFO  ProgressMeter -           6:10760671              7.7               5896000         765682.8
    10:23:38.961 INFO  ProgressMeter -           6:30703203              7.9               6027000         766045.1
    10:23:48.972 INFO  ProgressMeter -           6:47011052              8.0               6162000         766937.8
    10:23:59.028 INFO  ProgressMeter -           6:71345648              8.2               6290000         766872.1
    10:24:09.057 INFO  ProgressMeter -           6:93163453              8.4               6414000         766373.9
    10:24:19.104 INFO  ProgressMeter -          6:117792489              8.5               6545000         766685.2
    10:24:29.141 INFO  ProgressMeter -          6:142738293              8.7               6676000         767002.2
    10:24:39.140 INFO  ProgressMeter -          6:164919085              8.9               6806000         767246.4
    10:24:49.160 INFO  ProgressMeter -            7:9141331              9.0               6935000         767342.7
    10:24:59.232 INFO  ProgressMeter -           7:27777923              9.2               7059000         766820.0
    10:25:09.265 INFO  ProgressMeter -           7:48738522              9.4               7182000         766263.9
    10:25:19.307 INFO  ProgressMeter -           7:70238809              9.5               7311000         766342.8
    10:25:29.336 INFO  ProgressMeter -           7:92721750              9.7               7436000         766022.8
    10:25:39.403 INFO  ProgressMeter -          7:117523429              9.9               7561000         765665.7
    10:25:49.430 INFO  ProgressMeter -          7:141774604             10.0               7685000         765271.8
    10:25:59.475 INFO  ProgressMeter -             8:758987             10.2               7818000         765749.9
    10:26:09.481 INFO  ProgressMeter -           8:13221809             10.4               7954000         766550.9
    10:26:19.522 INFO  ProgressMeter -           8:31413544             10.5               8080000         766334.4
    10:26:29.545 INFO  ProgressMeter -           8:59142293             10.7               8207000         766240.7
    10:26:39.583 INFO  ProgressMeter -           8:83446712             10.9               8334000         766128.8
    10:26:49.644 INFO  ProgressMeter -          8:108541328             11.0               8460000         765905.4
    10:26:59.678 INFO  ProgressMeter -          8:132660339             11.2               8585000         765631.5
    10:27:09.731 INFO  ProgressMeter -            9:5725235             11.4               8715000         765781.3
    10:27:19.749 INFO  ProgressMeter -           9:24891649             11.5               8842000         765706.9
    10:27:29.773 INFO  ProgressMeter -           9:70504840             11.7               9007000         768871.8
    10:27:39.797 INFO  ProgressMeter -           9:93817802             11.9               9138000         769086.2
    10:27:49.811 INFO  ProgressMeter -          9:117083305             12.0               9267000         769139.3
    10:27:59.817 INFO  ProgressMeter -          9:139502510             12.2               9396000         769199.3
    10:28:09.855 INFO  ProgressMeter -          10:16050534             12.4               9526000         769306.4
    10:28:19.890 INFO  ProgressMeter -          10:36850478             12.5               9656000         769411.6
    10:28:29.925 INFO  ProgressMeter -          10:61915102             12.7               9787000         769594.7
    10:28:39.936 INFO  ProgressMeter -          10:84252756             12.9               9913000         769406.9
    10:28:49.980 INFO  ProgressMeter -         10:107864125             13.1              10038000         769116.8
    10:29:00.012 INFO  ProgressMeter -         10:129520977             13.2              10163000         768844.8
    10:29:10.014 INFO  ProgressMeter -          11:11686645             13.4              10296000         769204.9
    10:29:20.055 INFO  ProgressMeter -          11:34101452             13.6              10425000         769225.1
    10:29:30.086 INFO  ProgressMeter -          11:58043623             13.7              10557000         769473.7
    10:29:40.109 INFO  ProgressMeter -          11:81595405             13.9              10685000         769434.8
    10:29:50.155 INFO  ProgressMeter -         11:103906678             14.1              10815000         769517.2
    10:30:00.180 INFO  ProgressMeter -         11:127797244             14.2              10939000         769195.6
    10:30:10.210 INFO  ProgressMeter -          12:12986974             14.4              11067000         769155.0
    10:30:20.226 INFO  ProgressMeter -          12:34574700             14.6              11195000         769128.6
    10:30:30.278 INFO  ProgressMeter -          12:59749438             14.7              11319000         768798.0
    10:30:40.304 INFO  ProgressMeter -          12:83216935             14.9              11448000         768833.8
    10:30:50.341 INFO  ProgressMeter -         12:106033114             15.1              11575000         768727.6
    10:31:00.395 INFO  ProgressMeter -         12:127965137             15.2              11699000         768411.4
    10:31:10.406 INFO  ProgressMeter -          13:29902787             15.4              11826000         768332.9
    10:31:20.406 INFO  ProgressMeter -          13:53252429             15.6              11952000         768199.9
    10:31:30.458 INFO  ProgressMeter -          13:76521297             15.7              12082000         768283.5
    10:31:40.472 INFO  ProgressMeter -          13:99014234             15.9              12210000         768269.2
    10:31:50.503 INFO  ProgressMeter -          14:21136890             16.1              12340000         768365.4
    10:32:00.548 INFO  ProgressMeter -          14:43338730             16.2              12468000         768326.2
    10:32:10.566 INFO  ProgressMeter -          14:66480286             16.4              12596000         768308.8
    10:32:20.630 INFO  ProgressMeter -          14:89710432             16.6              12725000         768317.3
    10:32:30.700 INFO  ProgressMeter -          15:22455692             16.7              12861000         768738.0
    10:32:40.789 INFO  ProgressMeter -          15:43715171             16.9              12991000         768781.6
    10:32:50.800 INFO  ProgressMeter -          15:66804196             17.1              13119000         768765.7
    10:33:00.827 INFO  ProgressMeter -          15:91183034             17.2              13249000         768854.3
    10:33:10.910 INFO  ProgressMeter -           16:6886366             17.4              13380000         768958.2
    10:33:20.922 INFO  ProgressMeter -          16:26740096             17.6              13514000         769281.1
    10:33:30.940 INFO  ProgressMeter -          16:62232026             17.7              13641000         769199.7
    10:33:40.949 INFO  ProgressMeter -          16:82287173             17.9              13763000         768846.9
    10:33:50.991 INFO  ProgressMeter -           17:8052525             18.1              13897000         769142.1
    10:34:01.038 INFO  ProgressMeter -          17:34943347             18.2              14028000         769262.4
    10:34:11.110 INFO  ProgressMeter -          17:61066108             18.4              14158000         769310.2
    10:34:21.202 INFO  ProgressMeter -            18:524306             18.6              14290000         769450.3
    10:34:31.213 INFO  ProgressMeter -          18:24379355             18.7              14423000         769696.0
    10:34:41.254 INFO  ProgressMeter -          18:49307480             18.9              14554000         769811.9
    10:34:51.274 INFO  ProgressMeter -          18:70748336             19.1              14684000         769887.5
    10:35:01.340 INFO  ProgressMeter -          19:10107626             19.2              14820000         770242.9
    10:35:11.358 INFO  ProgressMeter -          19:33035849             19.4              14950000         770314.8
    10:35:21.368 INFO  ProgressMeter -          19:53427789             19.6              15081000         770441.8
    10:35:31.378 INFO  ProgressMeter -          20:13895934             19.7              15210000         770466.0
    10:35:41.446 INFO  ProgressMeter -          20:41619027             19.9              15336000         770300.4
    10:35:51.446 INFO  ProgressMeter -          20:62269362             20.1              15464000         770281.3
    10:36:01.498 INFO  ProgressMeter -          21:30298251             20.2              15599000         770575.9
    10:36:11.555 INFO  ProgressMeter -          22:18325067             20.4              15734000         770861.4
    10:36:21.591 INFO  ProgressMeter -          22:39245618             20.6              15871000         771253.1
    10:36:31.593 INFO  ProgressMeter -           X:11794878             20.7              16031000         772768.2
    10:36:41.606 INFO  ProgressMeter -           X:50224195             20.9              16167000         773105.4
    10:36:51.632 INFO  ProgressMeter -           X:90038216             21.1              16307000         773618.4
    10:37:01.686 INFO  ProgressMeter -          X:128089251             21.2              16442000         773870.5
    10:37:10.593 INFO  ProgressMeter -           Y:59003852             21.4              16578944         774901.7
    10:37:10.594 INFO  ProgressMeter - Traversal complete. Processed 16578944 total variants in 21.4 minutes.
    10:37:10.594 INFO  ValidateVariants - Shutting down engine
    [June 27, 2019 10:37:10 AM CDT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 21.44 minutes.
    Runtime.totalMemory()=3468689408
    

    I'm assuming that, with ValidateVariants, the absence of any warnings or errors in the output is good news.
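
    One way to make that check less of an eyeball exercise is to save the tool's output to a file and look for any WARN or ERROR lines. The following is only a minimal sketch: the helper name `flagged_lines` and the log file name are hypothetical, and it assumes the log uses the same `HH:MM:SS.mmm LEVEL  Source - message` layout as the lines pasted above. It does not replace checking the exit status of the run itself.

    ```shell
    # Hypothetical helper: print WARN/ERROR lines from a saved GATK log.
    # Assumes each log line starts with a timestamp followed by the level,
    # as in "10:37:10.594 INFO  ValidateVariants - Shutting down engine".
    # Returns 0 if any flagged line is found, 1 if none (grep's convention).
    flagged_lines() {
        grep -E '^[[:space:]]*[0-9:.]+[[:space:]]+(WARN|ERROR)[[:space:]]' "$1"
    }

    # Example use (validate.log is an assumed file name):
    #   if flagged_lines validate.log; then
    #       echo "ValidateVariants reported problems; see lines above"
    #   else
    #       echo "no WARN/ERROR lines found"
    #   fi
    ```

    If the helper prints nothing and the run exited with status 0, that is consistent with the reading above that a quiet log means the VCF passed validation.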

  • james_lawlor (Huntsville, AL; Member)

    Thanks @bhanuGandham! I built a jar from that source and it solved the problem.

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @james_lawlor,

    There is a new release of Picard at https://github.com/broadinstitute/picard/releases/tag/2.20.3
    that should also resolve this issue, if you are interested.
