Tribble issue: Vcf files with single ended breakpoints fail

LouisB (Broad Institute)

I'm running into a problem with VCFs that contain single breakends (these are produced by an old version of Strelka). Tribble doesn't accept "." as part of an alternate allele.

Single breakends are valid in the VCF standard, and the files validate with VCFtools.

Others have run into this problem as well:
https://groups.google.com/forum/#!searchin/strelka-discuss/gatk/strelka-discuss/gJfsyjZNZXA/ExDXZiVWW_kJ

Example error:

##### ERROR stack trace
org.broad.tribble.TribbleException: The provided VCF file is malformed at approximately line number 1807: Unparsable vcf record with allele .CCCAGGAGGACTCACTGCCGCTGTCACCTCTGCTGCCACCACTGTTGCCAC, for input source: /cga/tcga-gsc/benchmark/Indels/strelkaPON/NA18606.mapped.ILLUMINA.bwa.CHB.exome.20111114.bam-NA18608.mapped.ILLUMINA.bwa.CHB.exome.20111114.bam/final.indels.vcf
at org.broadinstitute.variant.vcf.AbstractVCFCodec.generateException(AbstractVCFCodec.java:715)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.checkAllele(AbstractVCFCodec.java:527)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.parseSingleAltAllele(AbstractVCFCodec.java:553)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.parseAlleles(AbstractVCFCodec.java:494)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.parseVCFLine(AbstractVCFCodec.java:291)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:234)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:213)
at org.broadinstitute.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:45)
at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:73)
at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:35)
at org.broad.tribble.TribbleIndexedFeatureReader$WFIterator.readNextRecord(TribbleIndexedFeatureReader.java:284)
at org.broad.tribble.TribbleIndexedFeatureReader$WFIterator.next(TribbleIndexedFeatureReader.java:264)
at org.broad.tribble.TribbleIndexedFeatureReader$WFIterator.next(TribbleIndexedFeatureReader.java:225)
at org.broadinstitute.sting.tools.CatVariants.execute(CatVariants.java:239)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.tools.CatVariants.main(CatVariants.java:258)
##### ERROR ------------------------------------------------------------------------------------------

Example VCF line:

19  36002413    .   C   .CCCAGGAGGACTCACTGCCGCTGTCACCTCTGCTGCCACCACTGTTGCCAC    .   PASS    IHP=1;NT=ref;QSI=82;QSI_NT=82;SGT=ref->hom;SOMATIC;SVTYPE=BND;TQSI=1;TQSI_NT=1  DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50   49:49:42,44:0,0:7,6:43.72:0.85:0.00 11:11:0,0:6,6:5,5:14.61:0.48:0.0

A full VCF is available at:
/humgen/gsa-scr1/pub/incoming/BreakendBug/breakend.vcf
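
For reference, the VCF 4.1 specification writes a single breakend as the ALT bases with a "." on the side where the mate would be (for example `G.` or `.A`), which is the form the allele above takes. A minimal sketch of how such alleles could be recognized follows. It is written in Python rather than Tribble's Java, the function names are hypothetical, the accepted base alphabet (A, C, G, T, N) is an assumption, and it is not Tribble's parsing logic.

```python
import re

# Assumption: a single breakend is one or more bases (A, C, G, T, N, any case)
# with a "." on exactly one side, following the VCF 4.1 breakend notation.
_SINGLE_BREAKEND = re.compile(r"(?:\.[ACGTNacgtn]+|[ACGTNacgtn]+\.)")


def is_single_breakend(alt):
    """Return True if one ALT allele string looks like a single breakend."""
    return _SINGLE_BREAKEND.fullmatch(alt) is not None


def single_breakend_alts(vcf_line):
    """Return the single breakend ALT alleles found in one tab-delimited VCF line.

    Header lines and lines with fewer than five columns yield an empty list.
    """
    if vcf_line.startswith("#"):
        return []
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return []
    # Column 5 (index 4) is ALT; multiple alleles are comma-separated.
    return [alt for alt in fields[4].split(",") if is_single_breakend(alt)]
```

A bare "." (no alternate allele) and mated breakends such as `G]17:198982]` are deliberately not matched, so only the case reported here is flagged.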

Answers

  • Geraldine_VdAuwera (Administrator, GATK Dev admin)

    Thanks for the bug report -- I've put this in the bug tracker, we'll look into this.

    Geraldine Van der Auwera, PhD

  • LouisB (Broad Institute)
    edited October 2013

    Oh well. Thanks for the update. Is there any chance of having those variants just cleanly ignored with a printed warning instead of crashing?

    Not a huge deal either way, but it would save some pain if something like that sneaks into a file.

  • Geraldine_VdAuwera (Administrator, GATK Dev admin)

    I'll add that as a "bonus points" feature in the bugfix request.

    Geraldine Van der Auwera, PhD

  • LouisB (Broad Institute)

    Thanks!

  • LouisB (Broad Institute)

    Thanks. I think this will save confusion in the future.
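
As a stopgap for the skip-with-a-warning behaviour requested above, the same effect can be approximated outside GATK by pre-filtering the VCF before it reaches Tribble. The sketch below is only an illustration of that idea: `drop_single_breakends` is a hypothetical helper, not a GATK or Tribble feature, and it assumes tab-delimited records and the single breakend pattern described in the VCF 4.1 specification.

```python
import re
import sys

# Assumption: single breakend = bases with a "." on exactly one side.
_SINGLE_BREAKEND = re.compile(r"(?:\.[ACGTNacgtn]+|[ACGTNacgtn]+\.)")


def drop_single_breakends(lines, warn=sys.stderr):
    """Yield VCF lines unchanged, skipping records with a single breakend ALT.

    A warning naming the input line number and the CHROM:POS of each skipped
    record is written to `warn`. Header lines always pass through.
    """
    for number, line in enumerate(lines, start=1):
        if not line.startswith("#"):
            fields = line.rstrip("\n").split("\t")
            alts = fields[4].split(",") if len(fields) > 4 else []
            if any(_SINGLE_BREAKEND.fullmatch(alt) for alt in alts):
                print(
                    f"WARNING: skipping single breakend record at line "
                    f"{number}: {fields[0]}:{fields[1]}",
                    file=warn,
                )
                continue
        yield line


# Possible usage as a stdin-to-stdout filter:
#     sys.stdout.writelines(drop_single_breakends(sys.stdin))
```

Note that this drops the affected variants from the output entirely, so it only makes sense when losing those records is acceptable for the downstream analysis.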
