ERROR MESSAGE: Graph must have ref source and sink vertices

suskraem, Edinburgh (Member)

Hello,

I am running GATK 3.5-0 and encountered the following error:
ERROR MESSAGE: Graph must have ref source and sink vertices

The error occurs towards the end of the file processing and the resulting gvcf file seems rather complete to me, but I'd like to make sure.

Any help is greatly appreciated,
Susanne

INFO 15:31:52,917 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalStateException: Graph must have ref source and sink vertices
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.graphs.BaseGraph.removePathsNotConnectedToRef(BaseGraph.java:576)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.createGraph(ReadThreadingAssembler.java:211)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.assemble(ReadThreadingAssembler.java:127)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngine.runLocalAssembly(LocalAssemblyEngine.java:169)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:1029)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:865)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:228)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Graph must have ref source and sink vertices
ERROR ------------------------------------------------------------------------------------------

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Can you please post the command line you ran? Are you running this on a GVCF file or on a VCF file?

  • suskraem, Edinburgh (Member)

    Hi Geraldine,
    Here is the command. I am running this on a .bam file; the desired output is a gVCF.
    Thanks

    java -jar ~/Applications/GenomeAnalysisTK.jar -R Chlamydomonas.MA.MP.scaffolds.fasta -T HaplotypeCaller -I C1_stampy.bam --emitRefConfidence GVCF -o C1.raw.snps.indels.g.vcf

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    Ah ok, this looked like an error from VQSR. That's why it helps if you post the command line right away.

    Have you validated your bam file? And what is the assembly status of your reference? (How many contigs, avg length, N50 etc)
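
For anyone who needs to produce the reference statistics asked about here (contig count, average length, N50), the following is a minimal sketch rather than an official GATK procedure. It assumes a FASTA index in the `samtools faidx` layout, where column 2 holds each contig's length; the function name `contig_stats` is made up for illustration.

```shell
#!/bin/sh
# contig_stats FAI: print contig count, mean length, and N50 from a FASTA index.
# Assumption: column 2 of the index holds each contig's length (samtools faidx layout).
contig_stats() {
    cut -f2 "$1" | sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            if (NR == 0) { print "no contigs"; exit 1 }
            half = total / 2
            # Walk contigs from longest to shortest until half the bases are covered.
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= half) { n50 = len[i]; break }
            }
            printf "contigs=%d mean=%.1f N50=%d\n", NR, total / NR, n50
        }'
}
```

It could be called as `contig_stats Chlamydomonas.MA.MP.scaffolds.fasta.fai`, assuming the index sits next to the reference under that name.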

  • suskraem, Edinburgh (Member)

    here are the stats from the reference:
    N50 43268
    Number of contigs in N50 700
    Max_contig_size 855296
    Number of contigs (>200) 23127
    Number of bases in contigs (>200) 132563229
    Number of contigs >=1kb 12318
    Number of bases in contigs >=1kb 127611722
    Number of contigs >=10kb 2107
    Number of bases in contigs >=10kb 96661687
    GC Content of contigs 65.8291946102188
    Reads used -/-
    Expected coverage -
    Coverage cutoff -

    I validated the bam file using picard tools:
    java -Xmx2g -jar ~/Applications/picard-tools-2.1.1/picard.jar AddOrReplaceReadGroups INPUT=C1_stampy OUTPUT=C1_stampy.bam CREATE_INDEX=True VALIDATION_STRINGENCY=LENIENT SORT_ORDER=coordinate RGID=C1 RGLB=Cincerta RGPL=illumina RGSM=C1 RGPU=unit1

  • suskraem, Edinburgh (Member)

    Here is the command for the validation step:
    java -Xmx2g -jar ~/Applications/picard-tools-2.1.1/picard.jar FixMateInformation INPUT=C1_stampy.bam VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=True
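
Note that AddOrReplaceReadGroups and FixMateInformation modify the file but do not report validation problems. For an explicit check, Picard also provides a ValidateSamFile tool. The sketch below wraps it in a small function; the function name `validate_bam` and the `PICARD_JAR` variable are assumptions for illustration, and the default jar path is simply the one used in the commands above.

```shell
#!/bin/sh
# validate_bam BAM: summarize validation errors and warnings for a BAM file.
# PICARD_JAR is an assumed variable; the default is the jar path used earlier in this thread.
PICARD_JAR="${PICARD_JAR:-$HOME/Applications/picard-tools-2.1.1/picard.jar}"

validate_bam() {
    # MODE=SUMMARY prints a count per error/warning type instead of listing every record.
    java -Xmx2g -jar "$PICARD_JAR" ValidateSamFile INPUT="$1" MODE=SUMMARY
}
```

For the file in this thread that would be `validate_bam C1_stampy.bam`; any errors it reports are worth fixing before rerunning HaplotypeCaller.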
