
Blank VCF Output

Amanda (North Carolina, Member)

Hello,

Based on my log file, a small number of positions had errors, but most positions completed without a problem. So I am confused: the output VCF contains only the header and the column line, and none of the data gathered is written to it. Can someone please help?

Example of tail of output file:

##contig=<ID=83384,length=12863>
##contig=<ID=83390,length=2699563>
##contig=<ID=decoy_9,length=11857114>
##reference=file:///path.../
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 101nt.Q25.fq.1.cor.CLCmappedRead_to_LongrangerFASTA

Tail of log file:

INFO 12:16:46,798 ProgressMeter - 83390:2699563 3.20002381E9 2.9 h 3.0 s 99.6% 2.9 h 38.0 s
Using AVX accelerated implementation of PairHMM
INFO 12:16:59,483 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
INFO 12:16:59,485 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
INFO 12:16:59,486 HaplotypeCaller - Ran local assembly on 0 active regions
INFO 12:16:59,592 ProgressMeter - done 3.214580487E9 2.9 h 3.0 s 100.0% 2.9 h 0.0 s
INFO 12:16:59,601 ProgressMeter - Total runtime 10343.33 secs, 172.39 min, 2.87 hours
INFO 12:16:59,605 MicroScheduler - 1545675113 reads were filtered out during the traversal out of approximately 1545675113 total reads (100.00%)
INFO 12:16:59,607 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 12:16:59,609 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 12:16:59,610 MicroScheduler - -> 15720430 reads (1.02% of total) failing HCMappingQualityFilter
INFO 12:16:59,611 MicroScheduler - -> 1529954683 reads (98.98% of total) failing MalformedReadFilter
INFO 12:16:59,612 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 12:16:59,617 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 12:16:59,620 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter


Done. There were 665 WARN messages, the first 10 are repeated below.
WARN 09:24:36,333 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
WARN 10:11:19,402 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:4882189, only considering the first record
WARN 10:11:22,825 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5068369, only considering the first record
WARN 10:11:30,607 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5371998, only considering the first record
WARN 10:11:35,391 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5632399, only considering the first record
WARN 10:11:35,560 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5645151, only considering the first record
WARN 10:11:35,563 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5645154, only considering the first record
WARN 10:11:37,884 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5745515, only considering the first record
WARN 10:11:37,888 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5745516, only considering the first record
WARN 10:11:37,919 GenotypingGivenAllelesUtils - Multiple valid VCF records detected in the alleles input file at site 1984:5747389, only considering the first record


Thank you,
Amanda
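
Editor's note: the MicroScheduler lines in the log tail above already hint at where the data went. As a rough sketch (not part of GATK; the function name, the regular expressions, and the assumption that each log entry sits on a single unwrapped line are all illustrative), the per-filter counts can be tallied like this:

```python
import re

# Pattern for the MicroScheduler summary line (assumed to be on one line).
TOTAL_RE = re.compile(
    r"(\d+) reads were filtered out during the traversal "
    r"out of approximately (\d+) total reads"
)
# Pattern for the per-filter lines that follow it.
FILTER_RE = re.compile(r"->\s+(\d+) reads \(([\d.]+)% of total\) failing (\w+)")


def summarize_filters(log_text):
    """Return (filtered, total, {filter_name: count}) parsed from log text.

    Hypothetical helper for illustration; filtered and total are None
    when the summary line is not found.
    """
    match = TOTAL_RE.search(log_text)
    filtered = int(match.group(1)) if match else None
    total = int(match.group(2)) if match else None
    per_filter = {m.group(3): int(m.group(1)) for m in FILTER_RE.finditer(log_text)}
    return filtered, total, per_filter


# Three lines copied from the log tail in the question.
SAMPLE_LOG = """\
INFO 12:16:59,605 MicroScheduler - 1545675113 reads were filtered out during the traversal out of approximately 1545675113 total reads (100.00%)
INFO 12:16:59,610 MicroScheduler - -> 15720430 reads (1.02% of total) failing HCMappingQualityFilter
INFO 12:16:59,611 MicroScheduler - -> 1529954683 reads (98.98% of total) failing MalformedReadFilter
"""

if __name__ == "__main__":
    filtered, total, per_filter = summarize_filters(SAMPLE_LOG)
    print(f"{filtered} of {total} reads filtered")
    # List filters from most to fewest reads removed.
    for name, count in sorted(per_filter.items(), key=lambda kv: -kv[1]):
        print(f"  {name}: {count}")
```

When the filtered count equals the total, as it does here, no reads reach the caller, which is consistent with "Ran local assembly on 0 active regions" and a header-only VCF.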

Answers

  • trevorconley (San Diego, Member)

    Hi,
    If you look at the log, 100% of the reads were filtered out: 98.98% failed MalformedReadFilter and the remaining 1.02% failed HCMappingQualityFilter. I'm no expert, but to me this looks like a problem with the BAM file and how it was created rather than with HaplotypeCaller. I don't remember all the specifics, but in my case HC needed a specifically processed BAM file in order to run correctly.

  • Sheila (Broad Institute; Member, Broadie admin)

    @Amanda
    Hi Amanda,

    @trevorconley is correct. The issue is that all your reads are getting filtered out. What kind of data are you working with and how did you pre-process it? Try running ValidateSamFile on your BAM file.

    Thanks,
    Sheila

  • Amanda (North Carolina, Member)

    Hi Sheila,

    I will try running that; however, the file has been processed identically to every other run. I exported the BAM from CLC and ran it through this process to ensure it is correctly formatted: http://gatkforums.broadinstitute.org/discussion/2909/howto-fix-a-badly-formatted-bam

    I'm not sure what has changed since the last time I ran the exact same process, when I didn't get any of these errors.

    Thanks,
    Amanda

  • Amanda (North Carolina, Member)

    Hi Sheila,

    It appears that NM tags are missing from the original file from CLC. Interestingly, their output must have changed between Workbench versions. I'm running the file through Picard's SetNmAndUqTags, and hopefully that will solve the problem. Do you have any other suggestions for this issue?

    Best,
    Amanda

  • Amanda (North Carolina, Member)

    Hi Sheila,

    I am still working on it. SetNmAndUqTags fixed the issue with the NM tags. There was one other problem, with base quality, that I was also able to fix.

    Thank you!
    Amanda
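
Editor's note: the thread traces the problem to NM tags missing from the CLC export. A quick way to check for that before a multi-hour HaplotypeCaller run is to scan the alignments as SAM text, for example the output of `samtools view -h` on the BAM. The sketch below is a minimal illustration under that assumption; the function name and the sample records are hypothetical, and whether a missing NM tag is what triggers MalformedReadFilter is taken from this thread rather than verified independently.

```python
def count_missing_tag(sam_lines, tag="NM"):
    """Count mapped SAM records that lack the given optional tag.

    Hypothetical helper for illustration. Returns (mapped_records,
    records_missing_tag). Header lines and unmapped reads are skipped.
    """
    mapped = missing = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # blank line or SAM header
        fields = line.split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        if int(fields[1]) & 0x4:
            continue  # FLAG bit 0x4: read is unmapped
        mapped += 1
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        tags = {field.split(":", 1)[0] for field in fields[11:]}
        if tag not in tags:
            missing += 1
    return mapped, missing


# Made-up records for demonstration only; they are not from the poster's data.
SAMPLE_SAM = [
    "@HD\tVN:1.6",
    "r1\t0\t83390\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
    "r2\t0\t83390\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

if __name__ == "__main__":
    mapped, missing = count_missing_tag(SAMPLE_SAM)
    print(f"{missing} of {mapped} mapped reads lack an NM tag")
```

Running ValidateSamFile, as Sheila suggests, remains the more thorough check; this only looks at one tag.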
