AbstractVCFCodec error when running MuTect?
I just ran a whole group of paired BAM files through MuTect and keep getting warnings from "AbstractVCFCodec". Specifically, while it is processing reads near the beginning of chr2, it reports:
WARN AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 35003, likely resulting in degraded VCF processing performance
The "approximately line" value ranges from 34792 to 40487, and the reported length is always the same. Then in chr5 it warns again (with one constant length across about 20 different line numbers):
WARN AbstractVCFCodec - Allele detected with length 1857070 exceeding max size 1048576 at approximately line 100968, likely resulting in degraded VCF processing performance
The MuTect command I used was:
java -Xmx4g -jar $mutect_dir/muTect-1.1.4.jar -T MuTect \
--reference_sequence hg19.genome.fa --cosmic Cosmic.hg19.vcf --dbsnp dbsnp_138.hg19.vcf \
--intervals Regions.bed -dt NONE -rf BadCigar \
--input_file:normal normal.cocleaned.bam --input_file:tumor tumor.cocleaned.bam \
--out pair..mutect.out --coverage_file pair.coverage.wig.txt --vcf pair.mutect.vcf
The line numbers and warning messages are the same for every pair that I ran, though oddly, the warning messages do not appear exactly in the same place in the output files (sometimes the first set of warnings were during processing of chr1, sometimes during processing of chr2). My input files are BAMs, which would lead me to suspect that this error comes from the dbsnp and cosmic files, but looking at those VCF files there doesn't seem to be anything odd about those line ranges.
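Since the "approximately line" in the warning may not match the real line number exactly, a more thorough check than eyeballing those ranges would be to scan the whole dbSNP and COSMIC VCFs for any REF or ALT allele longer than the 1048576 limit quoted in the message. Below is a minimal sketch of such a scan. The function name `find_long_alleles` is my own, and I am assuming the codec measures each comma-separated ALT allele separately, which I have not confirmed.

```python
import gzip
from typing import Iterator, NamedTuple

# Threshold copied from the warning text ("exceeding max size 1048576").
MAX_ALLELE_SIZE = 1048576


class LongAllele(NamedTuple):
    line_number: int  # 1-based line number in the file, header lines included
    chrom: str
    pos: str
    field: str        # "REF" or "ALT"
    length: int


def find_long_alleles(path: str, max_size: int = MAX_ALLELE_SIZE) -> Iterator[LongAllele]:
    """Yield every REF/ALT allele in a VCF whose length exceeds max_size.

    Hypothetical helper for illustration; it reads plain-text or
    gzip-compressed VCFs and does not validate the file in any other way.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # not a complete variant record
            chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
            # Assumption: each comma-separated ALT allele is measured on its own.
            candidates = [("REF", ref)] + [("ALT", a) for a in alt.split(",")]
            for field, allele in candidates:
                if len(allele) > max_size:
                    yield LongAllele(line_number, chrom, pos, field, len(allele))
```

Calling `list(find_long_alleles("dbsnp_138.hg19.vcf"))` and the same for `Cosmic.hg19.vcf` would show whether either file actually contains records with alleles of the reported lengths (1133370 and 1857070), and on which lines.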
The final VCFs produced by MuTect look reasonable based on a quick skim-through, but the strange warnings make me nervous. Do you know what might cause these warnings?