Error when using BaseRecalibrator for BQSR

I am trying to use BaseRecalibrator to generate a .grp or .table file so that I can call variants afterwards.
My input is a BAM file that has been sorted, had a header added, been duplicate-marked and indexed, and been locally realigned around indels,
but I get an empty output file, and I don't know what went wrong.
Below is the output from my terminal. I am asking for help because I have tried many times and could not solve the problem.
=,=
I wanted to attach the output file, but sadly that file type is not allowed for upload.

~/alignment/try$ java -jar ~/biosoft/GATK/GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -R ~/biosoft/GATK/resource_bundle/hg19/ucsc.hg19.fasta \
    -I sort_addhead_rodu_realn.bam \
    -knownSites ~/biosoft/GATK/resource_bundle/hg19/dbsnp_138.hg19.vcf \
    -knownSites ~/biosoft/GATK/resource_bundle/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
    -knownSites ~/biosoft/GATK/resource_bundle/hg19/1000G_phase1.indels.hg19.sites.vcf \
    -o recal.table

INFO 21:51:12,487 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:51:12,491 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
INFO 21:51:12,491 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO 21:51:12,491 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO 21:51:12,492 HelpFormatter - [Tue Mar 07 21:51:12 CST 2017] Executing on Linux 3.10.0-229.el7.x86_64 amd64
INFO 21:51:12,492 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13
INFO 21:51:12,497 HelpFormatter - Program Args: -T BaseRecalibrator -R /home/sxj/biosoft/GATK/resource_bundle/hg19/ucsc.hg19.fasta -I sort_addhead_rodu_realn.bam -knownSites /home/sxj/biosoft/GATK/resource_bundle/hg19/dbsnp_138.hg19.vcf -knownSites /home/sxj/biosoft/GATK/resource_bundle/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -knownSites /home/sxj/biosoft/GATK/resource_bundle/hg19/1000G_phase1.indels.hg19.sites.vcf -o recal.table
INFO 21:51:12,503 HelpFormatter - Executing as [email protected] on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13.
INFO 21:51:12,503 HelpFormatter - Date/Time: 2017/03/07 21:51:12
INFO 21:51:12,504 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:51:12,504 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:51:12,528 GenomeAnalysisEngine - Strictness is SILENT
INFO 21:51:12,760 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 21:51:12,774 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 21:51:12,814 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
INFO 21:51:13,524 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 21:51:13,535 GenomeAnalysisEngine - Done preparing for traversal
INFO 21:51:13,537 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 21:51:13,538 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 21:51:13,539 ProgressMeter - Location | reads | elapsed | reads | completed | runtime | runtime
INFO 21:51:13,590 BaseRecalibrator - The covariates being used here:
INFO 21:51:13,591 BaseRecalibrator - ReadGroupCovariate
INFO 21:51:13,592 BaseRecalibrator - QualityScoreCovariate
INFO 21:51:13,592 BaseRecalibrator - ContextCovariate
INFO 21:51:13,593 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 21:51:13,593 BaseRecalibrator - CycleCovariate
INFO 21:51:13,597 ReadShardBalancer$1 - Loading BAM index data
INFO 21:51:13,598 ReadShardBalancer$1 - Done loading BAM index data
INFO 21:51:43,849 ProgressMeter - Starting 0.0 30.0 s 50.1 w 100.0% 30.0 s 0.0 s
INFO 21:52:13,853 ProgressMeter - Starting 0.0 60.0 s 99.7 w 100.0% 60.0 s 0.0 s
INFO 21:52:43,855 ProgressMeter - Starting 0.0 90.0 s 149.3 w 100.0% 90.0 s 0.0 s
INFO 21:53:13,866 ProgressMeter - Starting 0.0 120.0 s 199.0 w 100.0% 120.0 s 0.0 s
INFO 21:53:43,867 ProgressMeter - Starting 0.0 2.5 m 248.6 w 100.0% 2.5 m 0.0 s
INFO 21:54:13,869 ProgressMeter - Starting 0.0 3.0 m 298.2 w 100.0% 3.0 m 0.0 s
INFO 21:54:43,871 ProgressMeter - Starting 0.0 3.5 m 347.8 w 100.0% 3.5 m 0.0 s
INFO 21:54:44,965 BaseRecalibrator - Calculating quantized quality scores...
INFO 21:54:44,987 BaseRecalibrator - Writing recalibration report...
INFO 21:54:45,034 BaseRecalibrator - ...done!
INFO 21:54:45,035 BaseRecalibrator - BaseRecalibrator was able to recalibrate 0 reads
INFO 21:54:45,046 ProgressMeter - done 0.0 3.5 m 349.7 w 100.0% 3.5 m 0.0 s
INFO 21:54:45,047 ProgressMeter - Total runtime 211.51 secs, 3.53 min, 0.06 hours
INFO 21:54:45,047 MicroScheduler - 49562318 reads were filtered out during the traversal out of approximately 49562318 total reads (100.00%)
INFO 21:54:45,048 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 21:54:45,048 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 21:54:45,048 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 21:54:45,057 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 21:54:45,058 MicroScheduler - -> 42382982 reads (85.51% of total) failing MappingQualityUnavailableFilter
INFO 21:54:45,058 MicroScheduler - -> 7179336 reads (14.49% of total) failing MappingQualityZeroFilter
INFO 21:54:45,059 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter

INFO 21:54:45,059 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter

Done. There were no warn messages.

Best Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    Accepted Answer

    These lines in the output tell you what is wrong:

    INFO  21:54:45,047 MicroScheduler - 49562318 reads were filtered out during the traversal out of approximately 49562318 total reads (100.00%)
    ...
    INFO 21:54:45,058 MicroScheduler - -> 42382982 reads (85.51% of total) failing MappingQualityUnavailableFilter
    INFO 21:54:45,058 MicroScheduler - -> 7179336 reads (14.49% of total) failing MappingQualityZeroFilter
    

    Your data either doesn't have mapping qualities or has a mapping quality of 0.

    You should check what happened in earlier steps of processing.
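
    As a rough illustration of reading that filter summary, the sketch below pulls the per-filter counts out of the MicroScheduler lines and keeps only the filters that actually removed reads. The function name `summarize_filters` and the regular expression are assumptions made for this example, modeled on the log lines shown in this thread; they are not part of GATK.

```python
import re

# Pattern modeled on the GATK 3 MicroScheduler lines quoted in this thread, e.g.
#   "-> 42382982 reads (85.51% of total) failing MappingQualityUnavailableFilter"
FILTER_LINE = re.compile(r"->\s+(\d+) reads \(([\d.]+)% of total\) failing (\w+)")


def summarize_filters(log_text):
    """Return {filter_name: read_count} for filters that removed at least one read."""
    counts = {}
    for match in FILTER_LINE.finditer(log_text):
        n_reads = int(match.group(1))
        if n_reads > 0:
            counts[match.group(3)] = n_reads
    return counts


# Three lines copied from the log in the question
log = """\
INFO 21:54:45,048 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 21:54:45,058 MicroScheduler - -> 42382982 reads (85.51% of total) failing MappingQualityUnavailableFilter
INFO 21:54:45,058 MicroScheduler - -> 7179336 reads (14.49% of total) failing MappingQualityZeroFilter
"""

filter_counts = summarize_filters(log)
# Print the filters from most to fewest reads removed
for name, n_reads in sorted(filter_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {n_reads}")
```

    In this log the two mapping-quality filters together account for all 49562318 reads, which is why BaseRecalibrator reports 0 recalibrated reads.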

Answers

  • Chevy (China; Member)
    edited March 2017

    @Geraldine_VdAuwera said:
    These lines in the output tell you what is wrong:

    INFO  21:54:45,047 MicroScheduler - 49562318 reads were filtered out during the traversal out of approximately 49562318 total reads (100.00%)
    ...
    INFO 21:54:45,058 MicroScheduler - -> 42382982 reads (85.51% of total) failing MappingQualityUnavailableFilter
    INFO 21:54:45,058 MicroScheduler - -> 7179336 reads (14.49% of total) failing MappingQualityZeroFilter
    

    Your data either doesn't have mapping qualities or has a mapping quality of 0.

    You should check what happened in earlier steps of processing.

    Thanks. I checked the mapping qualities in my raw SAM file and found that I had used the -k parameter when mapping the raw FASTQ files with bowtie2. The bowtie2 help describes -k as "report up to <int> alns per read; MAPQ not meaningful", so I ended up with a SAM file in which every MAPQ is 255, which carries no real information.
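
    A check like the one described above can be sketched in a few lines. The example assumes plain-text SAM lines (for instance the output of `samtools view`), where MAPQ is the fifth tab-separated column and 255 means the mapping quality is not available. The function name `mapq_histogram` and the three example reads are hypothetical, made up for illustration.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count MAPQ values (SAM column 5), skipping blank lines and '@' header lines."""
    hist = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        hist[int(fields[4])] += 1
    return hist


# Hypothetical excerpt: one header line and three made-up alignments
example = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t16\tchr1\t300\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
]

hist = mapq_histogram(example)
# Reads with MAPQ 255 or 0 are the ones removed by the two filters in the log above
unusable = hist[255] + hist[0]
print(dict(hist), "unusable:", unusable)
```

    To run it on real data, one option is to pass `sys.stdin` as `sam_lines` and pipe the output of `samtools view` into the script.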

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    Oh, if you're running on RNAseq you need to do an additional processing step; it's in the docs.
  • Chevy (China; Member)

    @Geraldine_VdAuwera said:
    Oh, if you're running on RNAseq you need to do an additional processing step; it's in the docs.

    Sadly, I have run into another problem and need help. I am a beginner in bioinformatics, so sorry for bothering you with my small problems.
    Following the Best Practices, I produced an output.g.vcf file with HaplotypeCaller; the file is 287 MB. The next step is to merge the .g.vcf files into a .vcf file. I tested this step on the single output.g.vcf file and got a file of only 285 KB, =,= along with these warning messages:

    WARN 16:20:47,536 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    WARN 16:20:47,537 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
    WARN 16:20:47,537 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    WARN 16:20:47,615 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs

    My data is ChIP-seq data, and I want to call variants by following the Best Practices pipeline shown on the GATK website.
    I searched the forum but found nothing helpful, so I am turning to you for help. Can you figure out what is wrong here?
    I appreciate your help.
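
    One possible explanation for the size difference, offered as a sketch rather than a diagnosis: a GVCF from HaplotypeCaller also contains reference-block records, whose only ALT allele is `<NON_REF>`, while GenotypeGVCFs by default writes only variant sites. A much smaller genotyped VCF may therefore be expected. The snippet below counts both kinds of record in GVCF text lines. The function name `classify_gvcf_records` and the example records are hypothetical, made up for illustration.

```python
def classify_gvcf_records(vcf_lines):
    """Count reference-block records versus records carrying a real ALT allele."""
    counts = {"reference_block": 0, "variant_site": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        alt = line.split("\t")[4]  # ALT is the fifth VCF column
        real_alts = [a for a in alt.split(",") if a not in ("<NON_REF>", ".")]
        counts["variant_site" if real_alts else "reference_block"] += 1
    return counts


# Hypothetical GVCF excerpt: two reference blocks and one variant site
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "chr1\t1\t.\tA\t<NON_REF>\t.\t.\tEND=100\tGT:DP:GQ:MIN_DP:PL\t0/0:20:60:18:0,60,900\n",
    "chr1\t101\t.\tG\tT,<NON_REF>\t500\t.\tDP=30\tGT:AD:DP:GQ:PL\t0/1:15,15,0:30:99:500,0,500,545,545,1090\n",
    "chr1\t102\t.\tC\t<NON_REF>\t.\t.\tEND=250\tGT:DP:GQ:MIN_DP:PL\t0/0:22:63:20:0,63,945\n",
]

record_counts = classify_gvcf_records(example)
print(record_counts)
```

    If most records in output.g.vcf turn out to be reference blocks, the small output file is probably not an error in itself. The messages shown above are warnings about annotations that could not be calculated, not failures of the run.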
