
After using GATK for realignment + recalibration, then samtools for calling: what step is missing?

Christine31 Posts: 2 Member
edited January 2013 in Ask the GATK team

Hello,

I am new to GATK (v2.1-3). I am doing per-sample exome sequencing with the following steps:
BWA (0.6.2): alignment
GATK: local realignment around indels
Picard (1.67): FixMateInformation
GATK: recalibration (BaseRecalibrator + PrintReads -BQSR)
Samtools: variant calling

Samtools seems to run properly, but no files (*.vcf or *.bcf) are created and no error message is printed:

cd Sample_09
+ samtools mpileup -BE -ug -q 20 -Q 20 -D -f human_g1k_v37.fasta realigned_fixed_recal.bam -C50
+ bcftools view -bvcg -
[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
[bcfview] 100000 sites processed.
[afs] 0:89274.054 1:6411.053 2:4314.893
[bcfview] 200000 sites processed.
[afs] 0:89100.642 1:6125.883 2:4773.474
[bcfview] 300000 sites processed.
[afs] 0:87374.996 1:7439.238 2:5185.766
[bcfview] 400000 sites processed.
[afs] 0:87890.186 1:7087.628 2:5022.185
[bcfview] 500000 sites processed.
[afs] 0:85322.061 1:8454.843 2:6223.096
[bcfview] 600000 sites processed.
[afs] 0:85864.368 1:8410.777 2:5724.854
[bcfview] 700000 sites processed.
[afs] 0:88813.814 1:6828.001 2:4358.185
[bcfview] 800000 sites processed.
[afs] 0:89070.318 1:6302.924 2:4626.758
[bcfview] 900000 sites processed.
[afs] 0:88364.380 1:6796.962 2:4838.658
[bcfview] 1000000 sites processed.
[afs] 0:86892.531 1:7268.088 2:5839.381
[bcfview] 1100000 sites processed.
[afs] 0:87184.845 1:7153.073 2:5662.081
[bcfview] 1200000 sites processed.
[afs] 0:86762.756 1:7241.236 2:5996.008
[bcfview] 1300000 sites processed.
[afs] 0:89346.143 1:6159.989 2:4493.868
[bcfview] 1400000 sites processed.
[afs] 0:88658.355 1:7160.555 2:4181.089
[bcfview] 1500000 sites processed.
[afs] 0:85985.305 1:8308.039 2:5706.656
[bcfview] 1600000 sites processed.
[afs] 0:87346.636 1:7708.883 2:4944.480
[afs] 0:63097.202 1:3950.127 2:3572.670
+ bcftools view .bcf

+ cd ..

I have seen that some groups run Picard AddOrReplaceReadGroups after realignment, and I wonder whether I should do the same before calling variants with samtools.

Thanks in advance for any advice you can give me.

Chris
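
One detail in the trace above may be worth checking: the last traced command is `bcftools view .bcf`, with nothing in front of the extension. That pattern typically appears when an output name is built from a shell variable that happens to be empty, in which case the results would land in a hidden file named `.bcf`. The wrapper script is not shown, so this is only a guess. The sketch below uses hypothetical helper names (`build_output_name`, `has_hidden_bcf`) to show how such a case could be made to fail loudly and how to look for a leftover hidden file.

```shell
#!/bin/sh
# Hypothetical helper: build the BCF output name for a sample.
# ${1:?...} aborts with a message if the argument is missing or empty,
# instead of silently producing ".bcf".
build_output_name() {
    sample="${1:?sample name is empty}"
    printf '%s.bcf\n' "$sample"
}

# Hypothetical helper: succeed if a hidden ".bcf" file exists in the
# given directory (directory layout is an assumption).
has_hidden_bcf() {
    [ -f "$1/.bcf" ]
}
```

Running `ls -a` inside the sample directory would also reveal such a file, since a plain `ls` does not list names that start with a dot.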


Best Answer

  • Mark_DePristo Posts: 153 Administrator, GATK Dev admin
    Answer ✓

    Hi Christine31, you should contact the samtools mailing list for help with that tool.

    --
    Mark A. DePristo, Ph.D.
    Co-Director, Medical and Population Genetics
    Broad Institute of MIT and Harvard

Answers

  • Christine31 Posts: 2 Member

    Hi Mark,

    I forgot to mention in my previous message that my pipeline [BWA + Samtools] was previously running fine. Since I modified it by adding the 2 new GATK steps [realignment around indels + score recalibration], variant calling with Samtools seems to work but does not generate any result file...

    I also tried to use GATK for the variant calling step (instead of samtools), but I then receive an error message.
    Command line:
    java -Xmx12g -jar GenomeAnalysisTK.jar -glm BOTH -T UnifiedGenotyper -R GenomeDeReference/hg19/human_g1k_v37.fasta -I chr22_RFR.bam -D hg19_snp132.txt -o chr22_snps.vcf -metrics snps.metrics -stand_call_conf 50.0 -stand_emit_conf 10.0 -dcov 1000 -A DepthOfCoverage -A AlleleBalance -L ILLUMINA.bed

    Resulting message:

    ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
    ERROR Name FeatureType Documentation
    ERROR BCF2 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_bcf2_BCF2Codec.html
    ERROR VCF VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_vcf_VCFCodec.html
    ERROR VCF3 VariantContext http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_utils_codecs_vcf_VCF3Codec.html
  • Mark_DePristo Posts: 153 Administrator, GATK Dev admin

    Unfortunately, these are all considered "User Errors", and we don't have the resources to help everyone work through them. We do wish you the best in figuring this out on your own.

    --
    Mark A. DePristo, Ph.D.
    Co-Director, Medical and Population Genetics
    Broad Institute of MIT and Harvard
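
The error quoted in this thread lists only VCF, VCF3, and BCF2 as accepted types and says the type of one input file could not be determined. Presumably this concerns the `-D hg19_snp132.txt` argument, since a `.txt` name gives no hint of the format, though the message does not name the file. A quick way to see whether such a file is actually VCF is to inspect its first line, which in a VCF file begins with `##fileformat=VCF`. This is a minimal sketch; the helper name `looks_like_vcf` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first line of the file looks like
# a VCF header line ("##fileformat=VCFv4.1" and similar).
looks_like_vcf() {
    head -n 1 "$1" | grep -q '^##fileformat=VCF'
}

# Example use (file name taken from the command line quoted in the thread):
#   if looks_like_vcf hg19_snp132.txt; then echo "VCF"; else echo "not VCF"; fi
```

If the check succeeds, the explicit type tag the message asks for is the next thing to try; if it fails, the file is in some other format and would need to be replaced by a VCF version of the same resource.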
