GenotypeGVCFs problem with rsID

mahyarhey (Boston), Member

I ran GenotypeGVCFs on three gVCF files output by HaplotypeCaller, using the following command:

java -jar data/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar \
-R data/ucsc.hg19.fasta \
-T GenotypeGVCFs \
--variant data/47V_post.ERC.vcf \
--variant data/48V_post.ERC.vcf \
--variant data/49V_post.ERC.vcf \
--out data/Combined_geno_3files.vcf

but in the final VCF output there is no rsID information: the ID column is "." in every row.
What is the problem? I am really confused. Could you please advise how to get SNP IDs into the output VCF?

Thanks

Answers

  • Kurt, Member ✭✭✭

    I typically run VariantAnnotator after GenotypeGVCFs to do that,

    e.g.

    $JAVA_1_7/java -jar $GATK_DIR/GenomeAnalysisTK.jar \
    -T VariantAnnotator \
    -R $REF_GENOME \
    --variant $CORE_PATH/$PROJECT/TEMP/out.vcf \
    --dbsnp $DBSNP \
    -L $CORE_PATH/$PROJECT/TEMP/out.vcf \
    -A GCContent \
    -A VariantType \
    --disable_auto_index_creation_and_locking_when_reading_rods \
    -o $CORE_PATH/$PROJECT/MULTI_SAMPLE/new.vcf

  • mahyarhey (Boston), Member

    Thanks, Kurt, for your prompt reply.
    Do you know why GenotypeGVCFs itself can't annotate rsIDs in the final VCF? It is an absolute waste of time to run another script just to annotate the rsIDs of SNPs.
    Thanks

  • Kurt, Member ✭✭✭

    @mahyarhey,

    Actually, there is a --dbsnp flag available in GenotypeGVCFs, so that might do what you want. I think it wasn't available when I originally implemented this in my workflows, so I just added VariantAnnotator afterwards. I never found it cumbersome.

  • mahyarhey (Boston), Member

    Thanks to both Kurt and Geraldine for the informative answers!
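
For reference, the single-step approach Kurt mentions (passing --dbsnp directly to GenotypeGVCFs) might look roughly like the sketch below. The jar, reference, and gVCF paths are taken from the original question. The dbSNP file name is a hypothetical placeholder, and it is assumed to be a dbSNP VCF built on the same reference (hg19 with matching contig names). The two function names are made up for illustration. The small awk helper is just one possible way to check afterwards how many records received an ID.

```shell
# Paths from the original question, except DBSNP, which is an assumed placeholder.
GATK_JAR=data/GenomeAnalysisTK-3.2-2/GenomeAnalysisTK.jar
REF=data/ucsc.hg19.fasta
DBSNP=data/dbsnp_138.hg19.vcf   # assumption: dbSNP VCF matching the reference build
OUT=data/Combined_geno_3files.vcf

# Joint-genotype the three gVCFs and ask GATK to fill the ID column from dbSNP.
run_genotype_gvcfs() {
  java -jar "$GATK_JAR" \
    -T GenotypeGVCFs \
    -R "$REF" \
    --variant data/47V_post.ERC.vcf \
    --variant data/48V_post.ERC.vcf \
    --variant data/49V_post.ERC.vcf \
    --dbsnp "$DBSNP" \
    -o "$OUT"
}

# Count data records whose ID column (field 3) is something other than ".".
count_annotated() {
  awk -F'\t' '!/^#/ && $3 != "." { n++ } END { print n + 0 }' "$1"
}

# Usage (needs GATK 3.x and the input files in place):
#   run_genotype_gvcfs && count_annotated "$OUT"
```

Only variants that match a dbSNP record would be expected to receive an rsID, so some rows may still show "." after this step.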
