Does UnifiedGenotyper use the first rs ID it finds at a given position?

Hi GATK team,

I run UnifiedGenotyper as follows:

java -jar GenomeAnalysisTK-2.1-13-g1706365/GenomeAnalysisTK.jar \
        -R /human_g1k_v37.fasta \
        -T UnifiedGenotyper \
        -glm BOTH \
        -S SILENT \
        -L ../align/capture.bed \
        -I myl.bam \
        --dbsnp broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz \
        -o output.vcf

When I look at the generated VCF, the variant at 18:55997929 (CTTCT/C) is annotated as rs4149608:

18 55997929 rs4149608 CTTCT C (...)

But dbsnp_135.b37.vcf.gz shows that the right rs ID should be rs144384654, whose REF and ALT alleles match exactly:

$ gunzip -c broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz |grep -E -w '(rs4149608|rs144384654)'
18 55997929 rs4149608 CT C,CTTCT (...)
18 55997929 rs144384654 CTTCT C (...)

Does UnifiedGenotyper use the first rs ID it finds at a given position? Or should I use another method or tool to get the 'right' rs ID?
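
To make 'right' concrete, here is a minimal sketch of the allele-aware lookup I have in mind. It matches on CHROM, POS, REF and ALT rather than on position alone. The function name find_rs_ids is hypothetical (it is not part of GATK), it assumes tab-separated VCF lines, and it does no allele normalization or left-alignment, so it only finds records written with exactly the same REF/ALT representation.

```python
def find_rs_ids(vcf_lines, chrom, pos, ref, alt):
    """Return the IDs of VCF records that match CHROM, POS and REF exactly
    and list `alt` among their comma-separated ALT alleles.

    Hypothetical helper for illustration; not a GATK function.
    """
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        rec_chrom, rec_pos, rec_id, rec_ref, rec_alt = fields[:5]
        if (rec_chrom == chrom
                and rec_pos == str(pos)
                and rec_ref == ref
                and alt in rec_alt.split(",")):
            hits.append(rec_id)
    return hits


# The two dbSNP records quoted above (remaining columns omitted).
records = [
    "18\t55997929\trs4149608\tCT\tC,CTTCT",
    "18\t55997929\trs144384654\tCTTCT\tC",
]

print(find_rs_ids(records, "18", 55997929, "CTTCT", "C"))
# prints ['rs144384654']
```

For the real file, the same function could be fed the lines of dbsnp_135.b37.vcf.gz opened with Python's gzip.open(path, "rt"). Scanning the whole file this way is slow, so this is only meant to show the matching rule, not a practical annotation tool.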

Thank you,

Pierre
