Does UnifiedGenotyper use the first rs ID it finds at a given position?

lindenb (France), Member ✭✭

Hi GATK team,

I run UnifiedGenotyper as follows:

java -jar GenomeAnalysisTK-2.1-13-g1706365/GenomeAnalysisTK.jar \
        -R /human_g1k_v37.fasta \
        -T UnifiedGenotyper \
        -glm BOTH \
        -S SILENT \
        -L ../align/capture.bed \
        -I myl.bam \
        --dbsnp broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz \
        -o output.vcf

When I look at the generated VCF, the variant at 18:55997929 (CTTCT/C) is annotated as rs4149608:

18 55997929 rs4149608 CTTCT C (...)

but dbsnp_135.b37.vcf.gz shows that the correct rs ID should be rs144384654, whose REF and ALT alleles (CTTCT/C) match the call:

$ gunzip -c broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz |grep -E -w '(rs4149608|rs144384654)'
18 55997929 rs4149608 CT C,CTTCT (...)
18 55997929 rs144384654 CTTCT C (...)

Does UnifiedGenotyper use the first rs ID it finds at a given position? Or should I use another method or tool to get the 'right' rs ID?
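
To make the expected behaviour concrete, here is a minimal sketch of an allele-aware lookup. It assumes a tab-separated VCF on stdin and compares REF and ALT as exact strings, with no allele normalization. The `match_rs` helper name is hypothetical; it is not a GATK tool or option.

```shell
# Hypothetical helper (not part of GATK): print the ID of every VCF record
# whose CHROM, POS, REF and one of its ALT alleles match the given call.
# Usage: match_rs CHROM POS REF ALT < records.vcf
match_rs() {
    awk -F'\t' -v c="$1" -v p="$2" -v r="$3" -v a="$4" '
        /^#/ { next }                          # skip VCF header lines
        $1 == c && $2 == p && $4 == r {
            n = split($5, alts, ",")           # ALT may list several alleles
            for (i = 1; i <= n; i++)
                if (alts[i] == a) { print $3; break }
        }'
}
```

With the two dbSNP records shown above, the following should print only rs144384654, because rs4149608 has a different REF allele (CT):

gunzip -c broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf.gz | match_rs 18 55997929 CTTCT C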

Thank you,

