
SNPs from genome-2-genome alignment


I have two strains of a small eukaryote (say S1 and S2), plus two reference genomes: G1 (closest to S1 and S2) and the sister species G2.
Using GATK I can call SNPs in the S1 and S2 genomic data. Since G1 and G2 are quite close, I also want to get G2 SNPs by aligning G2 to G1. I got a MAF file, reduced it (= removed weaker mappings of the same contig to other parts of the genome), then created a SAM file and a sorted BAM file. I used Picard to add fake read groups and run MarkDuplicates. In the end I am running:

java -Xmx240G -jar ~/soft/GATK_current/GenomeAnalysisTK.jar -T UnifiedGenotyper -R G1.fa \
-I G1_vs_G2.reduced.gr.dup.bam \
-o G1_vs_G2.reduced.gr.dup.gatk.vcf

I got no runtime errors, but apart from the VCF header the output file is empty.

Is there any way to pass some argument to UnifiedGenotyper so it will ignore coverage, and simply call every SNP it encounters?

Many thanks,

Darek Kedra


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    You mean that you're passing the "flat" G2 genome as a BAM file (presumably containing one huge "read" per chromosome?) to call variants against G1? I'm not sure it's possible to do this, as it is very different from what UG (UnifiedGenotyper) was designed for. If it were me, I would use a genome aligner like Mauve to identify the divergent regions between the two references and see where the SNPs map to.
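
As a rough illustration of the alignment-based route suggested above, single-base differences can be read directly off a pairwise alignment block instead of going through a genotyper. The sketch below is not part of GATK or Mauve. The function name `snps_from_alignment` and its parameters are hypothetical, and it assumes the two inputs are the gapped reference (G1) and query (G2) rows of one alignment block, with `ref_start` being the 0-based start of the block on the reference, as in a MAF `s` line.

```python
def snps_from_alignment(ref_aln, qry_aln, ref_start=0):
    """Return (pos, ref_base, alt_base) for each mismatch column.

    ref_aln, qry_aln: equal-length gapped sequences ('-' marks a gap).
    ref_start: 0-based reference coordinate of the first aligned base.
    pos is reported 1-based, as in VCF.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")

    snps = []
    ref_pos = ref_start  # 0-based coordinate of the current reference base
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r == "-":
            # Insertion in the query: the reference coordinate does not advance.
            continue
        # Report only unambiguous base substitutions; skip gaps and N.
        if r in "ACGT" and q in "ACGT" and r != q:
            snps.append((ref_pos + 1, r, q))
        ref_pos += 1
    return snps


if __name__ == "__main__":
    for pos, ref, alt in snps_from_alignment("ACG-TACGT", "ACGATTC-T", ref_start=100):
        print(pos, ref, alt)
```

Indels are deliberately ignored here, and blocks aligned to the reverse strand of the reference would need their coordinates converted first.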
