GATK UnifiedGenotyper Missing AD value in genotype
I have been using GATK (v2.2) UnifiedGenotyper to generate VCFs.
I did a multisample realignment around indels, which generated a multisample BAM of ~500 GB. After looking at some of the SNP calls I decided to try removing duplicates. I used MarkDuplicates with "REMOVE_DUPLICATES=true" and, although only 10% of reads were duplicates, the BAM shrank to ~75 GB. This did not seem to reduce the read depth at a site by more than the expected ~10%, but now the AD field in the genotype columns is missing, i.e. GT:AD:GQ 0/1:.:30
When I run UnifiedGenotyper with the old BAM prior to MarkDuplicates the AD field is present.
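To check how widespread the missing AD values are, something like the following rough sketch could be run over the data lines of each VCF. The function name `missing_ad_samples` and the example record are made up for illustration and are not part of GATK; it assumes plain tab-separated VCF records with the FORMAT column in position 9.

```python
def missing_ad_samples(vcf_line):
    """Return 0-based indices of sample columns whose AD value is missing.

    Hypothetical helper, not a GATK function. Expects one tab-separated
    VCF data line (not a '#' header line).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 ...
    keys = fields[8].split(":")
    if "AD" not in keys:
        # AD is not declared in FORMAT at this site, so nothing to report.
        return []
    ad_index = keys.index("AD")
    missing = []
    for i, sample in enumerate(fields[9:]):
        values = sample.split(":")
        # VCF allows trailing fields to be dropped, so a short column
        # is treated the same as an explicit "." value.
        if ad_index >= len(values) or values[ad_index] == ".":
            missing.append(i)
    return missing


# Made-up record: the first sample has the symptom described above.
record = "chr1\t100\t.\tA\tG\t50\t.\tDP=20\tGT:AD:GQ\t0/1:.:30\t0/0:10,0:45"
print(missing_ad_samples(record))
```

Running this on the VCFs produced from the BAM before and after MarkDuplicates would show whether AD is missing for every genotype or only for some samples or sites.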
I am currently running MarkDuplicates on each sample prior to realignment, because I think this makes the most sense, but it isn't clear to me why this should matter.
Any ideas on what is happening here?