
GATK UnifiedGenotyper Missing AD value in genotype

rwness (Posts: 10, Member)

I have been using the GATK (v2.2) UnifiedGenotyper to generate VCFs. I did a multisample realignment around indels, which generated a multisample BAM of ~500 GB. After looking at some of the SNP calls I decided to try removing duplicates. I used MarkDuplicates with "REMOVE_DUPLICATES=true" and, although only 10% of reads were duplicates, the BAM shrank to ~75 GB. This did not seem to affect the read depth at a site by more than the expected ~10%, but the AD field in the genotype columns is now missing, i.e. GT:AD:GQ 0/1:.:30. When I run UnifiedGenotyper on the old BAM from before MarkDuplicates, the AD field is present.
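To show how widespread the missing values are, here is a minimal Python sketch that counts genotype entries whose AD value is "." in records that declare AD in the FORMAT column. It is not part of GATK or Picard. The function name `count_missing_ad`, the sample names `s1` and `s2`, and the example record are all hypothetical and only for illustration; it assumes a plain tab-delimited, uncompressed VCF.

```python
def count_missing_ad(vcf_lines):
    """Return (missing, total) over genotype entries in records whose FORMAT lists AD."""
    missing = 0
    total = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # record has no genotype columns
        keys = fields[8].split(":")
        if "AD" not in keys:
            continue  # AD not declared for this record
        ad_index = keys.index("AD")
        for sample in fields[9:]:
            values = sample.split(":")
            total += 1
            # Trailing fields may be dropped in a VCF; count those as missing too.
            if ad_index >= len(values) or values[ad_index] == ".":
                missing += 1
    return missing, total


# Hypothetical two-sample record: s1 has AD missing, s2 has AD present.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t100\t.\tA\tG\t50\t.\t.\tGT:AD:GQ\t0/1:.:30\t0/1:12,9:45",
]
print(count_missing_ad(example))
```

On a real file the same function can be given an open file handle, e.g. `count_missing_ad(open("calls.vcf"))`, where `calls.vcf` is a placeholder path. Comparing the counts for the VCF called from the old BAM and the one called from the deduplicated BAM would show whether AD is missing everywhere or only at some sites.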

I am currently running MarkDuplicates on each sample prior to realignment, because I think this order makes the most sense, but it isn't clear to me why this should matter.

Any ideas on what is happening here?

