Effect of cohort in HaplotypeCaller
I'm trying to measure the effect of cohort composition on the calls made for an individual sample. My cohort includes 46 Caucasian and 1 Japanese (J1) exome-seq samples from 1KG. I want to check whether the Japanese-specific variants get buried in such a cohort.
I'm only using chr22, and the average depth of coverage across all samples is around 10X. The input files to HC are the reduced BAM files. Here are the results and my questions:
1). The command:
java -Xmx4g -jar $gatkDir/GenomeAnalysisTK.jar -T HaplotypeCaller \
    -R $refGenome \
    --dbsnp $dbSNP \
    -stand_call_conf 50.0 \
    -stand_emit_conf 10.0 \
    -o $cohort.raw.var.vcf \
    -I $cohort.list
2). While the calls for the Caucasian samples look normal, all genotype calls for J1 are "./."
3). I then ran HC on J1 alone. The resulting VCF file contains only the header and no records; i.e., the last line in the file is the following:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT jpt.ERR034603
HC finished without reporting any errors. Here is a portion of its output:
INFO 15:13:25,888 MicroScheduler - 136025 reads were filtered out during the traversal out of approximately 2686707 total reads (5.06%)
INFO 15:13:25,888 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 15:13:25,889 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 15:13:25,889 MicroScheduler - -> 136025 reads (5.06% of total) failing HCMappingQualityFilter
INFO 15:13:25,889 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 15:13:25,890 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 15:13:25,890 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 15:13:25,890 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
4). I ran HC on one Caucasian sample alone, and the VCF file looks normal.
5). I then ran HC on a cohort of J1 plus four other Japanese samples, and the calls for J1 look normal.
Could anyone explain 2, 3, and 4? Should I increase the proportion of Japanese samples in the Caucasian cohort in order to get calls for J1? Thanks a lot!
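In case it helps anyone reproduce the observation in 2), here is a minimal Python sketch of how the genotypes in one sample column of a multi-sample VCF can be tallied. It is not part of GATK: the function name `tally_genotypes`, the sample name `ceu.EXAMPLE`, and the two demo records are made up for illustration. Only the `jpt.ERR034603` column name comes from my actual output.

```python
from collections import Counter


def tally_genotypes(vcf_lines, sample):
    """Count the GT values seen in one sample column of a VCF.

    vcf_lines: any iterable of text lines (an open file works).
    sample:    the column name as it appears on the #CHROM header line.
    """
    counts = Counter()
    col = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Raises ValueError if the sample is not in the header.
            col = fields.index(sample)
            continue
        if col is None:
            raise ValueError("record found before the #CHROM header line")
        # FORMAT is the 9th column; locate GT within it.
        gt_idx = fields[8].split(":").index("GT")
        counts[fields[col].split(":")[gt_idx]] += 1
    return counts


# Made-up two-record VCF, only to show the shape of the input.
DEMO_VCF = "\n".join([
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tceu.EXAMPLE\tjpt.ERR034603",
    "22\t100\t.\tA\tG\t60\t.\t.\tGT:DP\t0/1:12\t./.",
    "22\t200\t.\tC\tT\t75\t.\t.\tGT:DP\t1/1:9\t./.",
])

if __name__ == "__main__":
    for name in ("ceu.EXAMPLE", "jpt.ERR034603"):
        print(name, dict(tally_genotypes(DEMO_VCF.splitlines(), name)))
```

On a real file, the same function can be called with an open file handle in place of `DEMO_VCF.splitlines()`. A sample whose tally contains only "./." matches what I see for J1 in the mixed cohort.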