Not all sites emitted with GENOTYPE_GIVEN_ALLELES
I am running HaplotypeCaller (HC) 3.3-0 with the following options (including GENOTYPE_GIVEN_ALLELES):
$java7 -Djava.io.tmpdir=tmp -Xmx3900m \
    -jar $jar \
    --analysis_type HaplotypeCaller \
    --reference_sequence $ref \
    --input_file $BAM \
    --intervals $CHROM \
    --dbsnp $dbSNP \
    --out $out \
    -stand_call_conf 0 \
    -stand_emit_conf 0 \
    -A Coverage -A FisherStrand -A HaplotypeScore -A MappingQualityRankSumTest \
    -A QualByDepth -A RMSMappingQuality -A ReadPosRankSumTest \
    -L $allelesVCF \
    -L 20:60000-70000 \
    --interval_set_rule INTERSECTION \
    --genotyping_mode GENOTYPE_GIVEN_ALLELES \
    --alleles $allelesVCF \
    --emitRefConfidence NONE \
    --output_mode EMIT_ALL_SITES
The file $allelesVCF contains these neighbouring SNPs:
20 60807 . C T 118.96 .
20 60808 . G A 46.95 .
20 61270 . A C 2870.18 .
20 61271 . T A 233.60 .
I am unable to call these neighbouring SNPs, despite reads being present in the file $BAM (which shouldn't matter anyway). I also tried adding --interval_merging OVERLAPPING_ONLY to the command line, but that didn't solve the problem. What am I doing wrong? I should probably add "GATK breaker/misuser" to my CV...
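In case it helps to narrow this down, here is a rough sketch of how the sites in $allelesVCF that never show up in $out could be listed. It assumes both files are uncompressed VCFs; vcf_sites and missing_sites are hypothetical helper names of my own, not GATK tools.

```shell
# Print a sorted, unique list of CHROM:POS keys for all non-header VCF records.
vcf_sites() {
    awk '!/^#/ && NF >= 2 { print $1 ":" $2 }' "$1" | sort -u
}

# Print the sites present in the first VCF but absent from the second.
missing_sites() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    vcf_sites "$1" > "$tmp_a"
    vcf_sites "$2" > "$tmp_b"
    # comm -23 keeps only the lines unique to the first (sorted) file.
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Running missing_sites "$allelesVCF" "$out" would then print one CHROM:POS per given site that was not emitted. Note that it compares positions only, not alleles.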
Thank you as always.
P.S. The CommandLineGATK documentation does not say what the default value for --interval_merging is.
P.P.S. Iterative testing is a bit slow, because HC always has to do this step:
HCMappingQualityFilter - Filtering out reads with MAPQ < 20