Reference confidence model for processing multiple samples with low coverage

Cecilia (Melbourne, Australia), Member

Hi there,

I am using the reference confidence model for processing multiple samples with low coverage. Is there a way of specifying a minimum and a maximum read depth? My script looks like this:

java -jar "$gatk_dir/GenomeAnalysisTK.jar" -T HaplotypeCaller -R ref_loci.fa -I "${i}.sorted.bam" -stand_call_conf 30 -stand_emit_conf 5 -o "./${i}.g.vcf" -ERC GVCF --variant_index_type LINEAR --variant_index_parameter 128000

I included -stand_call_conf and -stand_emit_conf, but the output looks the same with or without those parameters. Is there another way of specifying a depth range?

Regards,

Cecilia
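
The command in the question uses `$i` as a per-sample file prefix, which suggests it runs inside a loop over samples. The following is a minimal dry-run sketch of such a loop, not the poster's actual script: the sample prefixes `sampleA` and `sampleB`, the fallback value of `gatk_dir`, and the helper name `build_hc_cmd` are all hypothetical. It only prints each command so the expanded file names can be checked before anything is run.

```shell
#!/usr/bin/env bash

# Hypothetical fallback; the original post does not say where GATK is installed.
gatk_dir="${gatk_dir:-/path/to/gatk}"

# Print (do not run) the HaplotypeCaller command for one sample prefix.
build_hc_cmd() {
    local i="$1"
    printf '%s ' java -jar "$gatk_dir/GenomeAnalysisTK.jar" \
        -T HaplotypeCaller -R ref_loci.fa \
        -I "${i}.sorted.bam" \
        -stand_call_conf 30 -stand_emit_conf 5 \
        -o "./${i}.g.vcf" -ERC GVCF \
        --variant_index_type LINEAR --variant_index_parameter 128000
    printf '\n'
}

# Hypothetical sample prefixes, used only for illustration.
for i in sampleA sampleB; do
    build_hc_cmd "$i"
done
```

Once the printed commands look right, the `printf` wrapper could be dropped so that each command is executed instead of echoed.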

Answers

  • Cecilia (Melbourne, Australia), Member

    Hi Geraldine, thanks for getting back to me about this. I am a bit confused because the log that I get when I run the script says that the target coverage is 250:

    "INFO 17:28:22,100 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250"

    Does it mean that reads with a coverage lower than 250 will be discarded?

    Also, is there a way of changing the minimum mapping quality required for reads? I think the default is 20, but I would like to set it to 30:

    "INFO 17:28:22,208 HCMappingQualityFilter - Filtering out reads with MAPQ < 20"

    Regards,

    Cecilia
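
On the mapping-quality question above: the GATK 3.x HaplotypeCaller documentation lists a `--min_mapping_quality_score` (`-mmq`) argument with a default of 20, which matches the `HCMappingQualityFilter` log line. The sketch below assumes that argument is available in the version being used, which is worth confirming against the tool's `--help` output. The helper name `build_hc_cmd_mmq`, the sample prefix `sampleA`, and the fallback `gatk_dir` are hypothetical, and the command is only printed, not executed.

```shell
#!/usr/bin/env bash

# Hypothetical fallback; the original post does not say where GATK is installed.
gatk_dir="${gatk_dir:-/path/to/gatk}"

# Print (do not run) a HaplotypeCaller command with an explicit MAPQ threshold.
# Assumption: -mmq is accepted by the installed GATK 3.x HaplotypeCaller.
build_hc_cmd_mmq() {
    local i="$1" min_mapq="$2"
    printf '%s ' java -jar "$gatk_dir/GenomeAnalysisTK.jar" \
        -T HaplotypeCaller -R ref_loci.fa \
        -I "${i}.sorted.bam" \
        -mmq "$min_mapq" \
        -o "./${i}.g.vcf" -ERC GVCF \
        --variant_index_type LINEAR --variant_index_parameter 128000
    printf '\n'
}

# Hypothetical sample prefix; 30 is the threshold asked about in the post.
build_hc_cmd_mmq sampleA 30
```

If the argument is accepted, the startup log would be expected to report the new threshold in place of `MAPQ < 20`.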
