Select INDELs using an interval file

Hi, I used SelectVariants (GATK 4.0) to extract INDELs by passing the start positions of the desired loci via the -L option, using an interval file with start positions in GATK format. The tool does select the INDELs I want, but the output also contains extra INDELs at loci I did not specify. Why are these extra loci selected as well?

Answers

  • roshabey (Member)

    The intervals I specified are in the following format:

    Chr01:14736
    Chr01:18598
    Chr01:18684
    Chr01:44409
    Chr01:44636
    Chr01:44683
    Chr01:45107
    Chr01:47832
    Chr01:49529
    Chr01:49532
    Chr01:51390
    Chr01:71288
    Chr01:72934
    Chr01:73022
    Chr01:77479
    Chr01:139798
    Chr01:140125
    Chr01:165932
    Chr01:171305
    Chr01:172306

  • SkyWarrior (Turkey, Member ✭✭✭)

    What is your SelectVariants command, and are you using any kind of interval padding?

  • roshabey (Member)

    gatk --java-options "-Xmx30g" SelectVariants \
    -R ${ref} \
    -L ${PBS_O_WORKDIR}/final_data_set_per_GATK_format_sorted_INDELs.intervals \
    -V ${inputPath}/combined_contigs_genotype_hardfiltered_biallelic_INDELs.vcf \
    -O ${outputPath}/INDEL_truthdataset.vcf \
    --select-type-to-include INDEL \
    --restrict-alleles-to BIALLELIC \
    --exclude-filtered

  • bshifaw (Member, Broadie, Moderator, admin)

    Would you mind providing an example of INDELs that appear in the output but should not have been selected by the specified intervals? A snippet of your input and output data would help.
    Was the interval list you posted earlier the entire content of the interval file?
    Which reference file are you using? Do its contig names use the Chr01 format?
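One way to collect the requested examples is to compare the output VCF against the interval list. The sketch below is not part of the thread and is not a GATK utility; the function names `parse_intervals` and `classify_records` are hypothetical. It assumes an uncompressed, tab-delimited VCF and interval lines of the form `contig:pos` or `contig:start-end`. It also tests one possible explanation that the thread does not confirm: that -L keeps any record whose reference span overlaps an interval, so a deletion starting upstream of a requested position could still be selected.

```python
def parse_intervals(lines):
    """Parse 'contig:pos' or 'contig:start-end' lines into (contig, start, end) tuples."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so contig names containing ':' still parse.
        contig, _, span = line.rpartition(":")
        start, _, end = span.partition("-")
        intervals.append((contig, int(start), int(end or start)))
    return intervals


def classify_records(vcf_lines, intervals):
    """Yield (contig, pos, ref, alt, label) for each VCF data line.

    Labels:
      starts_in_interval: POS itself falls inside a requested interval
      overlaps_only:      POS is outside, but the REF span reaches an interval
      no_overlap:         the record does not touch any requested interval
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        contig, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Assumption: the record spans POS .. POS + len(REF) - 1 on the reference.
        end = pos + len(ref) - 1
        if any(c == contig and s <= pos <= e for c, s, e in intervals):
            label = "starts_in_interval"
        elif any(c == contig and pos <= e and s <= end for c, s, e in intervals):
            label = "overlaps_only"
        else:
            label = "no_overlap"
        yield contig, pos, ref, alt, label
```

To use it, pass the open interval file to `parse_intervals` and the open output VCF to `classify_records`, then print every record whose label is not `starts_in_interval`. Records labelled `overlaps_only` would be consistent with the overlap explanation. Records labelled `no_overlap` would point elsewhere, for example to interval padding or to extra lines in the interval file.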
