Collected FAQs about interval lists

Geraldine_VdAuwera, Posts: 9,938, Administrator, Dev admin
edited January 2013 in FAQs

1. What file formats do you support for interval lists?

We support three types of interval lists, as mentioned here. Interval lists should preferably be formatted as Picard-style interval lists, with an explicit sequence dictionary, as this prevents accidental misuse (e.g. hg18 intervals on an hg19 file). Note that this format is 1-based, not 0-based (the first position in the genome is position 1).
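
To illustrate the 1-based convention, here is a minimal Python sketch that converts BED records (0-based, half-open) into Picard-style interval list text (1-based, inclusive) with a sequence dictionary header. The function name `bed_to_interval_list`, the `(contig, length)` input and the bare-bones `@HD`/`@SQ` lines are assumptions made for illustration; a real sequence dictionary usually carries more tags and should be taken from your reference.

```python
def bed_to_interval_list(bed_lines, sequence_dict):
    """Convert BED records to Picard-style interval list text.

    bed_lines: iterable of tab-delimited BED lines (0-based, half-open).
    sequence_dict: list of (contig_name, length) pairs; a hypothetical
        stand-in for the sequence dictionary of your reference.
    """
    # Header: illustrative @HD line, then one @SQ line per contig.
    out = ["@HD\tVN:1.6"]
    for name, length in sequence_dict:
        out.append(f"@SQ\tSN:{name}\tLN:{length}")
    known = {name for name, _ in sequence_dict}

    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        fields = line.split("\t")
        contig, start0, end = fields[0], int(fields[1]), int(fields[2])
        if contig not in known:
            # This is the kind of misuse an explicit dictionary catches.
            raise ValueError(f"contig {contig!r} is not in the sequence dictionary")
        name = fields[3] if len(fields) > 3 else f"{contig}_{start0 + 1}_{end}"
        strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"
        # BED [start0, end) becomes 1-based inclusive [start0 + 1, end].
        out.append(f"{contig}\t{start0 + 1}\t{end}\t{strand}\t{name}")
    return "\n".join(out) + "\n"
```

The body lines are tab-delimited: contig, start, end, strand, interval name. Only the start coordinate shifts by one; the BED end coordinate is already the last included base when read as 1-based.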

2. I have two (or more) sequencing experiments with different target intervals. How can I combine them?

One relatively easy way to combine your intervals is to use the online tool Galaxy: upload your intervals with the Get Data -> Upload command, then use the Operate on Genomic Intervals command to compute their intersection or union (depending on your needs).
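
If you would rather do this locally, the same union and intersection can be sketched in a few lines of Python. This is not what Galaxy runs internally; it is a minimal illustration that assumes intervals are `(contig, start, end)` tuples with 1-based inclusive coordinates. The function names are hypothetical, and contigs are ordered as plain strings rather than in reference order.

```python
def merge(intervals):
    """Sort intervals and merge any that overlap or are directly adjacent."""
    merged = []
    for contig, start, end in sorted(intervals):
        if merged and merged[-1][0] == contig and start <= merged[-1][2] + 1:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (contig, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((contig, start, end))
    return merged


def union(a, b):
    """Every position covered by either target set."""
    return merge(list(a) + list(b))


def intersection(a, b):
    """Every position covered by both target sets."""
    a, b = merge(a), merge(b)
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        contig_a, start_a, end_a = a[i]
        contig_b, start_b, end_b = b[j]
        if contig_a != contig_b:
            # Advance whichever list is behind in (string) contig order.
            if contig_a < contig_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start <= end:
            result.append((contig_a, start, end))
        # Drop the interval that ends first; the other may still overlap more.
        if end_a < end_b:
            i += 1
        else:
            j += 1
    return result
```

Use the union if you want to call variants wherever either experiment has targets, and the intersection if you only want regions covered by both.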

Geraldine Van der Auwera, PhD

Comments

  • prepagam, Posts: 39, Member

    If I were only interested in calling variants in a set of neutral regions, are there any negative implications to intersecting my BAM with a BED file of these regions PRIOR to running GATK, i.e. doing this rather than using the genomic intervals that GATK offers? For me this is preferable for various storage reasons, but perhaps it has some unknown side effect with GATK.

  • Geraldine_VdAuwera, Posts: 9,938, Administrator, Dev admin

    No problem at all, you can use whatever intervals you want. This may influence the expected Ti/Tv ratio, so keep that in mind when you analyze your callset, but it shouldn't have any effect on the quality of results.

    Geraldine Van der Auwera, PhD

  • eflannery, San Diego, Posts: 9, Member

    Hi Geraldine, It seems like there is a minimum size an interval in the interval list needs to be in order to be output by the Diagnose Targets walker. Do you know this minimum? Is it a default or is it calculated each time? Is there a way to change it?

    Thanks!

    Erika

  • Geraldine_VdAuwera, Posts: 9,938, Administrator, Dev admin

    Hi @eflannery,

    I just looked at the code and didn't find any hardcoded limits. The only limitation that I'm aware of is that intervals must be non-null (i.e. not zero-length). Why do you think there's a limit?

    Geraldine Van der Auwera, PhD

  • eflannery, San Diego, Posts: 9, Member

    When I run Diagnose Targets, some intervals that are present in the interval_list file are missing from the output file. All of the excluded intervals are very small, <500bp, so I assumed this is why they were not included. Shouldn't every interval in the interval_list be included in the output of Diagnose Targets?

    Thanks!

    Erika

  • Sheila, Broad Institute, Posts: 3,195, Member, Broadie, Moderator, Dev admin

    @eflannery
    Hi Erika,

    Sorry for the late response. I was going through my old emails and found this! Are you still having an issue with this? Is it possible that the short intervals overlap some other longer intervals and are getting output as part of the longer intervals?

    Thanks,
    Sheila

  • Katie, United States, Posts: 23, Member

    Is there a way to define an interval list by position rather than by interval? For example, if I am interested in using SelectVariants, can I query a VCF with a list containing only contig and SNP position? I've tried this, but it seems like I need to define regions rather than positions.
    Thank you!

  • Katie, United States, Posts: 23, Member

    Sorry to bother you. I found that vcftools will filter with a tab-delimited list of chromosome and position using the command:
    vcftools --vcf 'VCFfile' --positions 'positions_list'

    Cheers,

  • Geraldine_VdAuwera, Posts: 9,938, Administrator, Dev admin

    You can do this with SelectVariants, sure. You can pass in single positions using either the interval list format or a VCF of sites of interest.

    Geraldine Van der Auwera, PhD
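
As a rough illustration of the last answer, a tab-delimited list of contig and position (like the one used with vcftools above) can be turned into single-base intervals, where start and end are the same position. The Python sketch below assumes the GATK-style `contig:start-stop` interval string and 1-based positions; the function name `positions_to_intervals` is hypothetical.

```python
def positions_to_intervals(lines):
    """Turn 'contig<TAB>position' lines into single-base interval strings."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        contig, pos = line.split()[:2]
        pos = int(pos)
        if pos < 1:
            raise ValueError(f"positions are 1-based, got {pos} on {contig}")
        # A single position is an interval whose start equals its end.
        intervals.append(f"{contig}:{pos}-{pos}")
    return intervals
```

Writing the resulting strings one per line gives a plain-text list of intervals; check the interval documentation for your GATK version to confirm which file extension it expects for this format.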
