The current GATK version is 3.7-0

HaplotypeCaller and recalibration relations when working on a rough reference genome

pawel_osipowski, Warsaw, Poland (Member, Posts: 8)

Hi,

Would it matter a lot to do recalibration before using HaplotypeCaller? If so, I guess it is worth doing realignment before recalibration to reduce the number of SNVs. I'm searching for variants in a genome that is not well characterized. I have 30x coverage and I work on haploids. I did realignment and recalibration, but I wonder whether that helps me at all when I don't know any established SNP sites. Wouldn't it be better to run HaplotypeCaller straight after mapping and then do the recalibration with a set of the best-quality variants?

Please help with this matter.

Paul

Answers

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, Posts: 11,662)

    Hi Paul,

    For realignment, you don't need to have a set of known variants. The program is able to identify regions that need to be realigned and do so without known variants. For recalibration, known variants are required, so if you don't have a set available you can generate one yourself from your data. You'll need to do a first round of calling (without recalibration), choose the highest-confidence variants, then try recalibrating with those. You may need to do several cycles to refine your set of known variants, for best results.

    Good luck!
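
The "choose the highest-confidence variants" step described above can be sketched roughly as follows. This is only an illustration, not GATK's own method: the function name `select_high_confidence` and the QUAL cutoff of 100 are assumptions made for the example, and a sensible cutoff depends on your data. It keeps VCF header lines plus records whose QUAL column meets the cutoff and whose FILTER column is `PASS` or unset.

```python
from typing import Iterable, List


def select_high_confidence(vcf_lines: Iterable[str], min_qual: float = 100.0) -> List[str]:
    """Keep header lines and records with QUAL >= min_qual (hypothetical helper)."""
    kept = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        qual, filt = fields[5], fields[6]  # VCF columns 6 (QUAL) and 7 (FILTER)
        if qual == ".":
            continue  # no quality recorded, so the record cannot be ranked
        if filt not in ("PASS", "."):
            continue  # drop records that failed an upstream filter
        if float(qual) >= min_qual:
            kept.append(line)
    return kept


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t812.5\t.\t.",
    "chr1\t250\t.\tC\tT\t31.2\t.\t.",
    "chr1\t400\t.\tG\tA\t.\t.\t.",
]
known_sites = select_high_confidence(example, min_qual=100.0)
```

The resulting lines could be written to a file and used as the known-variants set for a recalibration pass. After recalibrating, you would call variants again and repeat the selection until the set stabilizes.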

    Geraldine Van der Auwera, PhD

  • pawel_osipowski, Warsaw, Poland (Member, Posts: 8)

    Hi Geraldine,

    Thank you for your feedback. I like your software more and more! But I have a feeling that you didn't understand me correctly. I assume my written English is far from clear, so I'll try to put it in other words. Is it sensible to use HaplotypeCaller before recalibration, as opposed to the standard pipeline? Of course, I could 'traditionally' use the realigner on my raw data, then call my SNVs, do the recalibration (with the highest-confidence data) and perform realignment. But I'm curious whether, instead of calling variants traditionally and picking the highest-confidence SNVs, I could call them with HaplotypeCaller straight away on the raw data (and use it twice per pipeline this way), as it's better at calling highest-confidence SNVs than anything else?

  • pawel_osipowski, Warsaw, Poland (Member, Posts: 8)

    Yes, I'm thinking of skipping realignment and recalibration, and using HaplotypeCaller to get the highest-confidence data for recalibration. Forget about "use it twice per pipeline this way" :)

  • pawel_osipowski, Warsaw, Poland (Member, Posts: 8)

    I had the second scenario in mind. Thanks a lot! Sorry for the imprecise writing at the beginning; we could have dealt with this quicker. I'll take it into account.

  • pawel_osipowski, Warsaw, Poland (Member, Posts: 8)

    Dear Geraldine,

    Could you just explain to me what I need the second step (realign using high-confidence variants) for?

    Paul
