The AnnotatePairOrientation tool and non-Mutect2 VCF input - only "0,0:0,0" values in the output

Hello!

I would like to annotate VCF files (not generated by Mutect2) with F1R2/F2R1 read support counts. I was happy to find that a dedicated tool is now provided for this very purpose (AnnotatePairOrientation). However, although I have tried the tool in multiple versions of GATK 4 (ranging from 4.0.0.0 to 4.0.11.0) and never received an error, I am still left without useful output: for all the annotated variants, the reported F1R2:F2R1 values are "0,0:0,0". I do not expect the input BAM and VCF files to be at fault, because Mutect2 running on the same inputs (with the option "--genotyping-mode" set to "GENOTYPE_GIVEN_ALLELES") reports non-zero F1R2/F2R1 counts. The tool's documentation page doesn't seem to list any specific requirements for the input VCF files (anything that could, say, cause the tool to output default zero values because some information is missing from the INFO or FORMAT fields).
Is any good soul out there able to confirm that the tool either a) works as intended on their data, or b) only outputs zeroes for them as well?

With many thanks and best regards,
Daniel
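
For readers who want to check quickly whether their own output shows the same symptom, here is a minimal sketch in plain Python. It assumes an uncompressed, tab-delimited VCF whose FORMAT column carries the F1R2 and F2R1 keys mentioned above; the function names `orientation_counts` and `all_zero` are hypothetical helpers and not part of GATK.

```python
def orientation_counts(vcf_text):
    """Yield (chrom, pos, sample_index, f1r2, f2r1) for every genotype
    column that carries both F1R2 and F2R1 values."""
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        if len(cols) < 10:
            continue  # record has no FORMAT/sample columns
        keys = cols[8].split(":")
        if "F1R2" not in keys or "F2R1" not in keys:
            continue
        i1, i2 = keys.index("F1R2"), keys.index("F2R1")
        for sample_idx, sample in enumerate(cols[9:]):
            values = sample.split(":")
            if max(i1, i2) >= len(values):
                continue  # trailing fields were dropped for this sample
            yield cols[0], int(cols[1]), sample_idx, values[i1], values[i2]


def all_zero(vcf_text):
    """Return True if at least one F1R2/F2R1 pair exists and every count is 0.
    Missing values (".") are treated as not zero."""
    records = list(orientation_counts(vcf_text))
    return bool(records) and all(
        set(f1r2.split(",")) | set(f2r1.split(",")) == {"0"}
        for _, _, _, f1r2, f2r1 in records
    )


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_counts.py annotated.vcf
    with open(sys.argv[1]) as handle:
        print(all_zero(handle.read()))
```

A `True` result only confirms the symptom described in this post; it does not explain its cause.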

Answers

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    Hello @danielvo. This tool is currently in beta, so it may not be working perfectly yet. However, if you want to send in some example files after the break, we can revisit this. We will be back on January 2nd.

  • danielvo Member

    Hello @AdelaideR and thank you for your answer/offer!

    I can try to create small non-sensitive (artificial) example files that behave as I have initially described and share them with you.

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    @danielvo Okay, please send them. You should be able to do that by messaging me on the forum. If the files are too large, I can send instructions on how to upload them to our FTP server.

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    @danielvo Did you resolve the issue on your own? If not, we have a GATK developer willing to take a look at your files so we can determine what might be causing this.

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    I am closing the ticket for now, but feel free to follow up again.

  • danielvo Member

    Thank you again, @AdelaideR!

    I have finally found an opportunity to create some example files (I am sending them via a message now). The example is based on an artificial BAM file covering a single exon (with one prominent variant), but it should illustrate everything that I mentioned in the initial post.
    If you can spend some time investigating my case, I would be very grateful for any feedback or insights.

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    @danielvo Thank you for the information. I am in a workshop all day today, but I hope to go over this example to see how I can improve the documentation for this.

  • danielvo Member

    Hello @shlee,

    Random sample names in the input VCF files are something I had not considered while trying to make the tool work on my own.

    I can now confirm that after changing the sample name in the VCF (and indexing the VCF again), everything works as expected.

    Thank you all very much for the help!
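
The fix reported above suggests that the sample name in the VCF has to agree with the one in the BAM. A rough way to check this before running the tool is sketched below in plain Python. It assumes that the relevant BAM name is the SM tag of the @RG header lines (an assumption, since the thread does not spell this out) and that the header text has already been extracted, for example with `samtools view -H`. All function names here are hypothetical helpers, not GATK APIs.

```python
def vcf_sample_names(vcf_header_text):
    """Return the sample names listed after FORMAT on the #CHROM header line."""
    for line in vcf_header_text.splitlines():
        if line.startswith("#CHROM"):
            return line.split("\t")[9:]
    return []


def bam_sample_names(sam_header_text):
    """Return the set of SM values found on @RG lines of a SAM/BAM header."""
    names = set()
    for line in sam_header_text.splitlines():
        if line.startswith("@RG"):
            for field in line.split("\t")[1:]:
                if field.startswith("SM:"):
                    names.add(field[3:])
    return names


def unmatched_vcf_samples(vcf_header_text, sam_header_text):
    """Return VCF sample names that do not appear as an SM value in the BAM header."""
    bam_names = bam_sample_names(sam_header_text)
    return [name for name in vcf_sample_names(vcf_header_text)
            if name not in bam_names]
```

An empty list from `unmatched_vcf_samples` means every VCF sample name also appears in the BAM header; any name it returns is a candidate for renaming in the VCF, followed by re-indexing as described above.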
