
CombineVariants in PRIORITIZE mode without -priority

TechnicalVault (Cambridge, UK)
edited November 2012 in Ask the GATK team

Hi,

I'm just reverse engineering a colleague's script and I've noticed they're using CombineVariants in PRIORITIZE mode but without a -priority argument. I've looked at the documentation and I can't see what the defined behaviour would be in this situation. Would the default priority follow the order of the arguments supplied, follow the reverse order, or be random?

Thanks,
Martin

Edit: Never mind. From what I can see in the source, it should error out if -priority is not supplied. I must have missed something in the pipeline script.

Edit 2: No, wait. This check:

    if ( genotypeMergeOption == VariantContextUtils.GenotypeMergeType.PRIORITIZE && PRIORITY_STRING == null )
        throw new UserException.MissingArgument("rod_priority_list", "Priority string must be provided if you want to prioritize genotypes");

is pointless, because this runs first in initialize:

    if ( PRIORITY_STRING == null ) {
        PRIORITY_STRING = Utils.join(",", vcfRods.keySet());
        logger.info("Priority string not provided, using arbitrary genotyping order: " + PRIORITY_STRING);
    }

This should follow the input order, yes? Unless vcfRods.keySet() is sorted?
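To make the ordering problem concrete, here is a minimal sketch of the control flow described above. It is not GATK code: the class `PriorityCheckOrder`, the method `initializeAsDescribed`, the cut-down `MergeType` enum and the rod names are all hypothetical stand-ins, and the exception type is a plain `IllegalArgumentException` rather than GATK's `UserException`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the walker's initialize(); not GATK code.
class PriorityCheckOrder {
    // Cut-down stand-in for the genotype merge options.
    enum MergeType { PRIORITIZE, UNSORTED }

    // Mirrors the order described in the post: the default is filled in
    // first, so the null check that follows can never fire.
    static String initializeAsDescribed(MergeType mode, String priority, List<String> rodNames) {
        if (priority == null) {
            priority = String.join(",", rodNames);
        }
        if (mode == MergeType.PRIORITIZE && priority == null) {
            throw new IllegalArgumentException(
                "Priority string must be provided if you want to prioritize genotypes");
        }
        return priority;
    }

    public static void main(String[] args) {
        // PRIORITIZE mode with no priority string: no exception is thrown,
        // the joined rod names are silently used instead.
        String result = initializeAsDescribed(MergeType.PRIORITIZE, null, Arrays.asList("b", "a"));
        System.out.println(result);
    }
}
```

In this sketch the second `if` is dead code whenever the first one has run, which is the situation the two snippets above appear to be in.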


Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute and Genetic Epidemiology Group - WTSI & Cambridge University

Answers

  • TechnicalVault (Cambridge, UK)
    edited November 2012

    Hmm, if I read the source code right, vcfRods in your vcfutils class is a HashMap, so order is not guaranteed. From the Java docs: "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." Thus the order depends on what you named your input rods and may change depending on your Java implementation. Am I correct in my reading of this?
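A small sketch of the difference, assuming (as read above, not confirmed) that the rods are held in a plain `HashMap`. The class `RodOrderDemo`, its helper methods and the rod names are made up for illustration; only the `java.util` classes are real.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo; the rod names and file names are invented.
class RodOrderDemo {
    // Registers each rod name in the given map, in the order supplied.
    static Map<String, String> fill(Map<String, String> rods, List<String> names) {
        for (String name : names) {
            rods.put(name, name + ".vcf");
        }
        return rods;
    }

    // Same joining step as in the snippet from the first post.
    static String joinKeys(Map<String, String> rods) {
        return String.join(",", rods.keySet());
    }

    public static void main(String[] args) {
        List<String> inputOrder = Arrays.asList("variant2", "omni", "hapmap", "variant1");

        // HashMap: iteration order is unspecified and need not match input order.
        System.out.println(joinKeys(fill(new HashMap<>(), inputOrder)));

        // LinkedHashMap: iteration order is guaranteed to be insertion order.
        System.out.println(joinKeys(fill(new LinkedHashMap<>(), inputOrder)));
    }
}
```

Only the `LinkedHashMap` line is guaranteed to print the names in the order they were supplied; whatever the `HashMap` line prints on a given JVM should not be relied on.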

    Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute and Genetic Epidemiology Group - WTSI & Cambridge University

  • Geraldine_VdAuwera (Cambridge, MA; Administrator, Broadie)

    Hi Martin, unfortunately we don't have the resources right now to provide support for code interpretation and development, sorry!

    Geraldine Van der Auwera, PhD

  • TechnicalVault (Cambridge, UK)
    edited November 2012

    Hi Geraldine, I think you've misunderstood. When I first asked the question I was asking what would happen, as it was potentially undefined and undocumented behaviour in GATK.

    Then I realised (thus the edits) that this is a bug in GATK. If PRIORITIZE mode is set and -priority is not specified, GATK should emit the error "Priority string must be provided if you want to prioritize genotypes". It fails to do so because the arbitrary genotyping order code kicks in first. So the answer should be one of:

    - Existing behaviour will continue and the MissingArgument error code will be deleted.

    - Behaviour will be corrected and the arbitrary genotyping order code will be deleted.

    Of course, if you don't have the manpower to fix it, just say which way you want it fixed and I can supply the appropriate patch as a pull request. :)
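For what it's worth, a rough sketch of what the second option could look like, with the check moved ahead of any defaulting. This is not a patch against GATK: `PriorityValidation`, `resolvePriority` and the cut-down `MergeType` enum are hypothetical names, and whether a fallback order should survive for the other merge modes is a separate decision (this sketch keeps it).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "validate first, default second"; not GATK code.
class PriorityValidation {
    // Cut-down stand-in for the genotype merge options.
    enum MergeType { PRIORITIZE, UNSORTED }

    static String resolvePriority(MergeType mode, String priority, List<String> rodNames) {
        // Fail fast: PRIORITIZE without an explicit priority list is an error.
        if (mode == MergeType.PRIORITIZE && priority == null) {
            throw new IllegalArgumentException(
                "Priority string must be provided if you want to prioritize genotypes");
        }
        // Assumption: other modes may still fall back to the rod names as given.
        if (priority == null) {
            return String.join(",", rodNames);
        }
        return priority;
    }

    public static void main(String[] args) {
        List<String> rods = Arrays.asList("first", "second");
        System.out.println(resolvePriority(MergeType.UNSORTED, null, rods));
        System.out.println(resolvePriority(MergeType.PRIORITIZE, "second,first", rods));
    }
}
```

With the check first, a missing -priority in PRIORITIZE mode is reported instead of being papered over by the default.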

    Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute and Genetic Epidemiology Group - WTSI & Cambridge University

  • Geraldine_VdAuwera (Cambridge, MA; Administrator, Broadie)
    Accepted Answer

    I see, thanks for the clarification -- I had indeed misunderstood your post. Sure, getting someone else to fix things for us is always good :)

    I'll ask the appropriate developer to get in touch with you to determine which way to go.

    Geraldine Van der Auwera, PhD

  • ami (Dev)
    Accepted Answer

    Hi Martin,

    I fixed this issue and the fix will be part of the new version (2.3), probably next week.
    In cases where you try to use PRIORITIZE mode and -priority is not specified, GATK now emits the proper error message.
    (I also changed some of the related code to make sure that we sort by priority only when it is necessary, and that we do take the priority list into account when it is provided, even if the mode is not PRIORITIZE.)

    Thanks for pointing out this problem. Please let us know if you still think that the problem is not solved in the coming new version (GATK 2.3).

  • TechnicalVault (Cambridge, UK)

    Thank you

    Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute and Genetic Epidemiology Group - WTSI & Cambridge University
