
CombineVariants in PRIORITIZE mode without -priority

TechnicalVault · Cambridge, UK · Member
edited November 2012 in Ask the GATK team

Hi,

I'm reverse engineering a colleague's script and I've noticed they're using CombineVariants in PRIORITIZE mode but without a -priority argument. I've looked at the documentation and I can't see what the defined behaviour would be in this situation. Would the default priority follow the order of the arguments supplied, the reverse order, or a random order?

Thanks,
Martin

Edit: Never mind, from what I can see in the source it should error out if -priority is not supplied. I must have missed something in the pipeline script.

Edit 2:
No wait, this check

    if ( genotypeMergeOption == VariantContextUtils.GenotypeMergeType.PRIORITIZE && PRIORITY_STRING == null )
        throw new UserException.MissingArgument("rod_priority_list", "Priority string must be provided if you want to prioritize genotypes");

is pointless because this is run first in initialize:

    if ( PRIORITY_STRING == null ) {
        PRIORITY_STRING = Utils.join(",", vcfRods.keySet());
        logger.info("Priority string not provided, using arbitrary genotyping order: " + PRIORITY_STRING);
    }

This should follow the input order, yes? Unless vcfRods.keySet() is sorted?


Answers

  • TechnicalVault · Cambridge, UK · Member
    edited November 2012

    Hmm, if I read the source code right in your vcfutils class, then vcfRods is a HashMap and order is not guaranteed... From the Java docs: "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." Thus the order depends on what you named your input rods and may change depending on your Java implementation? Am I correct in my reading of this?
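
To illustrate the concern, here is a minimal, self-contained sketch (not GATK code). The class `RodOrderDemo`, the helpers `joinKeys` and `bind`, and the rod names are all hypothetical; the only assumption is that the default priority string is built by joining the key set of a map, as in the `Utils.join(",", vcfRods.keySet())` line quoted in the original post. A `HashMap` returns its keys in an order driven by hashing, whereas a `LinkedHashMap` preserves insertion order:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class RodOrderDemo {
    // Hypothetical stand-in for Utils.join(",", vcfRods.keySet()).
    static String joinKeys(Map<String, String> rods) {
        return String.join(",", rods.keySet());
    }

    // Add three made-up rod bindings in a fixed "command line" order.
    static Map<String, String> bind(Map<String, String> rods) {
        rods.put("zebra", "z.vcf");
        rods.put("alpha", "a.vcf");
        rods.put("mid", "m.vcf");
        return rods;
    }

    public static void main(String[] args) {
        // Key order here is determined by hashing, not by insertion order.
        System.out.println("HashMap:       " + joinKeys(bind(new HashMap<>())));
        // LinkedHashMap iterates in insertion order: zebra,alpha,mid.
        System.out.println("LinkedHashMap: " + joinKeys(bind(new LinkedHashMap<>())));
    }
}
```

If input order is what users would expect as the default, an insertion-ordered map would make it deterministic; whether that is the intended behaviour is a question for the GATK developers.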

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    Hi Martin, unfortunately we don't have the resources right now to provide support for code interpretation and development, sorry!

  • TechnicalVault · Cambridge, UK · Member
    edited November 2012

    Hi Geraldine, I think you've misunderstood. When I first asked the question I was asking what would happen, as it was potentially undefined and undocumented behaviour in GATK.

    Then I realised (thus the edits) that this is a bug in GATK. If PRIORITIZE mode is set and -priority is not specified, GATK should emit the error "Priority string must be provided if you want to prioritize genotypes". It fails to do so because the arbitrary genotyping order code kicks in first. So the answer should be one of:

    - Existing behaviour will continue and the MissingArgument error code will be deleted.

    - Behaviour will be corrected and the arbitrary genotyping order code will be deleted.

    Of course, if you don't have the manpower to fix it, just say which way you want it fixed and I can supply the appropriate patch as a pull request. :)
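
As a rough illustration of the second option, the sketch below runs the missing-argument check before any default is filled in, so the error can actually be reached. This is not the GATK source or an actual patch: `PriorityCheckSketch`, `resolvePriority`, the reduced `GenotypeMergeType` enum, and the use of `IllegalArgumentException` in place of `UserException.MissingArgument` are all stand-ins chosen so the example compiles on its own.

```java
import java.util.Arrays;
import java.util.List;

public class PriorityCheckSketch {
    // Hypothetical, reduced stand-in for VariantContextUtils.GenotypeMergeType.
    enum GenotypeMergeType { PRIORITIZE, UNSORTED }

    // Returns the priority string to use, failing fast when PRIORITIZE
    // mode is requested without an explicit priority list.
    static String resolvePriority(GenotypeMergeType mode,
                                  String priorityString,
                                  List<String> rodNames) {
        // Check first, while priorityString can still be null.
        if (mode == GenotypeMergeType.PRIORITIZE && priorityString == null) {
            throw new IllegalArgumentException(
                "Priority string must be provided if you want to prioritize genotypes");
        }
        // Only modes other than PRIORITIZE fall back to a default order.
        if (priorityString == null) {
            return String.join(",", rodNames);
        }
        return priorityString;
    }

    public static void main(String[] args) {
        List<String> rods = Arrays.asList("first", "second");
        System.out.println(resolvePriority(GenotypeMergeType.UNSORTED, null, rods));
        try {
            resolvePriority(GenotypeMergeType.PRIORITIZE, null, rods);
        } catch (IllegalArgumentException e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
```

With the checks in the opposite order, as in the snippets quoted in the original post, the default would be assigned first and the exception branch could never fire.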

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie
    Accepted Answer

    I see, thanks for the clarification -- I had indeed misunderstood your post. Sure, getting someone else to fix things for us is always good :)

    I'll ask the appropriate developer to get in touch with you to determine which way to go.

  • ami · Member
    Accepted Answer

    Hi Martin,

    I fixed this issue and it will be part of the new version (2.3), probably next week.
    In cases where you try to use PRIORITIZE mode and -priority is not specified, GATK now emits the proper error message.
    (I also changed some of the related code to make sure that we sort by priority only when it is necessary and that we do take the priority list into account when it is provided, even if it is not PRIORITIZE mode.)

    Thanks for pointing out this problem. Please let us know if you still think that the problem is not solved in the coming new version (GATK 2.3).
