# CombineVariants in PRIORITIZE mode without -priority

Cambridge, UK · Posts: 96 · Member ✭✭
edited November 2012

Hi,

I'm just reverse engineering a colleague's script and I've noticed they're using CombineVariants in PRIORITIZE mode but without a -priority argument. I've looked at the documentation and I can't see what the defined behaviour would be in this situation. Would the default priority follow the order of the arguments supplied, the reverse order, or be random?

Thanks,
Martin

Edit: Never mind, from what I can see in the source it should be erroring out if -priority is not supplied. I must have missed something in the pipeline script.

Edit 2: No, wait. This check

    if ( genotypeMergeOption == VariantContextUtils.GenotypeMergeType.PRIORITIZE && PRIORITY_STRING == null )
        throw new UserException.MissingArgument("rod_priority_list", "Priority string must be provided if you want to prioritize genotypes");

is pointless because this is run first in initialize():

    if ( PRIORITY_STRING == null ) {
        PRIORITY_STRING = Utils.join(",", vcfRods.keySet());
        logger.info("Priority string not provided, using arbitrary genotyping order: " + PRIORITY_STRING);
    }


This should follow the input order, yes? Unless vcfRods.keySet() is sorted?
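The interaction between the two snippets can be reproduced in isolation. What follows is a minimal sketch, not GATK code: `PriorityCheckOrder`, `MergeType` and `resolvePriority` are hypothetical names, and `String.join` stands in for `Utils.join`. It only illustrates that once the default is assigned first, the later null check can never fire.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the initialize() logic quoted above; not GATK code.
final class PriorityCheckOrder {
    // Assumed two-value enum for illustration only.
    enum MergeType { PRIORITIZE, UNSORTED }

    /** Mirrors the order described in the post: default first, validation second. */
    static String resolvePriority(MergeType mode, String priority, List<String> rodNames) {
        if (priority == null) {
            // The default kicks in first, so priority is never null past this point.
            priority = String.join(",", rodNames);
        }
        if (mode == MergeType.PRIORITIZE && priority == null) {
            // Never reached: the branch above has already replaced null.
            throw new IllegalArgumentException(
                "Priority string must be provided if you want to prioritize genotypes");
        }
        return priority;
    }

    public static void main(String[] args) {
        // PRIORITIZE without a priority string: no exception, the fallback is returned.
        System.out.println(resolvePriority(MergeType.PRIORITIZE, null, Arrays.asList("b", "a")));
    }
}
```

With a `List` the fallback follows the list order; what order the real code produces depends on the collection behind `vcfRods.keySet()`.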

Post edited by TechnicalVault on

Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute

• Cambridge, UK · Posts: 96 · Member ✭✭
edited November 2012

Hmm, if I read the source code right, in your vcfutils class vcfRods is a HashMap and order is not guaranteed... From the Java docs: "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." Thus the order is reliant on what you named your input rods and may change depending on your Java implementation? Am I correct in my reading of this?
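The difference can be seen with a small, self-contained sketch. It is not taken from GATK: the class name, the rod names and the file names are made up for illustration. It inserts the same keys into a `HashMap` and a `LinkedHashMap` and joins the key sets, which is what the fallback priority string effectively does.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; the rod names below are invented for illustration.
final class RodOrderDemo {
    /** Inserts three rod names in a fixed order, then joins the keys in iteration order. */
    static String keysAfterInserting(Map<String, String> rods) {
        rods.put("variant2", "b.vcf");
        rods.put("variant", "a.vcf");
        rods.put("omni", "c.vcf");
        return String.join(",", rods.keySet());
    }

    public static void main(String[] args) {
        // HashMap: iteration order depends on key hash codes and table capacity,
        // so it need not match the insertion order.
        System.out.println("HashMap:       " + keysAfterInserting(new HashMap<String, String>()));
        // LinkedHashMap: iteration order is documented to be insertion order.
        System.out.println("LinkedHashMap: " + keysAfterInserting(new LinkedHashMap<String, String>()));
    }
}
```

Only the `LinkedHashMap` line is guaranteed to read `variant2,variant,omni`; the `HashMap` line may or may not match, which is the point of the quoted Javadoc.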

Post edited by TechnicalVault on

Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute

Hi Martin, unfortunately we don't have the resources right now to provide support for code interpretation and development, sorry!

Geraldine Van der Auwera, PhD

• Cambridge, UK · Posts: 96 · Member ✭✭
edited November 2012

Hi Geraldine, I think you've misunderstood. When I first asked the question I was asking what would happen, as it was potentially undefined and undocumented behaviour in GATK.

Then I realised (thus the edits) that this is a bug in GATK. If PRIORITIZE mode is set and -priority is not specified, GATK should emit the error "Priority string must be provided if you want to prioritize genotypes". It fails to do so because the arbitrary genotyping order code kicks in first. So the answer should be either:

- Existing behaviour will continue and the MissingArgument error code will be deleted.

- Behaviour will be corrected and the arbitrary genotyping order code will be deleted.

Of course if you don't have the manpower to fix it, just say which way you want it fixed and I can supply the appropriate patch as a pull request?
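For illustration, the second option could look roughly like the sketch below. This is not the actual GATK patch: `PriorityValidation` and `resolvePriority` are hypothetical names, and keeping a fallback order for the other merge modes is an assumption rather than something stated in this thread. The only point is that the validation runs before any default is assigned.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "validate first"; not the real CombineVariants code.
final class PriorityValidation {
    static String resolvePriority(boolean prioritizeMode, String priority, List<String> rodNames) {
        // Validate before any default is assigned, so the error is reachable.
        if (prioritizeMode && priority == null) {
            throw new IllegalArgumentException(
                "Priority string must be provided if you want to prioritize genotypes");
        }
        // Assumption: other merge modes may still fall back to the rod order.
        return priority != null ? priority : String.join(",", rodNames);
    }

    public static void main(String[] args) {
        try {
            resolvePriority(true, null, Arrays.asList("b", "a"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```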

Post edited by TechnicalVault on

Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute

I see, thanks for the clarification -- I had indeed misunderstood your post. Sure, getting someone else to fix things for us is always good.

I'll ask the appropriate developer to get in touch with you to determine which way to go.

Geraldine Van der Auwera, PhD

• Posts: 45 · GATK Dev mod

Hi Martin,

I fixed this issue and the fix will be part of the new version (2.3), probably next week.
In cases where you try to use PRIORITIZE mode and -priority is not specified, GATK now emits the proper error message.
(I also changed some of the related code to make sure that we sort by priority only when it is necessary, and that we do take the priority list into account when it is provided, even if the mode is not PRIORITIZE.)

Thanks for pointing out this problem. Please let us know if you still think that the problem is not solved in the coming new version (GATK 2.3).

• Cambridge, UK · Posts: 96 · Member ✭✭

Thank you

Martin Pollard, Human Genetics Informatics - Wellcome Trust Sanger Institute