SelectVariants Starts Traversal but Does Not Progress, High CPU Usage


I am using the GATK tool SelectVariants to only select variants that have passed FilterMutectCalls. Both FilterMutectCalls and Mutect2 were run in multi-sample mode, so the VCF being input to SelectVariants is multisample. The GATK version is, HTSJDK Version 2.18.2, and Picard Version: 2.18.25.

The issue is that SelectVariants begins traversing the VCF but does not make any progress. The log file stalls out as shown:

09:52:57.790 INFO SelectVariants - Done initializing engine
09:52:57.967 INFO ProgressMeter - Starting traversal
09:52:57.967 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute

This is the last line logged, at which point the CPU usage jumps sharply. I have allotted 10G of memory and 1 hour of run time for the job, but there is still no progress.

The exact command I am using is:
$gatk_launcher --java-options -Xmx${mem}g SelectVariants \
    -R $reference \
    --variant $input_fn \
    --output $output_fn \
    --exclude-filtered true &>> $log_file

I ran ValidateVariants on the VCF in question, and all records validated. The VCF is 18.7M, and I have been able to run the same command successfully on larger multi-sample VCFs.

I have attached a subset of the records from the VCF that failed.

For now, I have circumvented the issue with a grep, but I figured I would point it out nonetheless.
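
A stopgap of this kind might look like the sketch below. It is an illustration, not necessarily the exact grep used here: the function name `filter_pass` is hypothetical, and it assumes an uncompressed, tab-delimited VCF whose seventh column is FILTER. It keeps header lines plus records whose FILTER is `PASS` or `.`, which, as far as I understand, is roughly what `--exclude-filtered` retains.

```shell
# Hypothetical helper: print header lines and unfiltered records of a plain-text VCF.
# Assumes column 7 is FILTER, as in the standard VCF column order.
filter_pass() {
    # /^#/ keeps meta-information and the #CHROM header line;
    # $7 == "PASS" or "." keeps records that carry no failing filter.
    awk -F'\t' '/^#/ || $7 == "PASS" || $7 == "."' "$1"
}
```

It could be called as `filter_pass "$input_fn" > "$output_fn"`. Unlike SelectVariants, this does not validate the records or touch the header, so it is only a workaround.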

Thank you!


