I have tumor-only calls from Mutect and would like to filter them further. My first question: is CNNScoreVariants applicable to this? If so, I can't get it to work. This is the command line I am using (I tried both the 1D and 2D models, and also different batch sizes with the 2D model):
/GenomeAnalysisTk/ CNNScoreVariants -V /path/to/input_somatic_twicefiltered.vcf.gz -R /ref/hg19/Homo_sapiens_assembly19.fasta -O /path/to/output_annotated.vcf --intra-op-threads 20
However, the run always gets stuck at this point and makes no progress (for days):
13:00:07.286 INFO CNNScoreVariants - Initializing engine
13:00:14.027 INFO FeatureManager - Using codec VCFCodec to read file
13:00:14.661 INFO CNNScoreVariants - Done initializing engine
I am only providing one sample here. Am I supposed to provide a batch? Or what else could I be doing wrong?


  • Thank you @samwell for your answer. If I understand you correctly, I shouldn't use CNNScoreVariants for somatic mutations, but you wanted to help me troubleshoot just in case? Thanks a lot!
    I am running GATK4 in an HPC environment where conda is not used at all (so I did not clone the repo). If I knew all the necessary dependencies, I could ask IT to install them in the HPC environment (to use once the tool has been trained on somatic mutations). Thanks for the guidance.
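
For the dependency question, one possible starting point is to check which Python modules the cluster's interpreter can already import, so that IT gets a concrete list of what is missing. This is only a sketch: the helper name `check_modules` is made up for illustration, and the module names in the example call (`numpy`, `tensorflow`, `keras`, `vqsr_cnn`) are assumptions that should be confirmed against the conda environment file shipped with the GATK release in use.

```shell
#!/bin/sh
# Report which Python modules a given interpreter can import.
# check_modules is a hypothetical helper, not part of GATK.
# Usage: check_modules <python-interpreter> <module> [<module> ...]
check_modules() {
    py="$1"
    shift
    for mod in "$@"; do
        # Try the import quietly; only the exit status matters here.
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "OK       $mod"
        else
            echo "MISSING  $mod"
        fi
    done
}

# Example call. The module list is an assumption, not an authoritative
# list of what CNNScoreVariants needs.
check_modules python3 numpy tensorflow keras vqsr_cnn
```

Every line reported as MISSING is a candidate to pass on to IT. If `python3` itself is not on the PATH, every module is reported as MISSING.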

