IllegalArgumentException: samples cannot be empty

I am trying to run HaplotypeCaller on some data that I know is messy and would fail some of the filters, so I have run it both with and without --disableToolDefaultReadFilters. Either way I get no output file, only the message "samples cannot be empty". Does this mean that my data is still failing some built-in check, or am I doing something else wrong? I have checked the SQ, and running CountReads (with --disableToolDefaultReadFilters) gives "Tool returned: 24634".

Here's my command:

$ java -jar ~/Downloads/gatk-4.beta.2/gatk-package-4.beta.2-local.jar HaplotypeCaller -R DQA_contig.fasta -ploidy 50 -I IRL-A.bam.sorted.bam -O IRL-A.vcf --disableToolDefaultReadFilters
13:40:49.007 WARN IntelGKLUtils - Error starting process to check for AVX support : grep -i avx /proc/cpuinfo
13:40:49.014 WARN IntelGKLUtils - Error starting process to check for AVX support : grep -i avx /proc/cpuinfo
[July 24, 2017 1:40:48 PM EDT] HaplotypeCaller --sample_ploidy 50 --output IRL-A.vcf --input IRL-A.bam.sorted.bam --reference DQA_contig.fasta --disableToolDefaultReadFilters true --group StandardAnnotation --group StandardHCAnnotation --GVCFGQBands 1 --GVCFGQBands 2 --GVCFGQBands 3 --GVCFGQBands 4 --GVCFGQBands 5 --GVCFGQBands 6 --GVCFGQBands 7 --GVCFGQBands 8 --GVCFGQBands 9 --GVCFGQBands 10 --GVCFGQBands 11 --GVCFGQBands 12 --GVCFGQBands 13 --GVCFGQBands 14 --GVCFGQBands 15 --GVCFGQBands 16 --GVCFGQBands 17 --GVCFGQBands 18 --GVCFGQBands 19 --GVCFGQBands 20 --GVCFGQBands 21 --GVCFGQBands 22 --GVCFGQBands 23 --GVCFGQBands 24 --GVCFGQBands 25 --GVCFGQBands 26 --GVCFGQBands 27 --GVCFGQBands 28 --GVCFGQBands 29 --GVCFGQBands 30 --GVCFGQBands 31 --GVCFGQBands 32 --GVCFGQBands 33 --GVCFGQBands 34 --GVCFGQBands 35 --GVCFGQBands 36 --GVCFGQBands 37 --GVCFGQBands 38 --GVCFGQBands 39 --GVCFGQBands 40 --GVCFGQBands 41 --GVCFGQBands 42 --GVCFGQBands 43 --GVCFGQBands 44 --GVCFGQBands 45 --GVCFGQBands 46 --GVCFGQBands 47 --GVCFGQBands 48 --GVCFGQBands 49 --GVCFGQBands 50 --GVCFGQBands 51 --GVCFGQBands 52 --GVCFGQBands 53 --GVCFGQBands 54 --GVCFGQBands 55 --GVCFGQBands 56 --GVCFGQBands 57 --GVCFGQBands 58 --GVCFGQBands 59 --GVCFGQBands 60 --GVCFGQBands 70 --GVCFGQBands 80 --GVCFGQBands 90 --GVCFGQBands 99 --indelSizeToEliminateInRefModel 10 --useAllelesTrigger false --dontTrimActiveRegions false --maxDiscARExtension 25 --maxGGAARExtension 300 --paddingAroundIndels 150 --paddingAroundSNPs 20 --kmerSize 10 --kmerSize 25 --dontIncreaseKmerSizesForCycles false --allowNonUniqueKmersInRef false --numPruningSamples 1 --recoverDanglingHeads false --doNotRecoverDanglingBranches false --minDanglingBranchLength 4 --consensus false --maxNumHaplotypesInPopulation 128 --errorCorrectKmers false --minPruning 2 --debugGraphTransformations false --kmerLengthForReadErrorCorrection 25 --minObservationsForKmerToBeSolid 20 --likelihoodCalculationEngine PairHMM --base_quality_score_threshold 18 
--gcpHMM 10 --pair_hmm_implementation FASTEST_AVAILABLE --pcr_indel_model CONSERVATIVE --phredScaledGlobalReadMismappingRate 45 --nativePairHmmThreads 4 --useDoublePrecision false --debug false --useFilteredReadsForAnnotations false --emitRefConfidence NONE --bamWriterType CALLED_HAPLOTYPES --disableOptimizations false --justDetermineActiveRegions false --dontGenotype false --dontUseSoftClippedBases false --captureAssemblyFailureBAM false --errorCorrectReads false --doNotRunPhysicalPhasing false --min_base_quality_score 10 --useNewAFCalculator false --annotateNDA false --heterozygosity 0.001 --indel_heterozygosity 1.25E-4 --heterozygosity_stdev 0.01 --standard_min_confidence_threshold_for_calling 10.0 --max_alternate_alleles 6 --max_genotype_count 1024 --genotyping_mode DISCOVERY --contamination_fraction_to_filter 0.0 --output_mode EMIT_VARIANTS_ONLY --allSitePLs false --readShardSize 5000 --readShardPadding 100 --minAssemblyRegionSize 50 --maxAssemblyRegionSize 300 --assemblyRegionPadding 100 --maxReadsPerAlignmentStart 50 --activeProbabilityThreshold 0.002 --maxProbPropagationDistance 50 --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --readValidationStringency SILENT --secondsBetweenProgressUpdates 10.0 --disableSequenceDictionaryValidation false --createOutputBamIndex true --createOutputBamMD5 false --createOutputVariantIndex true --createOutputVariantMD5 false --lenient false --addOutputSAMProgramRecord true --addOutputVCFCommandLine true --cloudPrefetchBuffer 40 --cloudIndexPrefetchBuffer -1 --disableBamIndexCaching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --minimumMappingQuality 20
[July 24, 2017 1:40:48 PM EDT] Executing as [email protected] on Linux 4.10.0-27-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_141-b15; Version: 4.beta.2
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 5
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:40:49.017 INFO HaplotypeCaller - Deflater: JdkDeflater
13:40:49.017 INFO HaplotypeCaller - Inflater: JdkInflater
13:40:49.017 INFO HaplotypeCaller - Initializing engine
13:40:49.254 WARN IntelDeflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
13:40:49.260 WARN IntelDeflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
13:40:49.896 INFO HaplotypeCaller - Done initializing engine
13:40:49.902 INFO HaplotypeCallerEngine - Currently, physical phasing is only available for diploid samples.
13:40:50.226 WARN PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
13:40:50.503 WARN PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
13:40:50.925 INFO HaplotypeCaller - Shutting down engine
[July 24, 2017 1:40:50 PM EDT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=218628096
java.lang.IllegalArgumentException: samples cannot be empty
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:681)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.ReferenceConfidenceModel.<init>(ReferenceConfidenceModel.java:103)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.initialize(HaplotypeCallerEngine.java:165)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.<init>(HaplotypeCallerEngine.java:146)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.onTraversalStart(HaplotypeCaller.java:200)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:836)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
at org.broadinstitute.hellbender.Main.main(Main.java:230)

Answers

  • Thank you Sheila, my BAM file failed several of the tag requirements so I fixed that and now HaplotypeCaller is running.

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @HeidiJTP
    Hi,

    Glad to hear it! :smiley:

    -Sheila

  • Rebs (Member)

    Hi, I just had the same problem.
    After running ValidateSamFile, this appeared:

    java -jar /home/isglobal/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar ValidateSamFile -I merged_dup_fixed.bam -M SUMMARY
    12:56:19.255 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/isglobal/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    [Wed Feb 07 12:56:19 CET 2018] ValidateSamFile --INPUT merged_dup_fixed.bam --MODE SUMMARY --MAX_OUTPUT 100 --IGNORE_WARNINGS false --VALIDATE_INDEX true --INDEX_VALIDATION_STRINGENCY EXHAUSTIVE --IS_BISULFITE_SEQUENCED false --MAX_OPEN_TEMP_FILES 8000 --SKIP_MATE_VALIDATION false --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 1 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    [Wed Feb 07 12:56:19 CET 2018] Executing as [email protected] on Linux 4.13.0-32-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12; Deflater: Intel; Inflater: Intel; Picard version: Version:4.0.1.0
    INFO 2018-02-07 12:58:41 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:02:21s. Time for last 10,000,000: 141s. Last read position: Pf3D7_08_v3:549,665
    INFO 2018-02-07 13:01:17 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:04:58s. Time for last 10,000,000: 156s. Last read position: */*

    HISTOGRAM java.lang.String

    Error Type Count
    ERROR:MISSING_READ_GROUP 1
    WARNING:RECORD_MISSING_READ_GROUP 20059004

    [Wed Feb 07 13:01:19 CET 2018] picard.sam.ValidateSamFile done. Elapsed time: 5.00 minutes.
    Runtime.totalMemory()=576716800
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Tool returned:
    2

    What should I do to fix the error?
    Thanks!

  • Rebs (Member)

    Solved

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @Rebs
    Hi,

    Glad to hear it. I suspect you used Picard's AddOrReplaceReadGroups.

    -Sheila
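
Both reports in this thread point to the same cause: the BAM had no usable read group, which ValidateSamFile flags as MISSING_READ_GROUP. The sketch below is one way to check for this before running HaplotypeCaller. It assumes that the sample list comes from the SM tags on @RG header lines, so a header without them yields an empty sample set. The function name, the example header text and the LN, LB, PL and PU values are hypothetical; in practice the header text could be captured from something like `samtools view -H file.bam`.

```python
def samples_in_header(header_text):
    """Return the set of SM (sample) values found on @RG header lines."""
    samples = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:") and len(field) > 3:
                samples.add(field[3:])
    return samples


# Hypothetical header text for illustration only.
header_without_rg = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:DQA_contig\tLN:1000\n"
header_with_rg = header_without_rg + "@RG\tID:1\tSM:IRL-A\tPL:ILLUMINA\tLB:lib1\tPU:unit1\n"

print(samples_in_header(header_without_rg))  # set()
print(samples_in_header(header_with_rg))     # {'IRL-A'}
```

If the returned set is empty, adding a read group with a sample name, for example with Picard's AddOrReplaceReadGroups as Sheila suggests, is the likely fix.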
