IllegalArgumentException: samples cannot be empty

I am trying to run HaplotypeCaller on some data that I know is messy and would fail some of the filters, so I have run it both with and without --disableToolDefaultReadFilters. Either way I don't get any output file, but I do get the message "samples cannot be empty". Does this mean that my data is still failing some built-in control, or am I doing something else wrong? I have checked the SQ, and when I run CountReads (with --disableToolDefaultReadFilters) it results in "Tool returned: 24634".

Here's my command:

$ java -jar ~/Downloads/gatk-4.beta.2/gatk-package-4.beta.2-local.jar HaplotypeCaller -R DQA_contig.fasta -ploidy 50 -I IRL-A.bam.sorted.bam -O IRL-A.vcf --disableToolDefaultReadFilters
13:40:49.007 WARN IntelGKLUtils - Error starting process to check for AVX support : grep -i avx /proc/cpuinfo
13:40:49.014 WARN IntelGKLUtils - Error starting process to check for AVX support : grep -i avx /proc/cpuinfo
[July 24, 2017 1:40:48 PM EDT] HaplotypeCaller --sample_ploidy 50 --output IRL-A.vcf --input IRL-A.bam.sorted.bam --reference DQA_contig.fasta --disableToolDefaultReadFilters true --group StandardAnnotation --group StandardHCAnnotation --GVCFGQBands 1 --GVCFGQBands 2 --GVCFGQBands 3 --GVCFGQBands 4 --GVCFGQBands 5 --GVCFGQBands 6 --GVCFGQBands 7 --GVCFGQBands 8 --GVCFGQBands 9 --GVCFGQBands 10 --GVCFGQBands 11 --GVCFGQBands 12 --GVCFGQBands 13 --GVCFGQBands 14 --GVCFGQBands 15 --GVCFGQBands 16 --GVCFGQBands 17 --GVCFGQBands 18 --GVCFGQBands 19 --GVCFGQBands 20 --GVCFGQBands 21 --GVCFGQBands 22 --GVCFGQBands 23 --GVCFGQBands 24 --GVCFGQBands 25 --GVCFGQBands 26 --GVCFGQBands 27 --GVCFGQBands 28 --GVCFGQBands 29 --GVCFGQBands 30 --GVCFGQBands 31 --GVCFGQBands 32 --GVCFGQBands 33 --GVCFGQBands 34 --GVCFGQBands 35 --GVCFGQBands 36 --GVCFGQBands 37 --GVCFGQBands 38 --GVCFGQBands 39 --GVCFGQBands 40 --GVCFGQBands 41 --GVCFGQBands 42 --GVCFGQBands 43 --GVCFGQBands 44 --GVCFGQBands 45 --GVCFGQBands 46 --GVCFGQBands 47 --GVCFGQBands 48 --GVCFGQBands 49 --GVCFGQBands 50 --GVCFGQBands 51 --GVCFGQBands 52 --GVCFGQBands 53 --GVCFGQBands 54 --GVCFGQBands 55 --GVCFGQBands 56 --GVCFGQBands 57 --GVCFGQBands 58 --GVCFGQBands 59 --GVCFGQBands 60 --GVCFGQBands 70 --GVCFGQBands 80 --GVCFGQBands 90 --GVCFGQBands 99 --indelSizeToEliminateInRefModel 10 --useAllelesTrigger false --dontTrimActiveRegions false --maxDiscARExtension 25 --maxGGAARExtension 300 --paddingAroundIndels 150 --paddingAroundSNPs 20 --kmerSize 10 --kmerSize 25 --dontIncreaseKmerSizesForCycles false --allowNonUniqueKmersInRef false --numPruningSamples 1 --recoverDanglingHeads false --doNotRecoverDanglingBranches false --minDanglingBranchLength 4 --consensus false --maxNumHaplotypesInPopulation 128 --errorCorrectKmers false --minPruning 2 --debugGraphTransformations false --kmerLengthForReadErrorCorrection 25 --minObservationsForKmerToBeSolid 20 --likelihoodCalculationEngine PairHMM --base_quality_score_threshold 18 
--gcpHMM 10 --pair_hmm_implementation FASTEST_AVAILABLE --pcr_indel_model CONSERVATIVE --phredScaledGlobalReadMismappingRate 45 --nativePairHmmThreads 4 --useDoublePrecision false --debug false --useFilteredReadsForAnnotations false --emitRefConfidence NONE --bamWriterType CALLED_HAPLOTYPES --disableOptimizations false --justDetermineActiveRegions false --dontGenotype false --dontUseSoftClippedBases false --captureAssemblyFailureBAM false --errorCorrectReads false --doNotRunPhysicalPhasing false --min_base_quality_score 10 --useNewAFCalculator false --annotateNDA false --heterozygosity 0.001 --indel_heterozygosity 1.25E-4 --heterozygosity_stdev 0.01 --standard_min_confidence_threshold_for_calling 10.0 --max_alternate_alleles 6 --max_genotype_count 1024 --genotyping_mode DISCOVERY --contamination_fraction_to_filter 0.0 --output_mode EMIT_VARIANTS_ONLY --allSitePLs false --readShardSize 5000 --readShardPadding 100 --minAssemblyRegionSize 50 --maxAssemblyRegionSize 300 --assemblyRegionPadding 100 --maxReadsPerAlignmentStart 50 --activeProbabilityThreshold 0.002 --maxProbPropagationDistance 50 --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --readValidationStringency SILENT --secondsBetweenProgressUpdates 10.0 --disableSequenceDictionaryValidation false --createOutputBamIndex true --createOutputBamMD5 false --createOutputVariantIndex true --createOutputVariantMD5 false --lenient false --addOutputSAMProgramRecord true --addOutputVCFCommandLine true --cloudPrefetchBuffer 40 --cloudIndexPrefetchBuffer -1 --disableBamIndexCaching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --minimumMappingQuality 20
[July 24, 2017 1:40:48 PM EDT] Executing as heidi@heidi-HP-Pavilion-dv6-Notebook-PC on Linux 4.10.0-27-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_141-b15; Version: 4.beta.2
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 5
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
13:40:49.017 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:40:49.017 INFO HaplotypeCaller - Deflater: JdkDeflater
13:40:49.017 INFO HaplotypeCaller - Inflater: JdkInflater
13:40:49.017 INFO HaplotypeCaller - Initializing engine
13:40:49.254 WARN IntelDeflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
13:40:49.260 WARN IntelDeflaterFactory - IntelInflater is not supported, using Java.util.zip.Inflater
13:40:49.896 INFO HaplotypeCaller - Done initializing engine
13:40:49.902 INFO HaplotypeCallerEngine - Currently, physical phasing is only available for diploid samples.
13:40:50.226 WARN PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
13:40:50.503 WARN PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
13:40:50.925 INFO HaplotypeCaller - Shutting down engine
[July 24, 2017 1:40:50 PM EDT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=218628096
java.lang.IllegalArgumentException: samples cannot be empty
at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:681)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.ReferenceConfidenceModel.<init>(ReferenceConfidenceModel.java:103)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.initialize(HaplotypeCallerEngine.java:165)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.<init>(HaplotypeCallerEngine.java:146)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.onTraversalStart(HaplotypeCaller.java:200)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:836)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:189)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
at org.broadinstitute.hellbender.Main.main(Main.java:230)
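
The stack trace shows the exception being raised from HaplotypeCallerEngine.initialize, before any reads are traversed, so the read filters are probably not involved. HaplotypeCaller takes its sample names from the SM field of the @RG lines in the BAM header, and a header without read groups leaves that list empty. A minimal sketch for checking which sample names a header declares follows. It assumes a POSIX shell with awk; the helper name rg_samples is made up for illustration, and the demo header values are invented rather than taken from the real file.

```shell
#!/bin/sh
# List the distinct sample names (SM fields) declared on @RG header lines.
# Reads SAM header text on stdin and prints one sample name per line.
# Prints nothing when the header has no @RG line carrying an SM field.
rg_samples() {
    awk -F '\t' '
        $1 == "@RG" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) print substr($i, 4)
        }' | sort -u
}

# Demo on an invented header (the contig length and read-group values are made up).
printf '@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:DQA_contig\tLN:1000\n@RG\tID:1\tSM:IRL-A\tPL:ILLUMINA\n' | rg_samples
```

With samtools installed (an assumption, it is not used elsewhere in this thread), the real header could be piped in with `samtools view -H IRL-A.bam.sorted.bam | rg_samples`. No output would mean the header declares no samples.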

Answers

  • Thank you, Sheila. My BAM file failed several of the tag requirements, so I fixed that, and now HaplotypeCaller is running.

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @HeidiJTP
    Hi,

    Glad to hear it! :smiley:

    -Sheila

  • Rebs (Member)

    Hi, I just had the same problem.
    After running ValidateSamFile, this appeared:

    java -jar /home/isglobal/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar ValidateSamFile -I merged_dup_fixed.bam -M SUMMARY
    12:56:19.255 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/isglobal/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    [Wed Feb 07 12:56:19 CET 2018] ValidateSamFile --INPUT merged_dup_fixed.bam --MODE SUMMARY --MAX_OUTPUT 100 --IGNORE_WARNINGS false --VALIDATE_INDEX true --INDEX_VALIDATION_STRINGENCY EXHAUSTIVE --IS_BISULFITE_SEQUENCED false --MAX_OPEN_TEMP_FILES 8000 --SKIP_MATE_VALIDATION false --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 1 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    [Wed Feb 07 12:56:19 CET 2018] Executing as isglobal@isglobal-SATELLITE-PRO-C850-136 on Linux 4.13.0-32-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12; Deflater: Intel; Inflater: Intel; Picard version: Version:4.0.1.0
    INFO 2018-02-07 12:58:41 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:02:21s. Time for last 10,000,000: 141s. Last read position: Pf3D7_08_v3:549,665
    INFO 2018-02-07 13:01:17 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:04:58s. Time for last 10,000,000: 156s. Last read position: */*

    HISTOGRAM java.lang.String

    Error Type                           Count
    ERROR:MISSING_READ_GROUP             1
    WARNING:RECORD_MISSING_READ_GROUP    20059004

    [Wed Feb 07 13:01:19 CET 2018] picard.sam.ValidateSamFile done. Elapsed time: 5.00 minutes.
    Runtime.totalMemory()=576716800
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Tool returned:
    2

    What should I do to fix the error?
    Thanks!

  • Rebs (Member)

    Solved

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @Rebs
    Hi,

    Glad to hear it. I suspect you used Picard's AddOrReplaceReadGroups.

    -Sheila
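
For readers who arrive here with the same MISSING_READ_GROUP report, here is a rough sketch of what an AddOrReplaceReadGroups call might look like with the GATK 4.0.1.0 jar used above. The helper add_rg_cmd is hypothetical and only prints the command so it can be reviewed before running. The output file name, the sample name, and the ID, LB, PL and PU values are placeholders, not values from this thread, and should be replaced with ones that describe the actual sequencing run.

```shell
#!/bin/sh
# Print (do not run) an AddOrReplaceReadGroups command for a BAM lacking read groups.
# Arguments: 1 = path to the GATK jar, 2 = input BAM, 3 = output BAM, 4 = sample name.
# The ID, LB, PL and PU values are placeholders chosen for illustration only.
add_rg_cmd() {
    printf 'java -jar %s AddOrReplaceReadGroups -I %s -O %s --RGID %s --RGLB %s --RGPL %s --RGPU %s --RGSM %s\n' \
        "$1" "$2" "$3" "rg1" "lib1" "ILLUMINA" "unit1" "$4"
}

# Example: the jar and input come from the post above; the output name and
# sample name (merged_dup_fixed.rg.bam, sample1) are invented.
add_rg_cmd /home/isglobal/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar \
    merged_dup_fixed.bam merged_dup_fixed.rg.bam sample1
```

After running the printed command, re-running ValidateSamFile on the new BAM should show whether the read-group error is gone. The new BAM also needs an index before it is passed to HaplotypeCaller.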
