Exception counting mismatches

SDR, Cornell, Member

Hi, I keep getting the following error when running either MergeBamAlignment or ValidateSamFile with a reference:
Exception in thread "main" htsjdk.samtools.SAMException: Exception counting mismatches for read M01032:400:000000000-APJJ1:1:1114:13732:10822 1/2 0b aligned read.
at htsjdk.samtools.util.SequenceUtil.countMismatches(SequenceUtil.java:427)
at htsjdk.samtools.util.SequenceUtil.calculateSamNmTag(SequenceUtil.java:634)
at htsjdk.samtools.SamFileValidator.validateNmTag(SamFileValidator.java:463)
at htsjdk.samtools.SamFileValidator.validateSamRecordsAndQualityFormat(SamFileValidator.java:288)
at htsjdk.samtools.SamFileValidator.validateSamFile(SamFileValidator.java:200)
at htsjdk.samtools.SamFileValidator.validateSamFileVerbose(SamFileValidator.java:160)
at picard.sam.ValidateSamFile.doWork(ValidateSamFile.java:190)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at htsjdk.samtools.util.SequenceUtil.countMismatches(SequenceUtil.java:419)
... 9 more

I have tried cleaning and sorting the SAM file.

The read in question has multiple hits:

M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_5_length_119413_cov_154.977_ID_5903 1 0 101H127M73H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_5_length_119413_cov_154.977_ID_5903 1 0 110H127M64H = 1.19287E+11 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_5_length_119413_cov_154.977_ID_5903 119287 0 64H127M110H = 119287127 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 113 NODE_5_length_119413_cov_154.977_ID_5903 119287 0 73S127M101S NODE_8_length_84624_cov_96.3483_ID_5909 1 0 TTCCGATCTTACGTAAAAAAATCCTGTCGATAGAATATCATAGATATTCATCAACAGGAAAAATTATACAGCCGATATTACATCTAGTCCGTTGAAATACGGATCACGGGTATTTGTCGCCAAGTGTAGTGAACCACAAGCTGGTAACGGTGACGTTATGAGTGAAGGAAGTGGCTAGCGAGTTCTCCGAGCCTACGAACAGAAACTTAGTCAGGCCAAATGGTGGATAAGACTGCATAACAAGTCGAAGTCCAGAGGACAAACGTAAAACCAAGACGGTAAACTAAGAGGTTATGGAAAA [email protected]GGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCCCCC SA:Z:NODE_17_length_38527_cov_274.773_ID_5927,38399,+,226S67M8S,0,0; MD:Z:127 RG:Z:1 NM:i:0AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_29_length_17544_cov_151.644_ID_5951 17419 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_29_length_17544_cov_151.644_ID_5951 17419 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_16_length_41918_cov_130.954_ID_5925 62 0 8H65M228H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_16_length_41918_cov_130.954_ID_5925 63 0 64M237H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_32_length_12481_cov_127.324_ID_5957 63 0 8H65M228H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_32_length_12481_cov_127.324_ID_5957 64 0 64M237H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_32_length_12481_cov_127.324_ID_5957 12355 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_32_length_12481_cov_127.324_ID_5957 12355 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_11_length_51109_cov_119.86_ID_5915 50983 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_11_length_51109_cov_119.86_ID_5915 50983 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_41_length_7766_cov_110.834_ID_5975 89 0 62M239H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:39T22 RG:Z:1 NM:i:1 AS:i:57
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_15_length_42291_cov_97.4995_ID_5923 42165 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_15_length_42291_cov_97.4995_ID_5923 42165 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_47_length_3395_cov_93.1233_ID_5987 3270 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_47_length_3395_cov_93.1233_ID_5987 3270 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_28_length_18605_cov_70.179_ID_5949 18479 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_28_length_18605_cov_70.179_ID_5949 18479 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_24_length_24844_cov_87.6919_ID_5941 24719 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_24_length_24844_cov_87.6919_ID_5941 24719 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_14_length_43013_cov_80.8389_ID_5921 63 0 8H65M228H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_14_length_43013_cov_80.8389_ID_5921 64 0 64M237H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_14_length_43013_cov_80.8389_ID_5921 42887 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_14_length_43013_cov_80.8389_ID_5921 42887 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_36_length_8809_cov_72.3528_ID_5965 8683 0 64H127M110H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_36_length_8809_cov_72.3528_ID_5965 8683 0 73H127M101H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_4_length_139693_cov_52.4582_ID_5901 138847 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_4_length_139693_cov_52.4582_ID_5901 138847 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_20_length_32899_cov_55.6828_ID_5933 32774 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_20_length_32899_cov_55.6828_ID_5933 32774 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_17_length_38527_cov_274.773_ID_5927 38399 0 226H67M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 TCGGCTGTATAATTTTTCCTGTTGATGAATATCTATGATATTCTATCGACAGGATTTTTTTACGTAA [email protected]FFB SA:Z:NODE_5_length_119413_cov_154.977_ID_5903,119287,-,73S127M101S,0,0; MD:Z:67 RG:Z:1 NM:i:0
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_17_length_38527_cov_274.773_ID_5927 38399 0 235H66M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 TCGGCTGTATAATTTTTCCTGTTGATGAATATCTATGATATTCTATCGACAGGATTTTTTTACGTA GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCCCCC SA:Z:NODE_8_length_84624_cov_96.3483_ID_5909,1,-,110S127M64S,0,0; MD:Z:66 RG:Z:1 NM:i:0
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_8_length_84624_cov_96.3483_ID_5909 1 0 101H127M73H = 1 127 MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 177 NODE_8_length_84624_cov_96.3483_ID_5909 1 0 110S127M64S NODE_5_length_119413_cov_154.977_ID_5903 119287 0 TTCCGATCTTTTTCCATAACCTCTTGTTTTACCGTTTTGGTTTTACGTTTGTCCTCTGGACTTCGACTTGTTATGCAGTCTTATCCACCATTTGGCCTGACTAAGTTTCTGTTCGTAGGCTCGGAGAACTCGCTAGCCACTTCCTTCACTCATAACGTCACCGTTACCAGCTTGTGGTTCACTACACTTGGCGACAAATACCCGTGATCCGTATTTCAACGGACTAGATGTAATATCGGCTGTATAATTTTTCCTGTTGATGAATATCTATGATATTCTATCGACAGGATTTTTTTACGTA -((-))16?FBD>:.).()<30424))444((-((.7;2<<,,33773;3,(2;7)9)7(@D<;;815;BEEEFB:5554584:>@B:3A=DFFFFFFFEEEFFFAFAFFFBFAGFFFDDGGGFFFGDDAGDGGGGGGGGGF8GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCCCCC SA:Z:NODE_17_length_38527_cov_274.773_ID_5927,38399,-,235S66M,0,0; MD:Z:127 RG:Z:1 NM:i:0
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_26_length_22602_cov_104.979_ID_5945 22477 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_26_length_22602_cov_104.979_ID_5945 22477 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_40_length_7807_cov_98.1341_ID_5973 7681 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_40_length_7807_cov_98.1341_ID_5973 7681 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_22_length_30412_cov_108.492_ID_5937 30287 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_22_length_30412_cov_108.492_ID_5937 30287 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_45_length_5018_cov_114.587_ID_5983 4894 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_45_length_5018_cov_114.587_ID_5983 4894 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_46_length_4846_cov_113.639_ID_5985 1 0 101H127M73H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_46_length_4846_cov_113.639_ID_5985 1 0 110H127M64H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:127 RG:Z:1 NM:i:0 AS:i:127
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_10_length_52602_cov_118.163_ID_5913 52477 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_10_length_52602_cov_118.163_ID_5913 52477 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_7_length_88768_cov_134.873_ID_5907 62 0 8H65M228H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_7_length_88768_cov_134.873_ID_5907 63 0 64M237H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 353 NODE_7_length_88768_cov_134.873_ID_5907 88642 0 228H65M8H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:65 RG:Z:1 NM:i:0 AS:i:65
M01032:400:000000000-APJJ1:1:1114:13732:10822 433 NODE_7_length_88768_cov_134.873_ID_5907 88642 0 237H64M NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:64 RG:Z:1 NM:i:0 AS:i:64
M01032:400:000000000-APJJ1:1:1114:13732:10822 369 NODE_13_length_45986_cov_146.992_ID_5919 63 0 8H67M226H NODE_8_length_84624_cov_96.3483_ID_5909 1 0 * * MD:Z:67 RG:Z:1 NM:i:0 AS:i:67
M01032:400:000000000-APJJ1:1:1114:13732:10822 417 NODE_13_length_45986_cov_146.992_ID_5919 64 0 66M235H NODE_5_length_119413_cov_154.977_ID_5903 119287 0 * * MD:Z:66 RG:Z:1 NM:i:0 AS:i:66

Thank you for any help!
Sara
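
A note on the records above: the trace reports a "0b aligned read", and many of the listed records carry `*` in the SEQ column while still being aligned and tagged with NM. Whether that is what trips the NM validation is an assumption, not something this thread confirms. A minimal Python sketch for listing such records from SAM text (for example the output of `samtools view`) might look like the following; the function name `records_without_bases` is hypothetical.

```python
def records_without_bases(sam_lines):
    """Return aligned records whose SEQ field is '*'.

    Each hit is (qname, flag, rname, pos, cigar, has_nm_tag).
    Header lines (starting with '@') and short lines are skipped.
    """
    hits = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        # SAM is tab-separated; splitting on any whitespace also copes
        # with records pasted from a web page where tabs became spaces.
        fields = line.split()
        if len(fields) < 11:
            continue
        qname, rname, pos, cigar, seq = (
            fields[0], fields[2], fields[3], fields[5], fields[9]
        )
        flag = int(fields[1])
        unmapped = bool(flag & 0x4)  # 0x4 = segment unmapped
        if seq == "*" and not unmapped:
            has_nm = any(tag.startswith("NM:i:") for tag in fields[11:])
            hits.append((qname, flag, rname, pos, cigar, has_nm))
    return hits
```

Feeding it the lines of a SAM file returns the records that have an alignment but no stored bases, which can then be inspected or filtered before validation.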

Answers

  • Sheila, Broad Institute, Member, Broadie, admin

    @SDR
    Hi Sara,

    Can you please post the BAM header and FASTA .dict file contents here? This looks like an issue where your contigs are not ordered the same in the files.

    -Sheila

  • reem_alhamidi, Nottingham UK, Member
    Hey, I am getting the same error message. How can I resolve this issue? I am using the ValidateSamFile tool.
  • SkyWarrior, Turkey, Member ✭✭✭
    Not relevant.. Sorry.. my mistake...
  • reem_alhamidi, Nottingham UK, Member
    I have two samples, and each sample consists of 5 BAM files (amplicons). Originally, I had 2 FASTQ files (5 PCRs/amplicons were pooled together for sequencing), from which I created 2 BAM files with all reads mapped to a custom reference sequence. After that, I created 5 different BAM files for each sample, one per amplicon. 5 of my 10 BAM files worked perfectly, but the other 5 fail with that error message. I'm not sure why the rest worked, since I used the same syntax for all of them.
  • bhanuGandham, Cambridge MA, Member, Administrator, Broadie, Moderator, admin

    Hi @reem_alhamidi

    Would you please post the version of GATK you are using, the exact command, and the entire error log?
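
Sheila's suggestion earlier in the thread, comparing the contig order in the BAM header against the FASTA .dict file, can be sketched in Python as below. This is only an illustration under stated assumptions: both inputs are taken as plain SAM-header text (for example from `samtools view -H` and from reading the .dict file directly), and the helper names `sq_names` and `first_order_mismatch` are hypothetical.

```python
def sq_names(header_text):
    """Return contig names (SN values) from @SQ lines, in file order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def first_order_mismatch(bam_names, dict_names):
    """Return (index, bam_name, dict_name) at the first position where
    the two contig lists differ, or None if they are identical.
    A missing entry on either side is reported as None."""
    for i in range(max(len(bam_names), len(dict_names))):
        a = bam_names[i] if i < len(bam_names) else None
        b = dict_names[i] if i < len(dict_names) else None
        if a != b:
            return (i, a, b)
    return None
```

A result of `None` means both files list the same contigs in the same order; any other result points at the first position to look at.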
