MATE_NOT_FOUND error from Picard ValidateSamFile

Hi,
I was using Picard to remove duplicates. According to the Picard documentation, SAM/BAM files should first be checked with ValidateSamFile. When I did that, I got the following error:
command line: java -Xmx4g -jar ~/picard/picard.jar ValidateSamFile I=/volumes/elements/atac/SRR5808766_RG.sam MODE=SUMMARY
"# HISTOGRAM java.lang.String
error type: count
error: MATE_NOT_FOUND 19598152"

I then tried to fix the error with FixMateInformation:
command line: java -Xmx4g -jar ~/picard/picard.jar FixMateInformation I=/volumes/elements/atac/SRR5808766_RG.sam O=/volumes/elements/atac/SRR5808766_fixed_mate.sam

I rechecked the output SAM file with ValidateSamFile and still got the same error:
command line: java -Xmx4g -jar ~/picard/picard.jar ValidateSamFile I=/volumes/elements/atac/SRR5808766_fixed_mate.sam MODE=SUMMARY
"# HISTOGRAM java.lang.String
error type: count
error: MATE_NOT_FOUND 19598152"
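
Because both ValidateSamFile runs report the same count, it can help to compare the two summaries programmatically rather than by eye. The following is a minimal Python sketch, assuming each summary line has the form `ERROR:NAME count` (or `error: NAME count`, as pasted above). The helper names `parse_summary` and `changed_counts` are hypothetical and not part of Picard.

```python
import re

# Assumed line shape: "ERROR:MATE_NOT_FOUND 19598152" or "error: MATE_NOT_FOUND 19598152".
SUMMARY_LINE = re.compile(r"^(ERROR|WARNING):\s*(\S+)\s+(\d+)\s*$", re.IGNORECASE)


def parse_summary(text):
    """Return {error_name: count} for every line matching the assumed shape."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip().strip('"'))
        if match:
            counts[match.group(2)] = int(match.group(3))
    return counts


def changed_counts(before, after):
    """Return the error names whose counts differ between two summaries."""
    names = set(before) | set(after)
    return {
        name: (before.get(name, 0), after.get(name, 0))
        for name in names
        if before.get(name, 0) != after.get(name, 0)
    }


original = parse_summary('# HISTOGRAM java.lang.String\nerror type: count\nerror: MATE_NOT_FOUND 19598152"')
fixed = parse_summary("ERROR:MATE_NOT_FOUND\t19598152")
print(original, changed_counts(original, fixed))
```

An empty result from `changed_counts` means the step in between (here FixMateInformation) did not change any error count.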

So I checked the first lines of the SAM file:
@HD VN:1.0 SO:unsorted
@SQ SN:chr10 LN:130694993
@SQ SN:chr11 LN:122082543
@SQ SN:chr12 LN:120129022
@SQ SN:chr13 LN:120421639
@SQ SN:chr14 LN:124902244
@SQ SN:chr15 LN:104043685
@SQ SN:chr16 LN:98207768
@SQ SN:chr17 LN:94987271
@SQ SN:chr18 LN:90702639
@SQ SN:chr19 LN:61431566
@SQ SN:chr1 LN:195471971
@SQ SN:chr2 LN:182113224
@SQ SN:chr3 LN:160039680
@SQ SN:chr4 LN:156508116
@SQ SN:chr5 LN:151834684
@SQ SN:chr6 LN:149736546
@SQ SN:chr7 LN:145441459
@SQ SN:chr8 LN:129401213
@SQ SN:chr9 LN:124595110
@SQ SN:chrM LN:16299
@SQ SN:chrX LN:171031299
@SQ SN:chrY LN:91744698
@PG ID:Bowtie VN:1.2.2 CL:"/Users/ragna/applications/bowtie/bowtie-align-s --wrapper basic-0 /Users/ragna/applications/bowtie/indexes/genome -1 /volumes/elements/atac/SRR5808766_1.fastq -2 /volumes/elements/atac/SRR5808766_2.fastq -X 2000 -m 1 -p 4 -S"
SRR5808766.49.2 163 chr10 13965004 255 51M = 13965181 228 GGCTGGGTGTGATTTCAGTCCCTAGCAGAGAACCAGAAGCGCGCAGCTGGA DDADDIIHIIIIIIIHHIIIIHIIIIIIIIIIIHHIFIIIDHGIIIHEHIE XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.49.1 83 chr10 13965181 255 51M = 13965004 -228 GGGGCGCCCTGGGGAGTGACCCCGGGGAAAGCAATTGGGGCGTTTCGAGCT IHIHHHFHHIIIHHHHHIIHGIIIIIIIHHHIIIIIHIHIIIIIIIDCDDD XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.50.1 99 chrM 1441 255 51M = 1472 82 TAGCTGGTTACCCAAAAAATGAATTTAAGTTCAATTTTAAACTTGCTAAAA DDDDDIIIIIIIIIIIIIIIIIIIIIIHHIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.50.2 147 chrM 1472 255 51M = 1441 -82 CAATTTTAAACTTGCTAAAAAAACAACAAAATCAAAAAGTAAGTTTAGATT HIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.51.2 163 chr13 40971416 255 51M = 40971431 66 TCTCTGTGGTGTTTGCTCAAAGATGAAACTGCTTGGTGACACATTTCTTAC DADCDIIIIIIFHHIHIIIIIIIIGHHIHIIIHHHHHHHHIIIIIIIIIHI XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.51.1 83 chr13 40971431 255 51M = 40971416 -66 CTCAAAGATGAAACTGCTTGGTGACACATTTCTTACAACACACCCCTGGAT HIIIIIIIHGHG<IIIHIIIIIHIHIIIIIIIHHIIIHHHIIIHIHDDDDD XA:i:0 MD:Z:51 NM:i:0 XM:i:2
SRR5808766.52.1 77 * 0 0 * * 0 0 CTGCAAATGGTTGTAAAATGCCGTATGGACCAACAATGTTAGGGCCTTTTC DDDDDIIIIIIIIHHHIIIIIIIIIIIIIIIIIIIIHIIHIIIHHIIIIII XM:i:0
SRR5808766.52.2 141 * 0 0 * * 0 0 GAAAAGGCCCTAACATTGTTGGTCCATACGGCATTTTACAACCATTTGCAG DDDDDIIGIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XM:i:0
SRR5808766.53.1 77 * 0 0 * * 0 0 TGTAGGAACCCTAAACCTCATAATTTTATCATTCACAACACACACCTTAGA DDDDDIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XM:i:0
SRR5808766.53.2 141 * 0 0 * * 0 0 TCTAAGGTGTGTGTTGTGAATGATAAAATTATGAGGTTTAGGGTTCCTACA DDDDDIIHIIIHIHIIIIIIIIIIIIIIIIIIIIIIGHIIIIIIIIIIIII XM:i:0

Could someone help me solve this problem?
Thanks a lot.
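
For readers following the FLAG column (the second field) in the records above, the values can be decoded bit by bit. This is a minimal Python sketch using the bit definitions from the SAM specification. The helper name `decode_flag` and the short labels are hypothetical, chosen only for illustration.

```python
# FLAG bit meanings as defined in the SAM specification; the short labels are ours.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [label for bit, label in SAM_FLAG_BITS.items() if flag & bit]


# FLAG values that appear in the records pasted above.
for flag in (163, 83, 99, 147, 77, 141):
    print(flag, decode_flag(flag))
```

Every FLAG value in the excerpt has the paired bit set, so the flags themselves mark these reads as paired. Values 77 and 141 describe pairs in which both reads are unmapped.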

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @lance_tsai
    Hi,

    Can you check if this issue is present in the BAM file after alignment? I am wondering when the issue starts (after MarkDuplicates or after alignment).

    Thanks,
    Sheila

  • @Sheila said:
    @lance_tsai
    Hi,

    Can you check if this issue is present in the BAM file after alignment? I am wondering when the issue starts (after MarkDuplicates or after alignment).

    Thanks,
    Sheila

    Hi Sheila,
    This issue is present in the SAM/BAM file. I don't know how the problem arose during alignment.
    Here is the command line I used to run the bowtie alignment:
    "bowtie /Users/ragna/applications/bowtie/indexes/genome -1 /volumes/elements/atac/SRR5808766_1.fastq -2 /volumes/elements/atac/SRR5808766_2.fastq -X 2000 -m 1 -p 4 -S"

    Thanks a lot.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)
    edited April 2018

    @lance_tsai
    Hi,

    Can you run ValidateSamFile on your BAM file you get from the aligner and see if the errors are present in that? If they are, it is an issue with the aligner. If they are only present after MarkDuplicates, it is an issue with MarkDuplicates.

    -Sheila

  • @Sheila said:
    @lance_tsai
    Hi,

    Can you run ValidateSamFile on your BAM file you get from the aligner and see if the errors are present in that? If they are, it is an issue with the aligner. If they are only present after MarkDuplicates, it is an issue with MarkDuplicates.

    -Sheila

    Hi Sheila,
    Sorry for the confusing message. What I posted at first is the result of running ValidateSamFile on the original SAM file I got from the bowtie aligner.
    I want to know what the problem is with my aligned SAM file. Why can't Picard recognize the mates, given that the second column, the FLAG field, is present and marks the reads as paired?
    Would MergeBamAlignment or samtools fixmate fix this problem?

    Thanks a lot!
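
One detail visible in the records pasted at the top of this thread is that the two reads of each pair carry different names (for example SRR5808766.49.1 and SRR5808766.49.2), whereas the SAM specification expects reads from the same template to share a QNAME. Whether this explains the MATE_NOT_FOUND count is not confirmed anywhere in this thread. The Python sketch below is only a way to test that hypothesis on a small excerpt. The function name `count_unmatched` and the suffix-stripping rule are assumptions made for illustration, and the records are truncated to their first four columns.

```python
from collections import Counter


def count_unmatched(sam_lines, strip_suffix=False):
    """Count paired primary records whose QNAME does not occur exactly twice.

    strip_suffix=True drops a trailing ".1"/".2" before comparing names; this is a
    hypothetical normalization used only to test the idea, not a Picard feature.
    """
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        qname, flag = line.split()[:2]
        flag = int(flag)
        if not flag & 0x1:  # not marked as paired
            continue
        if flag & 0x900:  # skip secondary and supplementary records
            continue
        if strip_suffix and qname.endswith((".1", ".2")):
            qname = qname[:-2]
        names.append(qname)
    counts = Counter(names)
    return sum(n for n in counts.values() if n != 2)


# First four columns of records from the excerpt in the original post.
excerpt = [
    "@HD\tVN:1.0\tSO:unsorted",
    "SRR5808766.49.2\t163\tchr10\t13965004",
    "SRR5808766.49.1\t83\tchr10\t13965181",
    "SRR5808766.50.1\t99\tchrM\t1441",
    "SRR5808766.50.2\t147\tchrM\t1472",
]
print(count_unmatched(excerpt))
print(count_unmatched(excerpt, strip_suffix=True))
```

If the count drops to zero once the suffixes are ignored, the read names are a plausible place to look next. The sketch keeps every name in memory, so it is meant for a small excerpt rather than the full file.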
