
Cannot retrieve file pointer positions in SAM file

Hello,

I am receiving the following error. I am working with SAM files that were exported from CLC and then edited with Picard tools to add read groups. I am not sure whether I need an additional step to solve this problem, and I cannot find any documentation on this error.

Please let me know what I need to do to correct this issue.

Thank you!

gatk -T HaplotypeCaller -R spinach_assembly-repeatdetect_PACBIO_V1.3_formated_60.fa -I .sam.list -drf DuplicateRead --alleles Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES --output_mode EMIT_ALL_SITES -o output_raw_unfiltered_spinach_snps_gbs.vcf
INFO 14:48:44,450 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:44,453 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12
INFO 14:48:44,454 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:48:44,454 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:48:44,458 HelpFormatter - Program Args: -T HaplotypeCaller -R spinach_assembly-repeatdetect_PACBIO_V1.3_formated_60.fa -I .sam.list -drf DuplicateRead --alleles Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES --output_mode EMIT_ALL_SITES -o output_raw_unfiltered_spinach_snps_gbs.vcf
INFO 14:48:44,468 HelpFormatter - Executing as [email protected] on Linux 2.6.18-348.12.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13.
INFO 14:48:44,469 HelpFormatter - Date/Time: 2015/07/31 14:48:44
INFO 14:48:44,469 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:44,470 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:48:45,102 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:48:45,385 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 14:48:45,394 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:48:48,432 SAMDataSource$SAMReaders - Init 50 BAMs in last 3.04 s, 50 of 80 in 3.04 s / 0.05 m (16.46 tasks/s). 30 remaining with est. completion in 1.82 s / 0.03 m
INFO 14:48:50,052 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 4.66
INFO 14:48:50,164 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 14:48:54,742 RMDTrackBuilder - Writing Tribble index to disk for file /local/scratch/scratch/Amanda/Spinach_GBS/Unfiltered_Spinach_PacBio_Reseq_12_Geno_Assay_SNP.fixed.noblanks.vcf.idx
INFO 14:48:58,784 GenomeAnalysisEngine - Preparing for traversal over 80 BAM files
INFO 14:49:00,054 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A BAM ERROR has occurred (version 3.4-46-gbc02625):
ERROR
ERROR This means that there is something wrong with the BAM file(s) you provided.
ERROR The error message below tells you what is the problem.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum until you have followed these instructions:
ERROR - Make sure that your BAM file is well-formed by running Picard's validator on it
ERROR (see http://picard.sourceforge.net/command-line-overview.shtml#ValidateSamFile for details)
ERROR - Ensure that your BAM index is not corrupted: delete the current one and regenerate it with 'samtools index'
ERROR
ERROR MESSAGE: Cannot retrieve file pointers within SAM text files.
ERROR ------------------------------------------------------------------------------------------

Answers

  • ahulse, Member

    Additionally, when I run ValidateSamFile I receive errors saying that the NM tag (nucleotide differences) is missing. I am not sure how to correct this, or whether it is what is causing the other error.

  • Sheila, Broad Institute Member, Broadie admin

    @ahulse
    Hi,

    I am not sure if the -I file can have a .list extension. Can you try renaming the file to a .bam file? Also, please post the exact error message you get from Picard's ValidateSamFile.

    Thanks,
    Sheila

  • ahulse, Member

    Hi Geraldine,

    You are right. It had been a while since I last used GATK, and I forgot that it only accepts BAM format. That solved the problem. For people using many files, the .list option is very useful, and it is what I have been using.
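
For anyone who hits the same "Cannot retrieve file pointers within SAM text files" message, here is a minimal sketch of a pre-flight check on a .list file. It only illustrates the fix described above (GATK 3 wants BAM inputs, not SAM text files). The thread does not say how the files were actually converted, so the use of samtools here is an assumption, and the function names `non_bam_entries` and `conversion_commands` are hypothetical. The script builds the conversion commands but does not run them.

```python
from pathlib import Path


def non_bam_entries(list_path):
    """Return the entries of a .list file (one path per line) that do not end in .bam."""
    entries = []
    for line in Path(list_path).read_text().splitlines():
        entry = line.strip()
        # Skip blank lines; flag anything that is not a .bam path (e.g. .sam).
        if entry and not entry.lower().endswith(".bam"):
            entries.append(entry)
    return entries


def conversion_commands(sam_path):
    """Build (but do not run) samtools commands for a sorted, indexed BAM.

    Assumption: samtools is the conversion tool; the thread does not name one.
    """
    bam_path = str(Path(sam_path).with_suffix(".bam"))
    return [
        # Coordinate-sort the SAM and write the result as BAM.
        ["samtools", "sort", "-O", "bam", "-o", bam_path, sam_path],
        # Index the sorted BAM so that GATK can seek within it.
        ["samtools", "index", bam_path],
    ]
```

A possible way to use it: call `non_bam_entries` on the list file passed to -I, print the commands from `conversion_commands` for each flagged path, run them by hand, and then point the .list file at the new .bam paths before rerunning HaplotypeCaller.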
