
I'm trying to collect insert size metrics from a .bam file with CollectInsertSizeMetrics, but I'm getting an error:

[Mon Oct 02 18:12:59 GMT 2017] CollectInsertSizeMetrics HISTOGRAM_FILE=insert_size_histogram.pdf INPUT=input.bam OUTPUT=output_insert_size_metrics.txt DEVIATIONS=10.0 MINIMUM_PCT=0.05 METRIC_ACCUMULATION_LEVEL=[ALL_READS] INCLUDE_DUPLICATES=false ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
WARNING 2017-10-02 18:12:59 SinglePassSamProgram File reports sort order 'queryname', assuming it's coordinate sorted anyway.
WARNING 2017-10-02 18:12:59 CollectInsertSizeMetrics All data categories were discarded because they contained < 0.05 of the total aligned paired data.
WARNING 2017-10-02 18:12:59 CollectInsertSizeMetrics Total mapped pairs in all categories: 0.0
[Mon Oct 02 18:12:59 GMT 2017] picard.analysis.CollectInsertSizeMetrics done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=126877696

Answers

  • Sheila (Broad Institute; Member, Broadie, admin)

    @bio_d
    Hi,

    It looks like your reads are not mapped properly. How did you map the reads to your reference genome? What kind of data are you working with?

    Thanks,
    Sheila

  • Hi Sheila,

    I am trying to do a de novo assembly. I have FASTQ files for paired-end and mate-pair libraries (two files for each of the 200 bp, 5.2 kb, and 10 kb libraries). As a pre-processing step, I want to collect the insert size metrics and jump metrics for FRAG_INSERT (paired-end), JUMP_READS (mate-pair), and LONG_JUMP_READS (mate-pair). This is only the first step; I expect to need two more similar steps, one for each of the two mate-pair read types.

    After that, I plan to create two CSV files, in_libs.csv and in_groups.csv, and use ALLPATHS-LG (The ALLPATHS-LG manual r). Since this is de novo, I don't think I need a reference; in fact, I don't have one available. I might be completely wrong, but I don't know how else to proceed. How do I collect metrics for both the paired-end and mate-pair libraries? As I understand it, the BAM/SAM files I generated with the FastqToSam tool are unaligned, which is why I am not able to use CollectInsertSizeMetrics.

    However, there must be some way or tool to get aligned BAM files from FASTQ files, even for a de novo assembly.

  • Sheila (Broad Institute; Member, Broadie, admin)

    @bio_d
    Hi,

    I will continue this in this thread. No need to post twice. It just creates clutter on the forum.

    -Sheila
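
The diagnosis in this thread (the BAM came from FastqToSam, so it is unaligned and contains no mapped pairs) can be checked directly from the SAM FLAG field. Below is a minimal sketch, not Picard's actual filter: it assumes SAM-formatted text lines as input (for example, the output of `samtools view`), and the function name `count_usable_pairs` is made up for illustration. It counts records that are paired, have both mates mapped, are not secondary or supplementary, and carry a non-zero TLEN, which is roughly what an insert size calculation needs.

```python
# SAM FLAG bits, as defined in the SAM specification.
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_usable_pairs(sam_lines):
    """Return (total_records, usable_records) for an iterable of SAM text lines.

    "Usable" is an approximation chosen for this sketch: paired, both mates
    mapped, primary alignment, and a non-zero template length (TLEN).
    """
    total = 0
    usable = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])  # column 2: FLAG
        tlen = int(fields[8])  # column 9: TLEN
        total += 1
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if not flag & PAIRED:
            continue
        if flag & (UNMAPPED | MATE_UNMAPPED):
            continue
        if tlen == 0:
            continue
        usable += 1
    return total, usable
```

To run it on a real file, one option is to pipe `samtools view input.bam` into a small script that calls `count_usable_pairs(sys.stdin)`. An unaligned BAM should report zero usable records, which is consistent with the "Total mapped pairs in all categories: 0.0" warning in the log above.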
