GATK and ExomeCNV

Hi, I'm trying to use ExomeCNV to detect CNVs on two chromosomes (13 and 17, for BRCA2 and BRCA1 respectively).
I am following the manual at https://secure.genome.ucla.edu/index.php/ExomeCNV_User_Guide. For the first part (the GATK part) I used the command from the instructions, changing only the reference genome (I use hg19.fasta; is that correct?).
My output is:
INFO 13:19:22,999 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:19:23,000 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 13:19:23,001 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 13:19:23,001 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 13:19:23,003 HelpFormatter - Program Args: -T DepthOfCoverage -omitBaseOutput -omitLocusTable -R ../../../reference_genome/hg19.fasta -I ../OG040.bam -L ../../../reference_genome/exome.interval_list -o output_controllo.coverage
INFO 13:19:23,005 HelpFormatter - Executing as [email protected] on Linux 4.4.0-22-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14.
INFO 13:19:23,005 HelpFormatter - Date/Time: 2016/05/17 13:19:22
INFO 13:19:23,005 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:19:23,006 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:19:23,295 GenomeAnalysisEngine - Strictness is SILENT
INFO 13:19:23,348 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 13:19:23,352 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 13:19:23,369 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 13:19:23,381 IntervalUtils - Processing 18624 bp from intervals
INFO 13:19:23,423 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 13:19:23,442 GenomeAnalysisEngine - Done preparing for traversal
INFO 13:19:23,442 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 13:19:23,442 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 13:19:23,442 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 13:19:23,443 DepthOfCoverage - Per-Locus Depth of Coverage output was omitted
INFO 13:19:42,619 DepthOfCoverage - Printing summary info
INFO 13:19:43,028 ProgressMeter - done 45657.0 19.0 s 7.1 m 99.9% 19.0 s 0.0 s
INFO 13:19:43,028 ProgressMeter - Total runtime 19.59 secs, 0.33 min, 0.01 hours
INFO 13:19:43,030 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 482973 total reads (0.00%)
INFO 13:19:43,030 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 13:19:43,030 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 13:19:43,031 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 13:19:43,031 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 13:19:43,031 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 13:19:43,031 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 13:19:44,231 GATKRunReport - Uploaded run statistics report to AWS S3
Is this correct?
I then proceeded to the second part, but when I try to load output.coverage.sample_interval_summary I get an error, specifically:
"The line 1 doesn't have 15 elements".
What am I doing wrong?
Thank you for the help.
Answers
@Pandora
Hi,
Are you asking about the R command? If so, I'm afraid we cannot help. However, if you are asking about a GATK command, please post the exact command you ran and the exact error message you get.
Thanks,
Sheila
The exact command that I run is:
java -jar ~/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -T DepthOfCoverage -omitBaseOutput -omitLocusTable -R ../../reference_genome/GRCh37.p13.genome.fa -I OG040.bam -L ../../reference_genome/exome.interval_list -o Coverage/output.coverage
and my output is:
INFO 10:03:41,669 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:03:41,737 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 10:03:41,737 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 10:03:41,737 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 10:03:41,740 HelpFormatter - Program Args: -T DepthOfCoverage -omitBaseOutput -omitLocusTable -R ../../reference_genome/GRCh37.p13.genome.fa -I OG040.bam -L ../../reference_genome/exome.interval_list -o Coverage/output.coverage
INFO 10:03:41,775 HelpFormatter - Executing as [email protected] on Linux 4.4.0-22-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14.
INFO 10:03:41,775 HelpFormatter - Date/Time: 2016/05/19 10:03:41
INFO 10:03:41,775 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:03:41,775 HelpFormatter - --------------------------------------------------------------------------------
INFO 10:03:42,572 GenomeAnalysisEngine - Strictness is SILENT
INFO 10:03:43,062 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 10:03:43,066 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 10:03:43,143 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.06
INFO 10:03:43,178 IntervalUtils - Processing 18624 bp from intervals
INFO 10:03:43,626 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 10:03:43,688 GenomeAnalysisEngine - Done preparing for traversal
INFO 10:03:43,689 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 10:03:43,689 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 10:03:43,689 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 10:03:43,689 DepthOfCoverage - Per-Locus Depth of Coverage output was omitted
INFO 10:04:10,273 DepthOfCoverage - Printing summary info
INFO 10:04:10,282 ProgressMeter - done 45657.0 26.0 s 9.7 m 99.9% 26.0 s 0.0 s
INFO 10:04:10,282 ProgressMeter - Total runtime 26.59 secs, 0.44 min, 0.01 hours
INFO 10:04:10,285 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 482973 total reads (0.00%)
INFO 10:04:10,285 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 10:04:10,286 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 10:04:10,286 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 10:04:10,286 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 10:04:10,286 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 10:04:10,286 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 10:04:12,810 GATKRunReport - Uploaded run statistics report to AWS S3
But I don't know whether this process gives me the file in the correct format. Is it normal that every filter reports "0 reads (0.00% of total)"? How can I check that my output is in the correct format? Last question: is the reference genome that I used (GRCh37.p13) correct?
Thanks for the answer!
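One rough way to approach the "correct format" question, independent of GATK or ExomeCNV, is to count the fields on each line of the summary file, since an error like "line 1 doesn't have 15 elements" usually points at a line whose field count differs from what the reader expects. The sketch below is only that: it assumes the file is tab-delimited (an assumption; pass another delimiter if yours differs), and the function names `field_counts` and `first_mismatch` are made up for illustration. It does not know what layout ExomeCNV itself expects.

```python
from collections import Counter


def split_fields(line, delimiter="\t"):
    """Split one line into fields after dropping the trailing newline."""
    return line.rstrip("\r\n").split(delimiter)


def field_counts(path, delimiter="\t"):
    """Map each field count to the number of lines that have that many fields."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            counts[len(split_fields(line, delimiter))] += 1
    return counts


def first_mismatch(path, delimiter="\t"):
    """Find the first line whose field count differs from the header line.

    Returns (line_number, fields_on_that_line, fields_in_header),
    or None if every line matches the header or the file is empty.
    """
    with open(path) as handle:
        header = handle.readline()
        if not header:
            return None
        expected = len(split_fields(header, delimiter))
        for line_number, line in enumerate(handle, start=2):
            found = len(split_fields(line, delimiter))
            if found != expected:
                return line_number, found, expected
    return None
```

For the run above, the file to inspect would be something like `Coverage/output.coverage.sample_interval_summary`. If `field_counts` reports a single field count and `first_mismatch` returns `None`, the file is at least internally consistent, and the mismatch is more likely in how the R side reads it.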
@Pandora
Hi,
Yes, your command looks fine, and the output (no reads filtered) is fine. However, did you mark duplicates? The reference you used should be fine as long as you aligned your reads to it. (I'm guessing it is also fine because you did not get an error.)
I suspect the issue you are having is with the next step, the R command.
-Sheila
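On the reference question, a quick consistency check (not a substitute for confirming which reference the reads were actually aligned to) is to compare the contig names used in the interval list with those in the reference's `.fai` index, for example `13` versus `chr13`. The sketch below assumes a `.fai` file sits next to the FASTA and that the interval list is either Picard-style (tab-separated, with `@` header lines) or GATK-style (`contig:start-end`). The function names are hypothetical.

```python
def fai_contigs(fai_path):
    """Collect contig names from a FASTA index (.fai); the name is column 1."""
    with open(fai_path) as handle:
        return {line.rstrip("\r\n").split("\t", 1)[0] for line in handle if line.strip()}


def interval_contigs(interval_path):
    """Collect contig names from an interval list.

    Lines starting with '@' are treated as header lines and skipped.
    Tab-separated lines take the first column as the contig; other lines
    are read as 'contig:start-end' (or a bare contig name).
    """
    contigs = set()
    with open(interval_path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("@"):
                continue
            if "\t" in line:
                contigs.add(line.split("\t", 1)[0])
            else:
                contigs.add(line.rsplit(":", 1)[0])
    return contigs


def missing_contigs(fai_path, interval_path):
    """Return interval-list contigs that are absent from the reference index."""
    return sorted(interval_contigs(interval_path) - fai_contigs(fai_path))
```

An empty list from `missing_contigs` only says the names line up. It says nothing about whether the coordinates belong to the same genome build.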
@Pandora
Hi again,
You should look into using the GATK tools for CNV! Here is a start
-Sheila
I think @Sheila means http://gatkforums.broadinstitute.org/gatk/categories/gatk-4-alpha (ReCapSeg is the old, pre-GATK version of the CNV tools).