GATK (2.7.4) - RealignerTargetCreator error in MicroScheduler

I need help with the following error. Below is my command line:

java -Xmx4g -jar /ssd-sdb1/sandeep/GATK-2-7-4/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar \
    -T RealignerTargetCreator \
    -R /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/ucsc.hg19.fasta \
    -I /ssd-sdb1/sandeep/work/data/HT021_Normal_exome_ACAGTG.chr1.sorted.bam \
    -L /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed \
    -o /ssd-sdb1/sandeep/work/data/forIndelRealigner.chr1.intervals \
    -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf \
    -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf

Below is the output...

GATK RealignerTargetCreator begin: Wed Oct 30 07:36:06 MST 2013
INFO 07:36:08,505 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed to be BED
INFO 07:36:08,527 HelpFormatter - --------------------------------------------------------------------------------
INFO 07:36:08,527 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
INFO 07:36:08,527 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 07:36:08,527 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 07:36:08,531 HelpFormatter - Program Args: -T RealignerTargetCreator -R /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/ucsc.hg19.fasta -I /ssd-sdb1/sandeep/work/data/HT021_Normal_exome_ACAGTG.chr1.sorted.bam -L /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed -o /ssd-sdb1/sandeep/work/data/forIndelRealigner.chr1.intervals -known -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
INFO 07:36:08,531 HelpFormatter - Date/Time: 2013/10/30 07:36:08
INFO 07:36:08,531 HelpFormatter - --------------------------------------------------------------------------------
INFO 07:36:08,531 HelpFormatter - --------------------------------------------------------------------------------
INFO 07:36:08,535 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf to be VCF
INFO 07:36:08,931 GenomeAnalysisEngine - Strictness is SILENT
INFO 07:36:09,085 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 07:36:09,092 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 07:36:09,106 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 07:36:09,119 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
INFO 07:36:09,179 IntervalUtils - Processing 249250620 bp from intervals
INFO 07:36:09,228 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 07:36:09,570 GenomeAnalysisEngine - Done preparing for traversal
INFO 07:36:09,570 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 07:36:09,570 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 07:36:39,574 ProgressMeter - chr1:23198461 2.32e+07 30.0 s 1.0 s 9.3% 5.4 m 4.9 m
INFO 07:37:09,576 ProgressMeter - chr1:46723985 4.67e+07 60.0 s 1.0 s 18.7% 5.3 m 4.3 m
INFO 07:37:39,577 ProgressMeter - chr1:77397933 7.74e+07 90.0 s 1.0 s 31.1% 4.8 m 3.3 m
INFO 07:38:09,579 ProgressMeter - chr1:110278721 1.10e+08 120.0 s 1.0 s 44.2% 4.5 m 2.5 m
INFO 07:38:39,581 ProgressMeter - chr1:152676713 1.53e+08 2.5 m 0.0 s 61.3% 4.1 m 94.0 s
INFO 07:39:09,582 ProgressMeter - chr1:178887513 1.79e+08 3.0 m 1.0 s 71.8% 4.2 m 70.0 s
INFO 07:39:39,583 ProgressMeter - chr1:208523469 2.09e+08 3.5 m 1.0 s 83.7% 4.2 m 41.0 s
INFO 07:40:09,585 ProgressMeter - chr1:237878013 2.38e+08 4.0 m 1.0 s 95.4% 4.2 m 11.0 s
INFO 07:40:21,523 ProgressMeter - done 2.49e+08 4.2 m 1.0 s 100.0% 4.2 m 0.0 s
INFO 07:40:21,523 ProgressMeter - Total runtime 251.95 secs, 4.20 min, 0.07 hours
INFO 07:40:21,524 MicroScheduler - 205965 reads were filtered out during the traversal out of approximately 7054051 total reads (2.92%)
INFO 07:40:21,524 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 07:40:21,524 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
INFO 07:40:21,524 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
INFO 07:40:21,524 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 07:40:21,525 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 07:40:21,525 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 07:40:21,525 MicroScheduler - -> 205965 reads (2.92% of total) failing MappingQualityZeroFilter
INFO 07:40:21,525 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 07:40:21,525 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter
INFO 07:40:21,525 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
INFO 07:40:43,076 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
INFO 07:40:43,077 HttpMethodDirector - Retrying request
GATK RealignerTargetCreator end: Wed Oct 30 07:40:52 MST 2013

Thanks

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Hi there,

    It looks like your analysis itself completed successfully, but the Phone Home remote logging failed. This is not a problem for your work, but it can be annoying. If you are often running GATK on a machine that does not have internet access, you can request a key to disable Phone Home. This is explained in the FAQ section of our documentation.
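    One rough way to tell a finished run apart from a failed Phone Home connection is to look for the two kinds of lines separately. The sketch below assumes the log was saved to a file; the function name `check_gatk_log` and the wording of its messages are made up for illustration, and the patterns are simply copied from the log lines quoted in this thread.

```shell
#!/bin/sh
# Sketch: classify a saved GATK log using patterns taken from this thread.
# check_gatk_log is a hypothetical helper, not part of GATK.
check_gatk_log() {
    log_file=$1

    # The ProgressMeter prints a "done" line when the traversal finishes.
    if grep -q 'ProgressMeter - done' "$log_file"; then
        echo "traversal: completed"
    else
        echo "traversal: no completion line found"
    fi

    # The Phone Home failure shows up as an HttpMethodDirector I/O exception.
    if grep -q 'HttpMethodDirector - I/O exception' "$log_file"; then
        echo "phone home: connection failed"
    else
        echo "phone home: no connection errors logged"
    fi
}

# Only runs when a log file is given on the command line.
if [ "$#" -gt 0 ]; then
    check_gatk_log "$1"
fi
```

    Saved as a script, it would be called with the log file as its only argument. A "completed" traversal together with a failed Phone Home connection is the situation described above.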

  • SANDEEPGUPTA (CHANDLER; Member)

    Thanks for providing the key file. I am still running into the same error, and I am not sure why.

    GATK RealignerTargetCreator begin: Thu Oct 31 04:02:33 MST 2013
    INFO 04:02:35,547 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed to be BED
    INFO 04:02:35,569 HelpFormatter - --------------------------------------------------------------------------------
    INFO 04:02:35,569 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
    INFO 04:02:35,569 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 04:02:35,569 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 04:02:35,572 HelpFormatter - Program Args: -T RealignerTargetCreator -K Sandeep.r.gupta_Intel.com.key -R /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/ucsc.hg19.fasta -I /ssd-sdb1/sandeep/work/data/HT021_Normal_exome_ACAGTG.chr1.sorted.bam -L /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed -o /ssd-sdb1/sandeep/work/data/forIndelRealigner.chr1.intervals -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 04:02:35,573 HelpFormatter - Date/Time: 2013/10/31 04:02:35
    INFO 04:02:35,573 HelpFormatter - --------------------------------------------------------------------------------
    INFO 04:02:35,573 HelpFormatter - --------------------------------------------------------------------------------
    INFO 04:02:35,577 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 04:02:35,578 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 04:02:36,014 GenomeAnalysisEngine - Strictness is SILENT
    INFO 04:02:36,175 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 04:02:36,182 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 04:02:36,197 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
    INFO 04:02:36,214 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 04:02:36,287 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 04:02:36,310 IntervalUtils - Processing 249250620 bp from intervals
    INFO 04:02:36,364 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
    INFO 04:02:36,799 GenomeAnalysisEngine - Done preparing for traversal
    INFO 04:02:36,799 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 04:02:36,799 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 04:03:06,803 ProgressMeter - chr1:21030873 2.10e+07 30.0 s 1.0 s 8.4% 5.9 m 5.4 m
    INFO 04:03:36,805 ProgressMeter - chr1:42739173 4.27e+07 60.0 s 1.0 s 17.1% 5.8 m 4.8 m
    INFO 04:04:06,806 ProgressMeter - chr1:67347841 6.73e+07 90.0 s 1.0 s 27.0% 5.6 m 4.1 m
    INFO 04:04:36,807 ProgressMeter - chr1:95744113 9.57e+07 120.0 s 1.0 s 38.4% 5.2 m 3.2 m
    INFO 04:05:06,808 ProgressMeter - chr1:124293761 1.21e+08 2.5 m 1.0 s 49.9% 5.0 m 2.5 m
    INFO 04:05:36,809 ProgressMeter - chr1:158297825 1.58e+08 3.0 m 1.0 s 63.5% 4.7 m 103.0 s
    INFO 04:06:06,810 ProgressMeter - chr1:183208489 1.83e+08 3.5 m 1.0 s 73.5% 4.8 m 75.0 s
    INFO 04:06:36,811 ProgressMeter - chr1:209771653 2.10e+08 4.0 m 1.0 s 84.2% 4.8 m 45.0 s
    INFO 04:07:16,812 ProgressMeter - chr1:244627621 2.45e+08 4.7 m 1.0 s 98.1% 4.8 m 5.0 s
    INFO 04:07:23,580 ProgressMeter - done 2.49e+08 4.8 m 1.0 s 100.0% 4.8 m 0.0 s
    INFO 04:07:23,581 ProgressMeter - Total runtime 286.78 secs, 4.78 min, 0.08 hours
    INFO 04:07:23,581 MicroScheduler - 205965 reads were filtered out during the traversal out of approximately 7054051 total reads (2.92%)
    INFO 04:07:23,581 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
    INFO 04:07:23,581 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
    INFO 04:07:23,581 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
    INFO 04:07:23,582 MicroScheduler - -> 205965 reads (2.92% of total) failing MappingQualityZeroFilter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter
    INFO 04:07:23,582 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
    INFO 04:07:48,180 HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
    INFO 04:07:48,180 HttpMethodDirector - Retrying request
    GATK RealignerTargetCreator end: Thu Oct 31 04:07:54 MST 2013

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    You have to also add --phone_home NO_ET to actually deactivate remote logging. Otherwise you're just passing in a key but not instructing GATK to actually use it.
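    Put together, the call would look roughly like the sketch below. It only prints the command so it can be checked before running. The file names are shortened placeholders rather than the real paths from this thread, `print_gatk_cmd` is a hypothetical helper, and `-et` is the short form of `--phone_home`.

```shell
#!/bin/sh
# Sketch: print the RealignerTargetCreator command with Phone Home disabled.
# -et NO_ET turns remote logging off; -K supplies the key file.
# All file names are shortened placeholders, not real paths.
print_gatk_cmd() {
    gatk_jar=$1   # path to GenomeAnalysisTK.jar
    gatk_key=$2   # path to the key file you were issued
    echo java -Xmx4g -jar "$gatk_jar" \
        -T RealignerTargetCreator \
        -et NO_ET \
        -K "$gatk_key" \
        -R ucsc.hg19.fasta \
        -I sample.chr1.sorted.bam \
        -L hg19.chr1.bed \
        -o forIndelRealigner.chr1.intervals \
        -known Mills_and_1000G_gold_standard.indels.hg19.vcf \
        -known 1000G_phase1.indels.hg19.vcf
}

# Inspect the command first; remove the "echo" in the function to actually run it.
print_gatk_cmd GenomeAnalysisTK.jar my_gatk.key
```

    The point is that both arguments appear together: the key on its own changes nothing, and `NO_ET` is what actually switches the logging off.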

  • SANDEEPGUPTA (CHANDLER; Member)

    Sorry for the inconvenience, but I am still running into the issue. I tried two options, -et NO_ET and --phone_home NO_ET, and the outputs are below.

    GATK RealignerTargetCreator begin: Thu Oct 31 05:45:25 MST 2013
    INFO 05:45:27,173 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed to be BED
    INFO 05:45:27,196 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:45:27,196 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
    INFO 05:45:27,196 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 05:45:27,196 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 05:45:27,200 HelpFormatter - Program Args: -T RealignerTargetCreator -et NO_ET -K /ssd-sdb1/sandeep/GATK-2-7-4/GenomeAnalysisTK-2.7-4-g6f46d11/Sandeep.r.gupta_Intel.com.key -R /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/ucsc.hg19.fasta -I /ssd-sdb1/sandeep/work/data/HT021_Normal_exome_ACAGTG.chr1.sorted.bam -L /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed -o /ssd-sdb1/sandeep/work/data/forIndelRealigner.chr1.intervals -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 05:45:27,200 HelpFormatter - Date/Time: 2013/10/31 05:45:27
    INFO 05:45:27,200 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:45:27,200 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:45:27,209 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 05:45:27,210 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 05:45:27,603 GenomeAnalysisEngine - Strictness is SILENT
    INFO 05:45:27,768 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 05:45:27,774 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 05:45:27,787 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
    INFO 05:45:27,801 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 05:45:27,860 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 05:45:27,879 IntervalUtils - Processing 249250620 bp from intervals
    INFO 05:45:27,928 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
    INFO 05:45:28,267 GenomeAnalysisEngine - Done preparing for traversal
    INFO 05:45:28,267 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 05:45:28,267 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 05:45:58,274 ProgressMeter - chr1:21151661 2.11e+07 30.0 s 1.0 s 8.5% 5.9 m 5.4 m
    INFO 05:46:28,276 ProgressMeter - chr1:42991533 4.30e+07 60.0 s 1.0 s 17.2% 5.8 m 4.8 m
    INFO 05:46:58,278 ProgressMeter - chr1:67520265 6.75e+07 90.0 s 1.0 s 27.1% 5.5 m 4.0 m
    INFO 05:47:28,280 ProgressMeter - chr1:95827633 9.58e+07 120.0 s 1.0 s 38.4% 5.2 m 3.2 m
    INFO 05:47:58,281 ProgressMeter - chr1:124192261 1.21e+08 2.5 m 1.0 s 49.8% 5.0 m 2.5 m
    INFO 05:48:28,282 ProgressMeter - chr1:158151969 1.58e+08 3.0 m 1.0 s 63.5% 4.7 m 103.0 s
    INFO 05:48:58,283 ProgressMeter - chr1:182852341 1.83e+08 3.5 m 1.0 s 73.4% 4.8 m 76.0 s
    INFO 05:49:28,285 ProgressMeter - chr1:208974137 2.09e+08 4.0 m 1.0 s 83.8% 4.8 m 46.0 s
    INFO 05:49:58,286 ProgressMeter - chr1:234766253 2.35e+08 4.5 m 1.0 s 94.2% 4.8 m 16.0 s
    INFO 05:50:16,269 ProgressMeter - done 2.49e+08 4.8 m 1.0 s 100.0% 4.8 m 0.0 s
    INFO 05:50:16,269 ProgressMeter - Total runtime 288.00 secs, 4.80 min, 0.08 hours
    INFO 05:50:16,270 MicroScheduler - 205965 reads were filtered out during the traversal out of approximately 7054051 total reads (2.92%)
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
    INFO 05:50:16,270 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
    INFO 05:50:16,271 MicroScheduler - -> 205965 reads (2.92% of total) failing MappingQualityZeroFilter
    INFO 05:50:16,271 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
    INFO 05:50:16,271 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter
    INFO 05:50:16,271 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
    GATK RealignerTargetCreator end: Thu Oct 31 05:50:16 MST 2013

    GATK RealignerTargetCreator begin: Thu Oct 31 05:35:53 MST 2013
    INFO 05:35:55,944 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed to be BED
    INFO 05:35:55,966 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:35:55,967 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
    INFO 05:35:55,967 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 05:35:55,967 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 05:35:55,970 HelpFormatter - Program Args: -T RealignerTargetCreator --phone_home NO_ET -K /ssd-sdb1/sandeep/GATK-2-7-4/GenomeAnalysisTK-2.7-4-g6f46d11/Sandeep.r.gupta_Intel.com.key -R /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/ucsc.hg19.fasta -I /ssd-sdb1/sandeep/work/data/HT021_Normal_exome_ACAGTG.chr1.sorted.bam -L /ssd-sdb1/sandeep/work/scripts/hg19.chr1.bed -o /ssd-sdb1/sandeep/work/data/forIndelRealigner.chr1.intervals -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf -known /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 05:35:55,970 HelpFormatter - Date/Time: 2013/10/31 05:35:55
    INFO 05:35:55,970 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:35:55,970 HelpFormatter - --------------------------------------------------------------------------------
    INFO 05:35:55,979 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 05:35:55,980 ArgumentTypeDescriptor - Dynamically determined type of /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 05:35:56,376 GenomeAnalysisEngine - Strictness is SILENT
    INFO 05:35:56,529 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 05:35:56,536 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 05:35:56,550 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
    INFO 05:35:56,564 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 05:35:56,620 RMDTrackBuilder - Loading Tribble index from disk for file /ssd-sdb1/sandeep/ScrippsData/reference_genome/hg19/1000G_phase1.indels.hg19.vcf
    INFO 05:35:56,639 IntervalUtils - Processing 249250620 bp from intervals
    INFO 05:35:56,689 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
    INFO 05:35:57,028 GenomeAnalysisEngine - Done preparing for traversal
    INFO 05:35:57,028 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 05:35:57,028 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 05:36:27,033 ProgressMeter - chr1:20945653 2.09e+07 30.0 s 1.0 s 8.4% 5.9 m 5.4 m
    INFO 05:36:57,034 ProgressMeter - chr1:42351457 4.23e+07 60.0 s 1.0 s 17.0% 5.9 m 4.9 m
    INFO 05:37:27,035 ProgressMeter - chr1:66820553 6.68e+07 90.0 s 1.0 s 26.8% 5.6 m 4.1 m
    INFO 05:37:57,036 ProgressMeter - chr1:94930197 9.49e+07 120.0 s 1.0 s 38.1% 5.3 m 3.3 m
    INFO 05:38:37,037 ProgressMeter - chr1:142537161 1.21e+08 2.7 m 1.0 s 57.2% 4.7 m 119.0 s
    INFO 05:39:07,038 ProgressMeter - chr1:165216173 1.65e+08 3.2 m 1.0 s 66.3% 4.8 m 96.0 s
    INFO 05:39:47,039 ProgressMeter - chr1:201282441 2.01e+08 3.8 m 1.0 s 80.8% 4.7 m 54.0 s
    INFO 05:40:27,040 ProgressMeter - chr1:234550461 2.35e+08 4.5 m 1.0 s 94.1% 4.8 m 16.0 s
    INFO 05:40:45,183 ProgressMeter - done 2.49e+08 4.8 m 1.0 s 100.0% 4.8 m 0.0 s
    INFO 05:40:45,184 ProgressMeter - Total runtime 288.16 secs, 4.80 min, 0.08 hours
    INFO 05:40:45,184 MicroScheduler - 205965 reads were filtered out during the traversal out of approximately 7054051 total reads (2.92%)
    INFO 05:40:45,184 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
    INFO 05:40:45,184 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
    INFO 05:40:45,184 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
    INFO 05:40:45,184 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
    INFO 05:40:45,185 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
    INFO 05:40:45,185 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
    INFO 05:40:45,185 MicroScheduler - -> 205965 reads (2.92% of total) failing MappingQualityZeroFilter
    INFO 05:40:45,185 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
    INFO 05:40:45,185 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter
    INFO 05:40:45,185 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
    GATK RealignerTargetCreator end: Thu Oct 31 05:40:45 MST 2013

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    What is the issue exactly? It looks like the runs are completing successfully.

  • SANDEEPGUPTA (CHANDLER; Member)

    The issue is the lines about reads failing BadCigarFilter, BadMateFilter, etc. I am not sure whether that is normal or to be expected.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Oh that's completely normal; there's always going to be a subset of sequence reads that are messed up for some reason, and can't be used informatively. GATK filters them out to avoid crashing on them. If you see a high proportion of reads failing filters, you should run some QC to find out where the issues come from; but a few percent like what you're seeing is completely fine and no reason for alarm.
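    To keep an eye on that proportion across many runs, the summary line can be pulled out of a saved log and the percentage recomputed. This is only a sketch: `filtered_pct` is a hypothetical helper, and it assumes the MicroScheduler summary line has the same wording as in the logs posted above.

```shell
#!/bin/sh
# Sketch: recompute the filtered-read percentage from a saved GATK log.
# filtered_pct is a hypothetical helper; it relies on the wording of the
# "reads were filtered out" summary line shown in this thread.
filtered_pct() {
    awk '/MicroScheduler - [0-9]+ reads were filtered out/ {
        for (i = 1; i <= NF; i++) {
            # The filtered count follows the "-" separator.
            if ($i == "-" && $(i + 2) == "reads") filtered = $(i + 1)
            # The total count follows the word "approximately".
            if ($i == "approximately") total = $(i + 1)
        }
        if (total > 0) printf "%.2f\n", 100 * filtered / total
    }' "$1"
}

# Only runs when a log file is given on the command line.
if [ "$#" -gt 0 ]; then
    filtered_pct "$1"
    # List the filters that actually removed reads (non-zero counts only).
    grep -E 'MicroScheduler - -> [1-9][0-9]* reads' "$1"
fi
```

    On the logs in this thread it reports 2.92, matching the figure GATK prints, and the only filter with a non-zero count is MappingQualityZeroFilter.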

  • SANDEEPGUPTA (CHANDLER; Member)

    Thank you so much. I appreciate your help!
