
HaplotypeCaller/UnifiedGenotyper produce no output

biscuit13161 Posts: 13 Member

Hi,
I'm using version 2.5-2 of GATK and until recently I was using version 2.3 of the Bundle.

A few weeks ago I ran UnifiedGenotyper on WGS data (converted to BAM from Complete Genomics and processed with GATK) and got my list of variants. For that run I used hg19 downloaded from UCSC [with chrM at the end] and modified the Bundle dbsnp VCFs to match. Earlier this week I needed to go back and re-run it, so I decided to use v2.5 of the Bundle and HaplotypeCaller.
When I didn't get any output (see below), I tried again with UnifiedGenotyper.
The command I used for UnifiedGenotyper is:

java -Xmx24g -Xms24g -jar ~/apps/gatk/GenomeAnalysisTK.jar -I 14972_realigned.bam -I 14973_realigned.bam -R ~/gatk/ucsc.hg19.fasta -T UnifiedGenotyper -o twins.UG.vcf -nt 12 -stand_emit_conf 10.0 -stand_call_conf 30.0 --dbsnp ~/gatk/dbsnp_137.hg19.vcf -glm BOTH -dcov 200 > twins_UG.log 2>&1

Using either HaplotypeCaller or UnifiedGenotyper, with either ucsc.hg19.fasta or hg19 [with chrM at the beginning, modified to match the VCFs], I was unable to get any output. To clarify: the output VCF file was produced and contains the full header but no variants, and the log file looks like this:

INFO 06:34:41,404 HelpFormatter - --------------------------------------------------------------------------------
INFO 06:34:41,411 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 06:34:41,411 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 06:34:41,411 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 06:34:41,414 HelpFormatter - Program Args: -I 14972_realigned.bam -I 14973_realigned.bam -R /Users/thompsoni/gatk/ucsc.hg19.fasta -T UnifiedGenotyper -o twins.UG.vcf -nt 12 -stand_emit_conf 10.0 -stand_call_conf 30.0 --dbsnp /Users/thompsoni/gatk/dbsnp_137.hg19.vcf -glm BOTH -dcov 200
INFO 06:34:41,415 HelpFormatter - Date/Time: 2013/05/20 06:34:41
INFO 06:34:41,415 HelpFormatter - --------------------------------------------------------------------------------
INFO 06:34:41,415 HelpFormatter - --------------------------------------------------------------------------------
INFO 06:34:41,448 ArgumentTypeDescriptor - Dynamically determined type of /Users/thompsoni/gatk/dbsnp_137.hg19.vcf to be VCF
INFO 06:34:41,580 GenomeAnalysisEngine - Strictness is SILENT
INFO 06:34:41,660 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 200
INFO 06:34:41,666 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:34:41,688 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 06:34:41,699 RMDTrackBuilder - Creating Tribble index in memory for file /Users/thompsoni/gatk/dbsnp_137.hg19.vcf
INFO 06:38:01,508 RMDTrackBuilder - Writing Tribble index to disk for file /Users/thompsoni/gatk/dbsnp_137.hg19.vcf.idx
INFO 06:38:20,936 MicroScheduler - Running the GATK in parallel mode with 12 total threads, 1 CPU thread(s) for each of 12 data thread(s), of 24 processors available on this machine
INFO 06:38:21,082 GenomeAnalysisEngine - Creating shard strategy for 2 BAM files
INFO 06:38:21,708 GenomeAnalysisEngine - Done creating shard strategy
INFO 06:38:21,708 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 06:38:21,708 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 06:38:21,832 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,842 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 06:38:21,843 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,877 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 06:38:21,878 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,889 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 06:38:21,890 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,908 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 06:38:21,909 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,923 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 06:38:21,924 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,939 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 06:38:21,940 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:21,957 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 06:38:21,976 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:22,092 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.12
INFO 06:38:22,118 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:22,242 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.12
INFO 06:38:22,261 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:22,433 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.17
INFO 06:38:22,611 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 06:38:22,704 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.09
INFO 06:38:51,715 ProgressMeter - Starting 0.00e+00 30.0 s 49.6 w 100.0% 30.0 s 0.0 s
INFO 06:39:21,719 ProgressMeter - Starting 0.00e+00 60.0 s 99.2 w 100.0% 60.0 s 0.0 s
INFO 06:39:51,724 ProgressMeter - Starting 0.00e+00 90.0 s 148.8 w 100.0% 90.0 s 0.0 s
INFO 06:40:21,728 ProgressMeter - Starting 0.00e+00 120.0 s 198.4 w 100.0% 120.0 s 0.0 s
INFO 06:40:51,732 ProgressMeter - Starting 0.00e+00 2.5 m 248.1 w 100.0% 2.5 m 0.0 s
INFO 06:41:21,735 ProgressMeter - Starting 0.00e+00 3.0 m 297.7 w 100.0% 3.0 m 0.0 s
INFO 06:41:51,739 ProgressMeter - Starting 0.00e+00 3.5 m 347.3 w 100.0% 3.5 m 0.0 s
INFO 06:42:21,743 ProgressMeter - Starting 0.00e+00 4.0 m 396.9 w 100.0% 4.0 m 0.0 s
INFO 06:42:51,748 ProgressMeter - Starting 0.00e+00 4.5 m 446.5 w 100.0% 4.5 m 0.0 s
INFO 06:43:21,751 ProgressMeter - Starting 0.00e+00 5.0 m 496.1 w 100.0% 5.0 m 0.0 s
INFO 06:43:51,755 ProgressMeter - Starting 0.00e+00 5.5 m 545.7 w 100.0% 5.5 m 0.0 s
INFO 06:44:21,759 ProgressMeter - Starting 0.00e+00 6.0 m 595.3 w 100.0% 6.0 m 0.0 s
INFO 06:44:51,763 ProgressMeter - Starting 0.00e+00 6.5 m 644.9 w 100.0% 6.5 m 0.0 s
INFO 06:45:21,768 ProgressMeter - Starting 0.00e+00 7.0 m 694.5 w 100.0% 7.0 m 0.0 s
INFO 06:45:51,772 ProgressMeter - Starting 0.00e+00 7.5 m 744.2 w 100.0% 7.5 m 0.0 s
INFO 06:46:21,776 ProgressMeter - Starting 0.00e+00 8.0 m 793.8 w 100.0% 8.0 m 0.0 s
INFO 06:46:51,783 ProgressMeter - Starting 0.00e+00 8.5 m 843.4 w 100.0% 8.5 m 0.0 s
INFO 06:47:21,787 ProgressMeter - Starting 0.00e+00 9.0 m 893.0 w 100.0% 9.0 m 0.0 s
INFO 06:47:51,794 ProgressMeter - Starting 0.00e+00 9.5 m 942.6 w 100.0% 9.5 m 0.0 s
INFO 06:48:21,797 ProgressMeter - Starting 0.00e+00 10.0 m 992.2 w 100.0% 10.0 m 0.0 s
INFO 06:48:51,802 ProgressMeter - Starting 0.00e+00 10.5 m 1041.8 w 100.0% 10.5 m 0.0 s
INFO 06:49:21,806 ProgressMeter - Starting 0.00e+00 11.0 m 1091.4 w 100.0% 11.0 m 0.0 s
INFO 06:49:51,810 ProgressMeter - Starting 0.00e+00 11.5 m 1141.0 w 100.0% 11.5 m 0.0 s
INFO 06:50:21,813 ProgressMeter - Starting 0.00e+00 12.0 m 1190.7 w 100.0% 12.0 m 0.0 s
INFO 06:50:51,818 ProgressMeter - Starting 0.00e+00 12.5 m 1240.3 w 100.0% 12.5 m 0.0 s
INFO 06:51:21,822 ProgressMeter - Starting 0.00e+00 13.0 m 1289.9 w 100.0% 13.0 m 0.0 s
INFO 06:51:51,826 ProgressMeter - Starting 0.00e+00 13.5 m 1339.5 w 100.0% 13.5 m 0.0 s
INFO 06:52:21,831 ProgressMeter - Starting 0.00e+00 14.0 m 1389.1 w 100.0% 14.0 m 0.0 s
INFO 06:52:51,835 ProgressMeter - Starting 0.00e+00 14.5 m 1438.7 w 100.0% 14.5 m 0.0 s
INFO 06:53:21,839 ProgressMeter - Starting 0.00e+00 15.0 m 1488.3 w 100.0% 15.0 m 0.0 s
INFO 06:53:51,843 ProgressMeter - Starting 0.00e+00 15.5 m 1537.9 w 100.0% 15.5 m 0.0 s
INFO 06:54:21,857 ProgressMeter - Starting 0.00e+00 16.0 m 1587.5 w 100.0% 16.0 m 0.0 s
INFO 06:54:51,860 ProgressMeter - Starting 0.00e+00 16.5 m 1637.2 w 100.0% 16.5 m 0.0 s
INFO 06:55:21,865 ProgressMeter - Starting 0.00e+00 17.0 m 1686.8 w 100.0% 17.0 m 0.0 s
INFO 06:55:51,869 ProgressMeter - Starting 0.00e+00 17.5 m 1736.4 w 100.0% 17.5 m 0.0 s
INFO 06:56:21,873 ProgressMeter - Starting 0.00e+00 18.0 m 1786.0 w 100.0% 18.0 m 0.0 s
INFO 06:56:51,877 ProgressMeter - Starting 0.00e+00 18.5 m 1835.6 w 100.0% 18.5 m 0.0 s
INFO 06:57:21,882 ProgressMeter - Starting 0.00e+00 19.0 m 1885.2 w 100.0% 19.0 m 0.0 s
INFO 06:57:51,886 ProgressMeter - Starting 0.00e+00 19.5 m 1934.8 w 100.0% 19.5 m 0.0 s
INFO 06:58:21,890 ProgressMeter - Starting 0.00e+00 20.0 m 1984.4 w 100.0% 20.0 m 0.0 s
INFO 06:58:51,894 ProgressMeter - Starting 0.00e+00 20.5 m 2034.0 w 100.0% 20.5 m 0.0 s
INFO 06:59:21,898 ProgressMeter - Starting 0.00e+00 21.0 m 2083.6 w 100.0% 21.0 m 0.0 s
INFO 06:59:51,902 ProgressMeter - Starting 0.00e+00 21.5 m 2133.3 w 100.0% 21.5 m 0.0 s
INFO 07:00:21,906 ProgressMeter - Starting 0.00e+00 22.0 m 2182.9 w 100.0% 22.0 m 0.0 s
INFO 07:00:51,910 ProgressMeter - Starting 0.00e+00 22.5 m 2232.5 w 100.0% 22.5 m 0.0 s
INFO 07:01:21,913 ProgressMeter - Starting 0.00e+00 23.0 m 2282.1 w 100.0% 23.0 m 0.0 s
INFO 07:01:51,917 ProgressMeter - Starting 0.00e+00 23.5 m 2331.7 w 100.0% 23.5 m 0.0 s
INFO 07:02:21,921 ProgressMeter - Starting 0.00e+00 24.0 m 2381.3 w 100.0% 24.0 m 0.0 s
INFO 07:02:51,929 ProgressMeter - Starting 0.00e+00 24.5 m 2430.9 w 100.0% 24.5 m 0.0 s
INFO 07:03:21,933 ProgressMeter - Starting 0.00e+00 25.0 m 2480.5 w 100.0% 25.0 m 0.0 s
INFO 07:03:51,937 ProgressMeter - Starting 0.00e+00 25.5 m 2530.1 w 100.0% 25.5 m 0.0 s
INFO 07:04:21,941 ProgressMeter - Starting 0.00e+00 26.0 m 2579.7 w 100.0% 26.0 m 0.0 s
INFO 07:04:51,945 ProgressMeter - Starting 0.00e+00 26.5 m 2629.4 w 100.0% 26.5 m 0.0 s
INFO 07:05:15,341 ProgressMeter - done 0.00e+00 26.9 m 2668.0 w 100.0% 26.9 m 0.0 s
INFO 07:05:15,341 ProgressMeter - Total runtime 1613.63 secs, 26.89 min, 0.45 hours
INFO 07:05:15,429 MicroScheduler - 1148 reads were filtered out during traversal out of 561526 total (0.20%)
INFO 07:05:15,429 MicroScheduler - -> 1148 reads (0.20% of total) failing DuplicateReadFilter
INFO 07:05:17,930 GATKRunReport - Uploaded run statistics report to AWS S3

Whilst I have managed to rerun UnifiedGenotyper with my unmodified hg19 [chrM at the end] and modified dbsnp VCF, I am curious as to why this should be a problem, as I would prefer to use the VCFs and references without modifications.
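
In the log above, every ProgressMeter line reports the location as "Starting" (or "done") with 0.00e+00 processed sites, which is the symptom described: the run finishes without traversing any sites. A minimal sketch for spotting this in a saved log follows. It assumes the GATK 2.x log layout shown above; the function name and the regular expression are illustrative helpers, not part of GATK.

```python
import re

# Matches ProgressMeter data lines such as:
#   INFO 06:38:51,715 ProgressMeter - Starting 0.00e+00 30.0 s ...
# Group 1 is the reported location, group 2 the processed.sites value.
# Assumption: the column layout matches the log shown in this thread.
PROGRESS = re.compile(r"ProgressMeter - (\S+)\s+(\d+\.\d+e[+-]\d+)\s")


def max_sites_processed(log_text):
    """Return the largest processed.sites value in the log, or None if
    the log contains no ProgressMeter data lines."""
    best = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            value = float(match.group(2))
            best = value if best is None else max(best, value)
    return best
```

Calling `max_sites_processed(open("twins_UG.log").read())` on a log like the one above would return 0.0, whereas a run that actually traversed the genome would be expected to report a positive value.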

Answers

  • biscuit13161 Posts: 13 Member

    Apologies, the log file isn't very readable in my question, so I've attached it here.

    Attachment: twins_UG.log (11K)
  • Carneiro Posts: 274 Administrator, GATK Developer

    This is a clear artifact of the fact that you are moving the MT contig around in your data. GATK sees the MT contig and thinks it's done.

    You have to use a reference that matches the data. It is very difficult to move things around to make it work (you have to reindex both the reference and the bam -- this is assuming you made the changes correctly).

  • biscuit13161 Posts: 13 Member

    Sorry, allow me to clarify.

    The BAM input data has no chrM at all. This issue is occurring even when using the unmodified reference and unmodified VCF files from the GATK Bundle.
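
One way to check the advice above (that the reference must match the data) is to compare contig names and their order across the three inputs. A minimal sketch follows. It assumes you already have the text of the reference's .fai index, the BAM header (for example from `samtools view -H`), and the VCF header. The function names are hypothetical helpers, not part of GATK, and the VCF parser assumes the header carries `##contig` lines, which not every VCF does.

```python
import re


def contigs_from_fai(text):
    """Contig names in the order listed in a .fai index
    (tab-separated; the first column is the contig name)."""
    return [line.split("\t")[0] for line in text.splitlines() if line.strip()]


def contigs_from_sam_header(text):
    """Contig names from the @SQ lines of a SAM/BAM header, in order."""
    names = []
    for line in text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def contigs_from_vcf_header(text):
    """Contig names from ##contig=<ID=...> header lines, in order."""
    names = []
    for line in text.splitlines():
        if line.startswith("##contig="):
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                names.append(match.group(1))
    return names


def first_mismatch(a, b):
    """Return (index, name_in_a, name_in_b) for the first position where
    the two contig lists differ, or None if they are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return (i, x, y)
    if len(a) != len(b):
        i = min(len(a), len(b))
        return (i, a[i] if i < len(a) else None, b[i] if i < len(b) else None)
    return None
```

Running `first_mismatch` on the reference list against the BAM list, and again against the VCF list, would show whether chrM (or any other contig) sits in a different position or is missing from one of the files.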
