It looks like you're new here. If you want to get involved, click one of these buttons!
java -jar GenomeAnalysisTK.jar -R exampleFASTA.fasta -I exampleBAM.bam -T RealignerTargetCreator -o exampleTarget.intervals
I used the example files, and I got no error report, but in the output, there is no data, that is to say. The exampleTarget.intervals is empty.
Why is this happening?
And when I add --known dbsnp_135.hg19.vcf( that is in the bundle), it will get an error
Geraldine_VdAuwera
Posts: 2,239 admin
OK, I have good news -- you're not doing anything wrong. I checked the example file; the reason you're not getting any SNPs is that there aren't any in there. This example file was intended only to test the very simple walkers in the "first time GATK" tutorials; it's not appropriate for use with more advanced tools like the UnifiedGenotyper. I'm sorry that wasn't clear in the documentation, I'll try to correct that. To test the UG, you can use the file called NA12878.HiSeq.WGS.bwa.cleaned.recal.hg19.20.bam in the bundle. That contains a dataset for chromosome 20. You will get plenty of SNPs there.
Geraldine Van der Auwera, PhD
Answers
Regarding your first problem, it looks like exampleBAM.bam does not have any problem regions that need to be realigned...
For your second problem, that's because the example files are aligned to the b37 reference, but you're using the hg19 version of the dbsnp file. See the FAQs for why the b37/hg19 stuff is important.
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Thanks very much.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •I have got another problem for calling SNPs. I can't get SNPs in the example.vcf.
And it is all the same either with
-stand_call_conf 50.0 -stand_emit_conf 10.0 -dcov 1000or not.And I can't get the result either with my own data.
Thanks.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Can you tell me a little more about your problem? Specifically, I need to know what happens when you run the command. Is there no output file created, or is the file empty? Do you get any error messages?
By the way, we recently released version 2.2, so I recommend you upgrade -- the tools are significantly faster in the new version.
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Thanks for your answer. As for your questions, it creates the example.vcf file. And in the file it contains the annotation line. The final line in the example.vcf file is: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT exampleBAM But there is no error messages report to me. And I will upgrade the GATK, and have a try again.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •java -jar ~/GenomeAnalysisTK-2.1-5/GenomeAnalysisTK.jar -T UnifiedGenotyper -I exampleBAM.bam -R exampleFASTA.fasta --dbsnp exampleSNP.vcf -o example.vcf
INFO 21:58:41,035 HelpFormatter - -------------------------------------------------------------------------------- INFO 21:58:41,036 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.1-5-gf3daab0, Compiled 2012/08/27 17:19:33 INFO 21:58:41,036 HelpFormatter - Copyright (c) 2010 The Broad Institute INFO 21:58:41,036 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk INFO 21:58:41,036 HelpFormatter - Program Args: -T UnifiedGenotyper -I exampleBAM.bam -R exampleFASTA.fasta --dbsnp exampleSNP.vcf -o example.vcf INFO 21:58:41,037 HelpFormatter - Date/Time: 2012/11/04 21:58:41 INFO 21:58:41,037 HelpFormatter - -------------------------------------------------------------------------------- INFO 21:58:41,037 HelpFormatter - -------------------------------------------------------------------------------- INFO 21:58:41,054 ArgumentTypeDescriptor - Dynamically determined type of exampleSNP.vcf to be VCF INFO 21:58:41,061 GenomeAnalysisEngine - Strictness is SILENT INFO 21:58:41,130 SAMDataSource$SAMReaders - Initializing SAMRecords in serial WARNING: BAM index file /home/livefallfly/chenlong/GenomeAnalysisTK-2.1-5/resources/exampleBAM.bam.bai is older than BAM /home/livefallfly/chenlong/GenomeAnalysisTK-2.1-5/resources/exampleBAM.bam INFO 21:58:41,140 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01 INFO 21:58:41,148 RMDTrackBuilder - Loading Tribble index from disk for file exampleSNP.vcf WARN 21:58:41,234 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer INFO 21:58:41,581 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING] INFO 21:58:41,581 TraversalEngine - Location processed.sites runtime per.1M.sites completed total.runtime remaining INFO 21:58:42,666 TraversalEngine - Total runtime 1.17 secs, 0.02 min, 0.00 hours INFO 21:58:42,667 TraversalEngine - 0 reads were filtered out during traversal out of 33 total (0.00%) INFO 21:58:44,838 GATKRunReport - Uploaded run statistics report to AWS S3
This is the screen output with no error.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •I have tried the new version. But it is the same question. It created the example.vcf file with the annotation infomation. Below is the screen output
And next is the example.vcf
And no SNP, I don't why.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Thanks for your help. It works well now. And I have another question, IndelGenotyperV2 is not used again, so which replace the function? And makeIndelMask.py is in the GATK? where can I get it. if not ,which software can I get it?
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •It sounds like you're following a really old pipeline example. Most of the tools have changed -- for example, now the variant calling is done by either the UnifiedGenotyper or the HaplotypeCaller. Please read the Best Practices document, it will tell you what tools to use and in what order:
http://www.broadinstitute.org/gatk/guide/article?id=1186
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •It's very helpful. Thanks very much!
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Hi, I'm trying to use the RealignerTargetCreator and I got empty Target.intervals files. I read that maybe it means that no regions need to be realigned. But then when I used the IndelRealigner tool with my empty Target.intervals files I got empty BAMs. I thought I should have gotten the same BAM (the one used as INPUT file) as there is nothing to realign? Why is it not the case?
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Sounds like something is wrong with your data files. Have you validated your bam file? And if you try using PrintReads on a small interval, does it generate a small bam properly?
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Actually, if I do not use the option -L my_selected_scaff when I'm using PrintReads I got this error message : Badly formed genome loc: Parameters to GenomeLocParser are incorrect:The contig index 23924 is greater than the stored sequence count (1)
Actually I created a new bam only with the reads mapping on my_selected_scaff with this command: samtools view -h my_bam "gi|15616630|ref|NC_002528.1|" | samtools view -bS -> my_new_bam But I still have in the header a line for each scaff I had before. I runned then RealignerTargetCreator and PrintReads reads on this BAM.
If I use the option -L "gi|15616630|ref|NC_002528.1|" I do not have any error but I got at the end an empty bam.
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Something is definitely wrong with your new bam file. You should re-create it from the original file using PrintReads with the -L option. Keep in mind that the -L option does not use quotes for contig names.
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Then I tried without quotes but I got an error because the tool is looking for a contig only named gi. So I tried
-L gi\|15616630\|ref\|NC_002528.1\|and I got the same as before : no error message but no result in my final BAM. I really don't understand what's happening here. Anyway, as I got an empty interval file for realignment, can I suppose that I do not need to realign and then I can keep working on my initial BAM?- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •You would be better off simplifying the contig names, eg here change it to just NC_002528.1.
Is that a very small contig? If so it's possible that the RTC is not finding anything that needs to be realigned. I would recommend doing one more try with PrintReads on that contig and make sure that the data in the new file is the same as the old one. Then you can move on to the next processing step.
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •I tried to change the scaffold name but when I use PrintReads to select the scaff I want to work on, it doesn't print anything in my BAM file but the header. But with this command : samtool view -h mybam.bam buch_scaff | samtools view -bS -> mybam_buchnera.bam with buch_scaff the name of the scaff for which I want to retrieve the reads mapping on it. The new name doesn't change anything about the realignment. My interval file is still empty. The length of the scaffold is 600 kb. Maybe it's not short enough to think that there is no need of a realignment step...?
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •@nanac, if PrintReads isn't doing the right thing, there's a problem you need to fix before you can move on. You should check if the contigs are in the exact same order in your reference as in your bam file. If you changed the reference ordering after alignment that would explain the problem. If the contigs are not in the same order the GATK may fail to find the right contig and think there are no reads for that contig. We are working on improving the way the GATK handles this case, but in the meantime you need to check your files.
Geraldine Van der Auwera, PhD
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •Thanks! I think I solve my problem. You were right, I had a problem in my reference file! Everything is working fine now!
- Spam
- Abuse
- Troll
0 • Off Topic Disagree Agree Like WTF •