GATK keys

Hi,
Following your guide, I need to rebuild the *.vcf using GATK. The previous *.vcf was built by my tutor, and I do not have a GATK key.
Therefore, I submitted this request:

Request summary below

Name: shengyan su

Email: [email protected]

Reason: I want to use GATK to produce a VCF file from BAM
files. I do research on genome-scale evolution.
The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled
2014/10/24 01:07:22
Copyright (c) 2010 The Broad Institute

Date: Wed, 08 Jul 2015 09:01:08 -0400

So far, I haven't received the key. Can you help me?
Thanks!

Answers

  • courseouhai@gmail.com, danmark (Member)

    @Geraldine_VdAuwera
    Hi,
    When I run the command
    "java -Xmx8G -jar /home/course/GenomeAnalysisTK.jar -T UnifiedGenotyper -K /home/course/susy_birc.au.dk.key -et NO_ET -R Ursus_maritimus.scaf.fa -I aln.sorted.dup.bam -o aln.vcf",
    it only produces an error that lists all the scaffolds, with no further information after the list.
    My BAM file looks like this:
    @SQ SN:scaffold72214 LN:3
    @SQ SN:scaffold48 LN:15825002
    @SQ SN:scaffold15 LN:24260673
    @SQ SN:scaffold130 LN:4947231
    @SQ SN:scaffold38 LN:18037018
    @SQ SN:scaffold37 LN:18322464
    @SQ SN:scaffold464 LN:40579
    @SQ SN:scaffold3053 LN:1261
    @SQ SN:scaffold295 LN:406295
    @SQ SN:scaffold173 LN:2975834
    @SQ SN:scaffold463 LN:40618
    @SQ SN:scaffold90 LN:8163341
    @SQ SN:scaffold94 LN:7798381
    @SQ SN:scaffold111 LN:6016729
    @SQ SN:scaffold77 LN:10252110
    @SQ SN:scaffold31 LN:10693394
    @RG ID:1 PL:illumina PU:barcode LB:Library SM:km02
    @PG ID:bwa PN:bwa VN:0.7.12-r1039 CL:/home/course/mauve_2.3.1/linux-x64/bwa-0.7.12/bwa mem -t 16 -p -M Ursus_maritimus.scaf.fa SRR1135309_1.fastq SRR1135309_2.fastq
    SRR1135309.60446863 16 scaffold79 305 60 101M * 0 0 AAGAATGGAATGACCATCGTGGGAAATGGAACAAGTATCCTGGACCTATAGGAAGCTTCTCTGGTTGTGATGAATGATGGTTTGAAATCTTTGGTTTTAAG [email protected]IJIJJIIGGIIJJIJJJJJJJJIJHHHHHFFFFFCCC MD:Z:1G9A1T4T17G26A15A7C12A0 RG:Z:1 NM:i:9 AS:i:63 XS:i:0
    @[email protected]@[email protected]>;BCDE>@[email protected]@@CAC MD:Z:0G9A1T4T17G26A15A7C12A1 RG:Z:1 NM:i:9 AS:i:63 XS:i:0
    SRR1135309.145255178 0 scaffold79 307 60 6S95M * 0 0 GCATAAGAATGGAATGACCATCGTGGGAAATGGAACAAGTATCCTGGACCTATAGGAAGCTTCTCTGGTTGTGATGAATGATGGTTTGAAATCTTTGGTTT CCCFFFFFHHHHHJJJJJJIJJJJIIJHIIJJJIJJJJJIIIIJJJJEHIIIIJJJIIIIJJJJJIHIEEHBEEFFFFFEDEEEEEDDDCDDDDDDCC?A< MD:Z:9A1T4T17G26A15A7C9 RG:Z:1 NM:i:7 AS:i:60 XS:i:0
    SRR1135309.32271893 0 scaffold79 307 52 34S67M * 0 0 AGGAGCTCCTAAAGTTTTAAGGAATGAAGCATAAGAATGGAATGACCATCGTGGGAAATGGAACAAGTATCCTGGACCTATAGGAAGCTTCTCTGGTTGTG CCCFFFFFHHHHGJHIJJJJJIJJJJJJJIJII[email protected]CBDD MD:Z:9A1T4T17G26A5 RG:Z:1 NM:i:5 AS:i:42 XS:i:0

    Can you help me?
    Thank you!

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    Can you please post the exact log output you get when you run that command?

    Thanks,
    Sheila

  • courseouhai@gmail.com, danmark (Member)

    @Sheila
    Hi,
    I prepared the *.bam file in several steps, like this:
    1st:
    /home/course/sratoolkit.2.5.0-centos_linux64/bin/fastq-dump SRR1135309.sra
    /home/course/sratoolkit.2.5.0-centos_linux64/bin/fastq-dump --split-3 SRR1135309.sra
    2nd:
    /home/course/mauve_2.3.1/linux-x64/bwa-0.7.12/bwa index -a bwtsw Ursus_maritimus.scaf.fa
    /home/course/mauve_2.3.1/linux-x64/bwa-0.7.12/bwa mem -t 16 -p -M Ursus_maritimus.scaf.fa SRR1135309_1.fastq SRR1135309_2.fastq > aln.sam

    3rd:
    java -jar /home/course/picard-tools-1.74/ViewSam.jar I=aln.sam |head -120
    java -Xmx4G -jar /home/course/picard-tools-1.74/CleanSam.jar I=aln.sam O=aln.clear.sam
    java -Xmx4G -jar /home/course/picard-tools-1.74/SamFormatConverter.jar I=aln.clear.sam O=aln.bam
    java -Xmx4G -jar /home/course/picard-tools-1.74/AddOrReplaceReadGroups.jar INPUT=aln.bam OUTPUT=aln.sorted.bam RGID=1 RGLB=Library RGPL=illumina RGPU=barcode RGSM=km02 SORT_ORDER=coordinate CREATE_INDEX=TRUE VALIDATION_STRINGENCY=LENIENT
    java -Xmx4G -jar /home/course/picard-tools-1.74/MarkDuplicates.jar INPUT=aln.sorted.bam OUTPUT=aln.sorted.dup.bam METRICS_FILE=aln.sorted.dedup.metrics
    java -Xmx8G -jar /home/course/picard-tools-1.74/BuildBamIndex.jar INPUT=aln.sorted.dup.bam
    java -Xmx8G -jar /home/course/picard-tools-1.74/ValidateSamFile.jar INPUT=aln.sorted.dup.bam OUTPUT=aln.sorted.dup.Output.txt REFERENCE_SEQUENCE=Ursus_maritimus.scaf.fa MAX_OUTPUT=2000 VALIDATION_STRINGENCY=LENIENT

    4th:
    java -Xmx4G -jar /home/course/GenomeAnalysisTK.jar -T UnifiedGenotyper -R Ursus_maritimus.scaf.fa -I aln.sorted.dup.bam
    java -Xmx8G -jar /home/course/GenomeAnalysisTK.jar -T UnifiedGenotyper -K /home/course/susy_birc.au.dk.key -et NO_ET -log aln.log -R Ursus_maritimus.scaf.fa -I aln.sorted.dup.bam -o aln.vcf

    Can you help me? Thanks!

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    Why are you setting VALIDATION_STRINGENCY=LENIENT? What happens when you set VALIDATION_STRINGENCY=STRICT in ValidateSamFile?

    -Sheila

  • courseouhai@gmail.com, danmark (Member)

    @Sheila
    Hi!
    I added "VALIDATION_STRINGENCY=LENIENT" because, when one read of a pair is aligned to the end of a chromosome, Picard cannot identify the other read and produces the corresponding error.
    My *.vcf still can't be produced. Can you help me?

    Thanks!

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    Those reads should be filtered out automatically by GATK. However, I cannot help you with your Unified Genotyper issue because I do not have the entire log output. The file you posted ends abruptly without throwing an error or showing completion. Can you please post the entire log output you get when you run Unified Genotyper? Do you get an error message? Or, does the tool say it ran to completion and your vcf simply does not contain any information?

    Thanks,
    Sheila
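
For readers unsure how to capture the entire log output: the tool prints its messages to the terminal, so one option is to send both standard output and standard error to a single file and attach that file. The following is a minimal Python sketch of that idea, not an official GATK procedure. The helper name `run_and_capture` and the log file name are hypothetical; the argument list simply repeats the UnifiedGenotyper call quoted in this thread.

```python
import subprocess

# The UnifiedGenotyper call quoted in this thread, split into arguments.
GATK_CMD = [
    "java", "-Xmx8G", "-jar", "/home/course/GenomeAnalysisTK.jar",
    "-T", "UnifiedGenotyper",
    "-K", "/home/course/susy_birc.au.dk.key", "-et", "NO_ET",
    "-R", "Ursus_maritimus.scaf.fa",
    "-I", "aln.sorted.dup.bam",
    "-o", "aln.vcf",
]


def run_and_capture(cmd, log_path):
    """Run cmd and write everything it prints (stdout and stderr) to log_path.

    Returns the exit code of the command (hypothetical helper).
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


# Example call (commented out because it needs Java and the input files):
# exit_code = run_and_capture(GATK_CMD, "unifiedgenotyper_full.log")
```

A non-zero exit code together with the last lines of the captured file is usually enough to tell whether the run ended with an error or ran to completion.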

  • courseouhai@gmail.com, danmark (Member)

    @Sheila
    I don't know how to get the entire log output. The command I used to run UnifiedGenotyper is "java -Xmx8G -jar /home/course/GenomeAnalysisTK.jar -T UnifiedGenotyper -K /home/course/susy_birc.au.dk.key -et NO_ET -log aln.log -R Ursus_maritimus.scaf.fa -I aln.sorted.dup.bam -o aln.vcf". It just produces the log file I have attached, and I haven't got an error message. However, I have found some problems in my BAM file (see the enclosed file) and have no idea how to solve them.

    In addition, regarding your question "your vcf simply does not contain any information?", I would like to know what information it should include.
    My plan is to merge the target.vcf that my tutor supplied with my.vcf; they use the same reference.
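
On the question of what information a VCF should contain: in the VCF format, lines starting with `##` are meta-information, a single line starting with `#CHROM` names the columns, and every other non-empty line is one variant record. A file that has a header but zero records is what a VCF that "does not contain any information" would look like. The sketch below counts these parts; `summarize_vcf` is a hypothetical helper, and it assumes an uncompressed, plain-text VCF such as the `aln.vcf` named in the command above.

```python
def summarize_vcf(path):
    """Count meta-information lines and variant records in a plain-text VCF."""
    meta_lines = 0
    has_column_header = False
    records = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("##"):
                meta_lines += 1            # e.g. ##fileformat, ##contig
            elif line.startswith("#CHROM"):
                has_column_header = True   # column names line
            elif line.strip():
                records += 1               # one variant call per line
    return {
        "meta_lines": meta_lines,
        "has_column_header": has_column_header,
        "records": records,
    }


# Example call (commented out because it needs the output file):
# print(summarize_vcf("aln.vcf"))
```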

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    Unfortunately, I cannot help you if you cannot show that the tool has either run to completion or ended with an error message. The log output you posted ends too abruptly for me to understand what is going on.

    -Sheila

  • courseouhai@gmail.com, danmark (Member)

    @courseouhai@gmail.com
    I have enclosed the entire log file. Please help me, I am in a hurry! Thanks a lot!

  • courseouhai@gmail.com, danmark (Member)

    @Sheila
    I have enclosed the entire log file. As you know, I used the following command to do the alignment:
    /home/course/mauve_2.3.1/linux-x64/bwa-0.7.12/bwa mem -t 16 -p -M Ursus_maritimus.scaf.fa SRR1135309_1.fastq SRR1135309_2.fastq > aln.sam
    I don't know why the same reference genome has different contigs.
    Please help me, I am in a hurry! Thanks a lot!

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    It looks like you are perhaps using a different version of the reference. The error message clearly states there are many more contigs in the reference than in the reads. Can you please post the reference .dict file and the bam header?

    Have a look at this thread for more information: http://gatkforums.broadinstitute.org/discussion/2396/input-files-known-and-reference-have-incompatible-contigs

    Thanks,
    Sheila
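
One way to check the contig mismatch described above is to compare the `@SQ` lines of the reference .dict file with those of the BAM header. The Python sketch below does this on two text files. It is only an illustration: the function names are hypothetical, the file names in the example call are assumptions, and the BAM header is assumed to have been dumped to text first (for example with `samtools view -H aln.sorted.dup.bam > aln.header.sam`). Fields are assumed to be tab-separated, as the SAM format specifies.

```python
def read_sq_names(path):
    """Return the SN values of all @SQ lines in a SAM-style header file, in order."""
    names = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith("@SQ"):
                continue
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare_contigs(dict_path, bam_header_path):
    """Report contigs present in only one of the two headers, and order agreement."""
    ref = read_sq_names(dict_path)
    bam = read_sq_names(bam_header_path)
    ref_set, bam_set = set(ref), set(bam)
    return {
        "only_in_reference": [name for name in ref if name not in bam_set],
        "only_in_bam": [name for name in bam if name not in ref_set],
        "same_order": ref == bam,
    }


# Example call (file names are assumptions, not taken from the thread):
# report = compare_contigs("Ursus_maritimus.scaf.dict", "aln.header.sam")
# print(len(report["only_in_reference"]), "contigs are missing from the BAM header")
```

If `only_in_reference` is long while `only_in_bam` is empty, that matches the situation described in the error message: the reference has many more contigs than the reads were aligned to.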

  • courseouhai@gmail.com, danmark (Member)

    @Sheila
    I tried to realign the FASTQ files against the reference again, but it still fails and I don't know why. The BAM header and *.dict are enclosed. Please help me! Thanks!

  • Sheila, Broad Institute (Member, Broadie admin)

    @courseouhai@gmail.com
    Hi,

    You attached the error message twice instead of the reference .dict file. According to the error message, the reference you used has many more contigs than the reference your bam file was aligned to. You can maybe try using -XL for the contigs not in your bam file. I can't promise it will work though. https://www.broadinstitute.org/gatk/guide/tooldocs/org_broadinstitute_gatk_engine_CommandLineGATK.php#--excludeIntervals

    -Sheila
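
The -XL suggestion above needs a list of the contigs that are in the reference but not in the BAM file. A rough Python sketch for writing such a list follows. The function name and the output file name are hypothetical, and it is an assumption here that a plain text file with one contig name per line and a `.list` extension is accepted by -XL; please check the documentation linked above before relying on it. As noted in the post, there is no guarantee that excluding these contigs resolves the error.

```python
def write_exclude_list(reference_contigs, bam_contigs, out_path):
    """Write contigs that are in the reference but not in the BAM, one per line.

    Returns the list of contig names that were written (hypothetical helper).
    """
    present = set(bam_contigs)
    missing = [name for name in reference_contigs if name not in present]
    with open(out_path, "w") as handle:
        for name in missing:
            handle.write(name + "\n")
    return missing


# Example with a few scaffold names from the BAM header posted in this thread;
# in practice both lists would come from the @SQ lines of the .dict file
# and of the BAM header.
# reference = ["scaffold48", "scaffold15", "scaffold130", "scaffold38"]
# in_bam = ["scaffold48", "scaffold15"]
# write_exclude_list(reference, in_bam, "exclude_contigs.list")
# The file could then be passed as: -XL exclude_contigs.list
```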

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    @courseouhai@gmail.com At this point I think you should seek guidance from a more experienced colleague or professor, because your support needs are beyond what we can provide to individual users on this forum. Good luck!