Why can't I download VCF files from the public FTP server anymore?

Lisa0508, Ann Arbor, MI (Member)

Dear GATK team,
I have my raw variant VCF output and now want to run VQSR on those raw variants. Variant recalibration requires some VCF files from the resource bundle, but I can't access the bundle today: when I click the "our public FTP server" link, nothing shows up. This is odd, because just yesterday I downloaded some dbSNP VCF files without even logging in; it didn't ask me for a username or password. Sorry for the trouble. Looking forward to your reply.

Best Answers


  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    We're currently experiencing some technical difficulties and hope to have the FTP back to normal working order very soon. Sorry for the inconvenience.

  • Lisa0508, Ann Arbor, MI (Member)

    Thank you very much for your answer! I tried the password from that thread, but it was not correct.
    May I ask you another question?
    You recommended using both "Mills_and_1000G_gold_standard.indels.b37.sites.vcf" and "1000G_phase1.indels.b37.vcf" for RealignerTargetCreator or IndelRealigner. Is it possible to include both known-sites files in one run? When I wrote the command as follows, trying to include two known-sites files at once, it reported an error. So I used only "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf" for the -known option. Is using just one data set enough in that case? Thank you again.

    java -Xmx4g -jar $GATK_JARS/GenomeAnalysisTK.jar \
    -T IndelRealigner \
    -R /home2/Human_genome_reference/ucsc.hg19.fasta \
    -I /home2/test_07242015/Dedup_aligned_TKDNFE_012_ver2.sorted.bam \
    -targetIntervals /home2/test_07242015/target_interval_Dedup_aligned_TKDNFE_012.list \
    -known /home2/zhanghon/Human_genome_reference/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
    -known /home2/Human_genome_reference/1000G_phase1.indels.b37.vcf \
    -o Realigned_dedup_aligned_TKDNFE_012_ver2.sorted.bam

    Have a nice day,
    :smile: Lisa

  • Lisa0508, Ann Arbor, MI (Member)

    Thank you very much! The command line worked well.
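
The thread does not record which change made the command work. One thing that may be worth checking in the IndelRealigner command quoted above is that one -known file is the hg19 build and the other is the b37 build. These builds usually name contigs differently ("chr1" versus "1"), and mixing them with a ucsc.hg19 reference could plausibly trigger an error. This is an assumption, not a diagnosis confirmed in the thread. A minimal sketch of such a check follows; contig_style and same_style are hypothetical helper names, not part of GATK.

```shell
#!/usr/bin/env bash
# Sketch only: compare the contig naming style of two VCF files.
# contig_style and same_style are hypothetical helpers, not GATK tools.

# contig_style VCF: look at the CHROM column of the first non-header record
# and report "chr-prefixed" (UCSC hg19 style), "bare" (b37 style) or "empty".
contig_style() {
    local first
    first=$(awk -F'\t' '!/^#/ { print $1; exit }' "$1")
    case "$first" in
        chr*) echo "chr-prefixed" ;;
        "")   echo "empty" ;;
        *)    echo "bare" ;;
    esac
}

# same_style VCF1 VCF2: succeeds only when both files use the same style.
same_style() {
    [ "$(contig_style "$1")" = "$(contig_style "$2")" ]
}

# Example call (file names taken from the post above):
#   same_style Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
#              1000G_phase1.indels.b37.vcf \
#       || echo "contig naming differs between the two -known files"
```

The check only inspects the first record of each file, so it is a quick sanity test rather than a full comparison of contig names and lengths against the reference.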
