Why does BaseRecalibrator stop with no error?

The first run stopped 51% of the way through. The same command then got to 32.7% on the next run, and to 60.1% on the one after that.
When I return to the computer I have been kicked back out to my $ prompt and no error is displayed.

What may cause this seemingly random eject?

java -Xmx8g -jar /usr/local/bin/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar -T BaseRecalibrator -R human.fa -I OD37.marked.realigned.fixed.bam -knownSites dbsnp_137.b37.vcf -knownSites Mills_and_1000G_gold_standard.indels.b37.sites.vcf -knownSites ALL.wgs.phase1_release_v2.20101123.snps_indels_sv.sites.vcf -o OD37_recal.grp -plots OD37_recal.grp.pdf

Best Answer


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    What platform are you running on? Are you running on a server? Perhaps your connection is failing.
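One way to narrow down a silent stop like this is to keep the tool's output and exit status in a file, so there is something to inspect after the prompt comes back. The following is a minimal sketch, not a GATK feature: `run_logged` and the log filename are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND, sends its stdout and stderr to LOGFILE, then appends the
# exit status so a silent death still leaves a trace.
# (Hypothetical helper for illustration; not part of GATK.)
run_logged() {
    log=$1
    shift
    status=0
    "$@" > "$log" 2>&1 || status=$?
    echo "exit status: $status" >> "$log"
    return "$status"
}

# Example use with the command from the question (arguments elided here):
#   run_logged bqsr.log java -Xmx8g -jar GenomeAnalysisTK.jar -T BaseRecalibrator ...
```

On Linux, an exit status above 128 usually means the process was ended by a signal (status minus 128 is the signal number), so a status of 137 would suggest the process was killed rather than that it finished on its own.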

  • FGPonce (Member)

    Thanks Geraldine. No, it's at home on my PC, but I'm not running anything else on it. I set the GATK module going and leave it running. I'm following a pipeline from a colleague who describes dbsnp_135, but I downloaded dbsnp137 and used that. I'll try the 135 and see if that has anything to do with it!

  • FGPonce (Member)

    dbsnp135 worked. Perhaps there is something wrong with my dbsnp137 download/file. My copy is exactly the same size as the target file, though.
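A matching file size does not rule out a damaged download, since corrupted bytes leave the size unchanged. If the download source publishes a checksum for the dbSNP 137 file, comparing it against the local copy is a stronger check. This is a minimal sketch assuming a Linux system with `md5sum` available; `verify_md5` is a hypothetical helper, and the expected checksum has to come from wherever the file was downloaded.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_MD5
# Succeeds only if the MD5 checksum of FILE equals EXPECTED_MD5.
# (Hypothetical helper for illustration; assumes md5sum is installed.)
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Example use; replace PUBLISHED_MD5 with the checksum from the download site:
#   verify_md5 dbsnp_137.b37.vcf PUBLISHED_MD5 && echo "checksum matches" || echo "checksum differs"
```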

    No plots, though. Presumably that's an R problem. I have used R to plot things before, so the cause is not obvious.

    The following BQSR step, however, now fails to run! I'm following the example in the GATK presentations (as well), i.e. java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human.fasta -I realigned.bam -BQSR recal.grp -o post_recal.grp -plots post_recal.grp.pdf

    My version: java -Xmx8g -jar /usr/local/bin/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar -T BaseRecalibrator -R human.fa -I OD37.marked.realigned.fixed.bam -BQSR OD37_recal.grp -o OD37_post_recal.grp -plots OD37_post_recal.grp.pdf

    But the error seems to ask for the -knownSites argument!

    ERROR MESSAGE: Invalid command line: This calculation is critically dependent on being able to skip over known variant sites. Please provide a VCF file containing known sites of genetic variation.
  • malwana (Member)

    I tried creating a dummy VCF file that contained only the default header and passed it in to satisfy the -knownSites argument. It seems to work, but it may not be good practice. The final SNP/indel calls still came out all right, however.

  • FGPonce (Member)

    Thanks malwana, I'll try that as a workaround.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    This is not a good idea -- as stated in the error message, the known sites play a critical part in the recalibration process. If you do not provide known sites, many real sites may be considered errors, and that will skew the recalibration. You will still get calls later on but the quality metrics associated with those calls may not be reliable. Please read the Best Practices and Technical Documentation for more details.
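In line with the error message and the advice above, the second BaseRecalibrator pass can be given real known-sites files, for example the same ones the first pass completed with, rather than a dummy VCF. The sketch below only assembles and prints such a command so it can be reviewed before running. The flags and the other filenames are taken from the commands earlier in this thread; the dbSNP filename is an assumption, since the thread does not give the exact name of the dbSNP 135 file.

```shell
#!/bin/sh
# Sketch only: builds the second-pass command and prints it; nothing is run.
GATK_JAR=/usr/local/bin/GenomeAnalysisTK-2.3-9-ge5ebf34/GenomeAnalysisTK.jar
DBSNP=dbsnp_135.b37.vcf   # assumed filename; use the dbSNP file your first pass completed with

# Turn each known-sites file into its own -knownSites argument.
known_args=""
for vcf in "$DBSNP" \
           Mills_and_1000G_gold_standard.indels.b37.sites.vcf \
           ALL.wgs.phase1_release_v2.20101123.snps_indels_sv.sites.vcf; do
    known_args="$known_args -knownSites $vcf"
done

# Same second-pass command as posted above, with the known sites added.
second_pass="java -Xmx8g -jar $GATK_JAR -T BaseRecalibrator -R human.fa -I OD37.marked.realigned.fixed.bam$known_args -BQSR OD37_recal.grp -o OD37_post_recal.grp -plots OD37_post_recal.grp.pdf"

echo "$second_pass"
```

Once the printed command looks right, it can be pasted at the prompt to run it.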

  • FGPonce (Member)

    Thanks Geraldine. But the GATK Best Practices presentation gives an example command for this step that includes -BQSR, and the -knownSites option is not used in it!

    The known sites are used in the step prior to this.

  • FGPonce (Member)

    Thanks G I'll do that. Rob
