
Error Stack trace after running SelectVariants

mahyarhey (Boston; Posts: 37; Member)

I just wanted to select variants from a VCF with 42 samples. After 3 hours I got the following error. I had the same problem when I ran VQSR. How can I fix this? Please advise. Thanks.

INFO 20:28:17,247 HelpFormatter - --------------------------------------------------------------------------------
INFO 20:28:17,250 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11, Compiled 2013/10/10 17:27:51
INFO 20:28:17,250 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 20:28:17,251 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 20:28:17,255 HelpFormatter - Program Args: -T SelectVariants -rf BadCigar -R /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/ucsc.hg19.fasta -V /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf -L chr1 -L chr2 -L chr3 -selectType SNP -o /hms/scratch1/mahyar/Danny/data/Filter/extract_SNP_only3chr.vcf
INFO 20:28:17,256 HelpFormatter - Date/Time: 2014/01/20 20:28:17
INFO 20:28:17,256 HelpFormatter - --------------------------------------------------------------------------------
INFO 20:28:17,256 HelpFormatter - --------------------------------------------------------------------------------
INFO 20:28:17,305 ArgumentTypeDescriptor - Dynamically determined type of /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf to be VCF
INFO 20:28:18,053 GenomeAnalysisEngine - Strictness is SILENT
INFO 20:28:18,167 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 20:28:18,188 RMDTrackBuilder - Creating Tribble index in memory for file /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf
INFO 23:15:08,278 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NegativeArraySizeException
    at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:97)
    at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:116)
    at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:84)
    at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:73)
    at net.sf.samtools.util.AbstractIterator.next(AbstractIterator.java:57)
    at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:46)
    at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:24)
    at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:73)
    at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:35)
    at org.broad.tribble.AbstractFeatureCodec.decodeLoc(AbstractFeatureCodec.java:40)
    at org.broad.tribble.index.IndexFactory$FeatureIterator.readNextFeature(IndexFactory.java:428)
    at org.broad.tribble.index.IndexFactory$FeatureIterator.next(IndexFactory.java:390)
    at org.broad.tribble.index.IndexFactory.createIndex(IndexFactory.java:288)
    at org.broad.tribble.index.IndexFactory.createDynamicIndex(IndexFactory.java:278)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createIndexInMemory(RMDTrackBuilder.java:388)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.loadIndex(RMDTrackBuilder.java:274)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.getFeatureSource(RMDTrackBuilder.java:211)
    at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createInstanceOfTrack(RMDTrackBuilder.java:140)
    at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedQueryDataPool.<init>(ReferenceOrderedDataSource.java:208)
    at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedDataSource.<init>(ReferenceOrderedDataSource.java:88)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getReferenceOrderedDataSources(GenomeAnalysisEngine.java:964)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeDataSources(GenomeAnalysisEngine.java:758)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:284)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-4-g6f46d11):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------


Answers

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    Hi there,

    Can you please check if this also happens with the latest version (2.8)?

    Geraldine Van der Auwera, PhD

  • mahyarhey (Boston; Posts: 37; Member)

    Hi Geraldine, following your comment, I used the latest version (2.8-1) of GenomeAnalysisTK for VQSR. The problem still exists: after 3 hours the same error came up as before with SelectVariants. Do you know how I can fix this problem? Thanks.

    INFO 11:40:15,811 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:40:15,816 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15
    INFO 11:40:15,816 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 11:40:15,816 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 11:40:15,821 HelpFormatter - Program Args: -T VariantRecalibrator -R /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/ucsc.hg19.fasta --input /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf --resource:dbsnp,VCF,known=false,training=true,truth=true,prior=6.0 /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/dbsnp_137.hg19.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ --mode SNP -rf BadCigar --recal_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS.recal --tranches_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS.tranches --rscript_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS_plots.R
    INFO 11:40:15,822 HelpFormatter - Date/Time: 2014/01/21 11:40:15
    INFO 11:40:15,822 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:40:15,822 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:40:15,879 ArgumentTypeDescriptor - Dynamically determined type of /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf to be VCF
    INFO 11:40:17,324 GenomeAnalysisEngine - Strictness is SILENT
    INFO 11:40:17,516 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 11:40:17,540 RMDTrackBuilder - Creating Tribble index in memory for file /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf
    INFO 14:38:29,998 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.NegativeArraySizeException
        at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:97)
        at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:116)
        at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:84)
        at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:73)
        at net.sf.samtools.util.AbstractIterator.next(AbstractIterator.java:57)
        at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:46)
        at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:24)
        at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:73)
        at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:35)
        at org.broad.tribble.AbstractFeatureCodec.decodeLoc(AbstractFeatureCodec.java:40)
        at org.broad.tribble.index.IndexFactory$FeatureIterator.readNextFeature(IndexFactory.java:428)
        at org.broad.tribble.index.IndexFactory$FeatureIterator.next(IndexFactory.java:390)
        at org.broad.tribble.index.IndexFactory.createIndex(IndexFactory.java:288)
        at org.broad.tribble.index.IndexFactory.createDynamicIndex(IndexFactory.java:278)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createIndexInMemory(RMDTrackBuilder.java:388)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.loadIndex(RMDTrackBuilder.java:274)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.getFeatureSource(RMDTrackBuilder.java:211)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createInstanceOfTrack(RMDTrackBuilder.java:140)
        at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedQueryDataPool.<init>(ReferenceOrderedDataSource.java:208)
        at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedDataSource.<init>(ReferenceOrderedDataSource.java:88)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getReferenceOrderedDataSources(GenomeAnalysisEngine.java:964)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeDataSources(GenomeAnalysisEngine.java:758)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:284)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.8-1-g932cd3a):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Code exception (see stack trace for error itself)
    ERROR ------------------------------------------------------------------------------------------
  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    This could be an issue with your VCF index. Try removing the VCF index file and running again. GATK should generate a new VCF index for you and then work properly. Let me know if it fails again.

    Geraldine Van der Auwera, PhD
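
The index removal suggested above can be sketched in shell as follows. This is only an illustration: `remove_vcf_index` is a hypothetical helper name, not a GATK command, and it assumes the index sits next to the VCF with `.idx` appended to the file name, which is where GATK 2.x normally writes it.

```shell
# Hypothetical helper, not part of GATK: delete the on-disk index of a VCF so
# that GATK has to rebuild it on the next run.
# Assumption: the index is stored as "<vcf path>.idx" next to the VCF.
remove_vcf_index() {
  vcf="$1"
  idx="${vcf}.idx"
  if [ -f "$idx" ]; then
    rm -- "$idx" && echo "removed $idx"
  else
    echo "no index found at $idx"
  fi
}
```

It would be called as `remove_vcf_index /path/to/input.vcf` before rerunning the GATK command. If the run log already shows `Creating Tribble index in memory`, that suggests GATK is rebuilding the index during the run rather than reading one from disk, in which case a stale index file is less likely to be the cause.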

  • mahyarhey (Boston; Posts: 37; Member)

    I removed the VCF index file from the directory and ran it again, but the problem still exists, as follows:

    INFO 15:47:29,664 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:47:29,674 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15
    INFO 15:47:29,674 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 15:47:29,675 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 15:47:29,679 HelpFormatter - Program Args: -T VariantRecalibrator -R /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/ucsc.hg19.fasta --input /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf --resource:dbsnp,VCF,known=false,training=true,truth=true,prior=6.0 /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/dbsnp_137.hg19.vcf -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ --mode SNP -rf BadCigar --recal_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS.recal --tranches_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS.tranches --rscript_file /hms/scratch1/mahyar/Danny/data/VQSR/All42_post_VQRS_plots.R
    INFO 15:47:29,680 HelpFormatter - Date/Time: 2014/01/21 15:47:29
    INFO 15:47:29,680 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:47:29,680 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:47:29,718 ArgumentTypeDescriptor - Dynamically determined type of /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf to be VCF
    INFO 15:47:30,979 GenomeAnalysisEngine - Strictness is SILENT
    INFO 15:47:31,094 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 15:47:31,119 RMDTrackBuilder - Creating Tribble index in memory for file /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.NegativeArraySizeException
        at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:97)
        at org.broad.tribble.readers.AsciiLineReader.readLine(AsciiLineReader.java:116)
        at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:84)
        at org.broad.tribble.readers.AsciiLineReaderIterator$TupleIterator.advance(AsciiLineReaderIterator.java:73)
        at net.sf.samtools.util.AbstractIterator.next(AbstractIterator.java:57)
        at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:46)
        at org.broad.tribble.readers.AsciiLineReaderIterator.next(AsciiLineReaderIterator.java:24)
        at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:73)
        at org.broad.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:35)
        at org.broad.tribble.AbstractFeatureCodec.decodeLoc(AbstractFeatureCodec.java:40)
        at org.broad.tribble.index.IndexFactory$FeatureIterator.readNextFeature(IndexFactory.java:428)
        at org.broad.tribble.index.IndexFactory$FeatureIterator.next(IndexFactory.java:390)
        at org.broad.tribble.index.IndexFactory.createIndex(IndexFactory.java:288)
        at org.broad.tribble.index.IndexFactory.createDynamicIndex(IndexFactory.java:278)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createIndexInMemory(RMDTrackBuilder.java:388)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.loadIndex(RMDTrackBuilder.java:274)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.getFeatureSource(RMDTrackBuilder.java:211)
        at org.broadinstitute.sting.gatk.refdata.tracks.RMDTrackBuilder.createInstanceOfTrack(RMDTrackBuilder.java:140)
        at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedQueryDataPool.<init>(ReferenceOrderedDataSource.java:208)
        at org.broadinstitute.sting.gatk.datasources.rmd.ReferenceOrderedDataSource.<init>(ReferenceOrderedDataSource.java:88)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.getReferenceOrderedDataSources(GenomeAnalysisEngine.java:964)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeDataSources(GenomeAnalysisEngine.java:758)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:284)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.8-1-g932cd3a):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Code exception (see stack trace for error itself)
    ERROR ------------------------------------------------------------------------------------------
  • mahyarhey (Boston; Posts: 37; Member)

    Hi Geraldine, did you get my previous message about the error (stack trace) for VQSR? I used version 2.8-1 and removed the VCF index file as you suggested, but the error occurred after 3 hours of processing my VCF file. I don't know what the problem is. Could you please advise? Thanks!

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    Yes, I got it (last night iirc). Besides the rest of my work, meetings etc., I get a lot of fresh new help requests every morning so I clear out all the really easy or high priority ones (based on development priorities) first, which is why I'm only getting to yours now. I know it's frustrating when you're stuck on something but please don't send additional messages like this; it just clutters up my mailbox and it won't get me here any faster.

    So, to your problem. Based on the stack trace, it looks like something is missing or badly formatted in a field somewhere in your VCF. At this point I'd recommend validating your VCF, either with GATK's VCF validation tool, ValidateVariants, or with vcftools. This should tell you what the problem is, if it's a VCF issue.

    Geraldine Van der Auwera, PhD
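
Before running a full validator on a very large file, a rough structural pre-check can narrow things down. The sketch below is a hypothetical helper (`check_vcf_shape` is not a GATK or vcftools command) and assumes an uncompressed, tab-delimited VCF. It reports data lines whose column count differs from the `#CHROM` header, plus the longest line in the file. A line far longer than expected, for example records run together by a missing newline, is one plausible trigger for an exception inside a line reader; that is a guess, not something confirmed in this thread.

```shell
# Hypothetical pre-check, not a replacement for ValidateVariants or vcftools.
# Prints every data line whose number of tab-separated columns differs from
# the #CHROM header line, then the longest line seen.
# Assumption: the VCF is uncompressed, tab-delimited text.
check_vcf_shape() {
  awk -F'\t' '
    length($0) > maxlen { maxlen = length($0); maxline = NR }
    /^##/               { next }                  # meta-information lines
    /^#CHROM/           { expected = NF; next }   # column header line
    expected && NF != expected {
      printf "line %d: %d columns, expected %d\n", NR, NF, expected
    }
    END { printf "longest line: %d (%d characters)\n", maxline, maxlen }
  ' "$1"
}
```

It would be called as `check_vcf_shape input.vcf`. The check only looks at the shape of each line; it does not verify field contents, so a clean result here does not mean the file is valid.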

  • mahyarhey (Boston; Posts: 37; Member)

    Thanks. Can I remove the badly formatted VCF records with ValidateVariants?

  • mahyarhey (Boston; Posts: 37; Member)

    I found the following command for ValidateVariants on your site. What about the output? Does it automatically create an output file in the same directory, or do I need to add an output argument to the script?

    java -Xmx2g -jar GenomeAnalysisTK.jar \
      -R ref.fasta \
      -T ValidateVariants \
      --variant input.vcf \
      --dbsnp dbsnp.vcf

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    The tool will output results to the console, not to file. I would recommend you also add --warnOnErrors (see argument details).

    Geraldine Van der Auwera, PhD

  • mahyarhey (Boston; Posts: 37; Member)

    My VCF file is about 718 GB. Do you know how long ValidateVariants will take?

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    No, it depends a lot on your computing platform. The tool will give you an estimate once it starts.

    Geraldine Van der Auwera, PhD

  • mahyarhey (Boston; Posts: 37; Member)

    I used the following script. How can I see the output in the console? My job was submitted to the "short" queue!

    bsub -q short -W 12:0 -o /hms/scratch1/mahyar/error.log java -jar /GenomeAnalysisTK-2.8-1-g932cd3a/GenomeAnalysisTK.jar \
      -T ValidateVariants \
      -R /hms/scratch1/mahyar/ucsc.hg19.fasta \
      --variant /hms/scratch1/mahyar/Overal-RGSM-42prebamfiles-allsites.vcf \
      --dbsnp /hms/scratch1/mahyar/dbsnp_137.hg19.vcf \
      --warnOnErrors

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    Presumably you'll find it in the file you specified as output for the bsub job, which is apparently /hms/scratch1/mahyar/error.log.

    If you are uncertain as to how your cluster job service works, you should ask your IT support people. We don't provide support for this kind of thing since it's not GATK specific.

    Geraldine Van der Auwera, PhD

  • mahyarhey (Boston; Posts: 37; Member)

    OK, but after running for 1 hour, I can only see the following info in my log. No time estimate! So, do I need to wait longer?

    INFO 12:35:02,589 HelpFormatter - --------------------------------------------------------------------------------
    INFO 12:35:02,593 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.8-1-g932cd3a, Compiled 2013/12/06 16:47:15
    INFO 12:35:02,593 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 12:35:02,593 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 12:35:02,599 HelpFormatter - Program Args: -T ValidateVariants -R /hms/scratch1/mahyar/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/ucsc.hg19.fasta --variant /hms/scratch1/mahyar/Overal-RGSM-42prebamfiles-allsites.vcf --dbsnp /hms/scratch1/dbsnp_137.hg19.vcf --warnOnErrors
    INFO 12:35:02,599 HelpFormatter - Date/Time: 2014/01/22 12:35:02
    INFO 12:35:02,599 HelpFormatter - --------------------------------------------------------------------------------
    INFO 12:35:02,600 HelpFormatter - --------------------------------------------------------------------------------
    INFO 12:35:02,638 ArgumentTypeDescriptor - Dynamically determined type of /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf to be VCF
    INFO 12:35:02,672 ArgumentTypeDescriptor - Dynamically determined type of /groups/body/JDM_RNA_Seq-2012/GATK/bundle-2.3/ucsc.hg19/dbsnp_137.hg19.vcf to be VCF
    INFO 12:35:05,616 GenomeAnalysisEngine - Strictness is SILENT
    INFO 12:35:05,897 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 12:35:05,927 RMDTrackBuilder - Creating Tribble index in memory for file /hms/scratch1/mahyar/Danny/data/Overal-RGSM-42prebamfiles-allsites.vcf

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    Hmm, that's not normal. Is it still stuck at that point? How large is the file, by the way?

    Geraldine Van der Auwera, PhD

  • mahyarhey (Boston; Posts: 37; Member)

    Yes, it is stuck! The VCF file size is 718GB.

  • Geraldine_VdAuwera (Posts: 6,227; Administrator, GATK Developer)

    Oh, that's really quite large; that explains why it's taking so long. The index for that will be ~1 GB, so it might take a fairly long time to create. Also, you'll need to allocate more memory to the JVM. You can do that by adding e.g. -Xmx4g to your command (between java and -jar).

    Geraldine Van der Auwera, PhD
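
As a sketch of where the memory option goes, the helper below only prints a command rather than running it. `print_validate_cmd` is a hypothetical name; the jar name and the `ref.fasta` / `input.vcf` paths are the placeholders from the ValidateVariants example earlier in this thread, and `4g` is simply the example value mentioned above, not a tested recommendation for a file of this size.

```shell
# Sketch only: prints the command instead of running it. The jar name and the
# ref.fasta / input.vcf paths are placeholders; 4g is just an example value.
print_validate_cmd() {
  heap="${1:-4g}"
  # -Xmx is a JVM option, so it goes between "java" and "-jar"; everything
  # after the jar path is passed to GATK.
  echo java "-Xmx${heap}" -jar GenomeAnalysisTK.jar \
    -T ValidateVariants \
    -R ref.fasta \
    --variant input.vcf \
    --warnOnErrors
}

print_validate_cmd 4g
```

If the job runs under a scheduler, as with the bsub call shown earlier, the scheduler may impose its own memory limit as well; that part is cluster-specific.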
