DepthOfCoverage multiple samples

adri_somavilla, Edinburgh, Member
edited August 2016 in Ask the GATK team

Dear all,
I have about 300 BAM files (whole-genome sequence) and I'm trying to get the DepthOfCoverage per chromosome.

I'm using a script to submit a job for each chromosome, like this:

java -jar /GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -T DepthOfCoverage -R 'ref_gen.fa' -I samplesBam.list -L $Chr -o $Chr.DepthCov -nt 30 -omitIntervals -omitSampleSummary -omitLocusTable
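
A submission script of this kind might look like the following minimal Python sketch. The original script is not shown, so the language, the function name `depth_of_coverage_command`, and the example contig names are assumptions; the paths and flags are copied from the command above. It only prints the commands (a dry run) rather than launching them.

```python
import shlex

# Paths copied from the command in the post; adjust to your own setup.
GATK_JAR = "/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar"
REFERENCE = "ref_gen.fa"
BAM_LIST = "samplesBam.list"


def depth_of_coverage_command(chrom, threads=30, omit_intervals=True):
    """Build the DepthOfCoverage argument list for one chromosome.

    Hypothetical helper: it only assembles the flags shown in the post.
    """
    cmd = [
        "java", "-jar", GATK_JAR,
        "-T", "DepthOfCoverage",
        "-R", REFERENCE,
        "-I", BAM_LIST,
        "-L", chrom,
        "-o", f"{chrom}.DepthCov",
    ]
    if threads is not None:
        cmd += ["-nt", str(threads)]
    if omit_intervals:
        cmd.append("-omitIntervals")
    cmd += ["-omitSampleSummary", "-omitLocusTable"]
    return cmd


if __name__ == "__main__":
    # Example contig names only; use the names from your reference.
    for chrom in ["chr1", "chr2"]:
        print(shlex.join(depth_of_coverage_command(chrom)))
```

Passing `threads=None` and `omit_intervals=False` drops those two flags, which makes it easy to compare variants of the same job.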

The analysis started without problems, but after a while I got an error about "many files opened".
Sorry for not having the output; I deleted it before realising I would need it here.

Is there a better way to do this? I've tried running sample by sample, but then I'll have 300x34 per-chromosome files to average manually.
Thank you
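
One plausible reading of the error, offered as an assumption rather than a diagnosis: if each data thread (`-nt`) opens its own handle on every input BAM and its index, the number of open files grows with threads times files and can exceed the per-process limit. The sketch below does that arithmetic for 300 BAMs and compares it with the limit reported by Python's `resource` module (Unix only). The factor of two handles per BAM and the function names are hypothetical.

```python
import resource  # Unix-only standard library module


def estimated_handles(n_bams, n_threads, handles_per_bam=2):
    """Rough upper bound on open files.

    Assumes every thread opens every BAM plus its index
    (handles_per_bam=2); this is a guess, not a measured value.
    """
    return n_bams * n_threads * handles_per_bam


def fits_in_limit(n_bams, n_threads, soft_limit):
    """True if the rough estimate stays below the given soft limit."""
    return estimated_handles(n_bams, n_threads) < soft_limit


if __name__ == "__main__":
    # Current per-process limits on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    for threads in (30, 1):
        need = estimated_handles(300, threads)
        print(f"-nt {threads}: ~{need} handles (soft limit {soft}, hard limit {hard})")
```

Under these assumptions, 30 threads need far more handles than a single thread, so reducing or dropping `-nt` is worth trying alongside a higher limit.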

Answers

  • adri_somavilla, Edinburgh, Member

    Thanks Sheila, I'll have a look!!

  • adri_somavilla, Edinburgh, Member

    @Sheila
    For information:
    I changed the open-file limit, but that didn't help, so I'm now running (version 3.5) without the -nt and -omitIntervals flags:

    java -jar GenomeAnalysisTK.jar -T DepthOfCoverage -R ref_gen -I samplesbam.list -L Chrom -o Chrom.DepthCov -omitSampleSummary -omitLocusTable

    It's working better than before, since it is writing the output now. Also, the estimated time isn't that bad for 263 WGS BAM files.
    This is the log for the smallest chromosome; I'll see what happens with the largest one:

    INFO 11:46:02,855 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 11:46:02,855 ProgressMeter -                   | processed |    time |  per 1M |           |   total | remaining
    INFO 11:46:02,856 ProgressMeter -          Location |     sites | elapsed |   sites | completed | runtime |   runtime
    INFO 11:46:32,861 ProgressMeter -  CM000112.4:26185     16384.0    30.0 s    30.5 m        0.2%     4.5 h       4.5 h
    INFO 11:47:02,866 ProgressMeter -  CM000112.4:59953     49152.0    60.0 s    20.3 m        0.4%     3.9 h       3.9 h
    INFO 11:47:32,872 ProgressMeter -  CM000112.4:98221     81920.0    90.0 s    18.3 m        0.7%     3.6 h       3.6 h
    INFO 11:48:02,875 ProgressMeter - CM000112.4:131073    131072.0   120.0 s    15.3 m        0.9%     3.6 h       3.6 h
    INFO 11:48:32,879 ProgressMeter - CM000112.4:173241    163840.0     2.5 m    15.3 m        1.2%     3.4 h       3.4 h
    INFO 11:49:02,882 ProgressMeter - CM000112.4:212409    196608.0     3.0 m    15.3 m        1.5%     3.3 h       3.3 h
    INFO 11:49:32,886 ProgressMeter - CM000112.4:245861    245760.0     3.5 m    14.2 m        1.7%     3.3 h       3.3 h
    INFO 11:50:02,911 ProgressMeter - CM000112.4:289529    278528.0     4.0 m    14.4 m        2.1%     3.2 h       3.2 h
    INFO 11:50:32,914 ProgressMeter - CM000112.4:327597    311296.0     4.5 m    14.5 m        2.3%     3.2 h       3.2 h
    INFO 11:51:02,920 ProgressMeter - CM000112.4:363649    360448.0     5.0 m    13.9 m        2.6%     3.2 h       3.1 h
    INFO 11:51:32,927 ProgressMeter - CM000112.4:400517    393216.0     5.5 m    14.0 m        2.8%     3.2 h       3.1 h
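
Once the per-chromosome output exists, the per-sample average no longer has to be computed by hand. The following is a minimal sketch, assuming the per-locus output is a tab-separated table whose first column is `Locus` in `contig:position` form and whose per-sample columns are named `Depth_for_<sample>`; check that layout against your own file before relying on it. The function name and the toy data are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def mean_depth_per_chromosome(handle):
    """Return {chromosome: {sample: mean depth}} from a per-locus table.

    Assumed layout: tab-separated, a 'Locus' column ('contig:position')
    and one 'Depth_for_<sample>' column per sample.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    prefix = "Depth_for_"
    sample_cols = [c for c in (reader.fieldnames or []) if c.startswith(prefix)]
    totals = defaultdict(lambda: defaultdict(int))  # chrom -> sample -> summed depth
    sites = defaultdict(int)                        # chrom -> number of loci seen
    for row in reader:
        chrom = row["Locus"].rsplit(":", 1)[0]
        sites[chrom] += 1
        for col in sample_cols:
            totals[chrom][col[len(prefix):]] += int(row[col])
    return {
        chrom: {sample: total / sites[chrom] for sample, total in per_sample.items()}
        for chrom, per_sample in totals.items()
    }


if __name__ == "__main__":
    # Made-up toy data, not real GATK output.
    toy = io.StringIO(
        "Locus\tTotal_Depth\tAverage_Depth_sample\tDepth_for_S1\tDepth_for_S2\n"
        "CM000112.4:1\t30\t15.00\t10\t20\n"
        "CM000112.4:2\t50\t25.00\t20\t30\n"
    )
    print(mean_depth_per_chromosome(toy))
```

The table is read line by line, so memory use depends on the number of samples and chromosomes rather than on the number of loci.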
