Input bam file for the CNN pipeline

Hi,

I want to test the CNN pipeline, and I saw that the "-bamout" option was added to the HaplotypeCaller (HC) step. To run the 2D model in the CNNScoreVariants tool, should the input BAM be the original (post-BQSR) BAM or the bamout BAM from the HC step?

Thanks

Answers

  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin
    edited February 15

    @manolis

    I believe this is correct: the post-BQSR reads are the input to HaplotypeCaller. For CNNScoreVariants, you can use either the BAM that was given to HaplotypeCaller or the bamout.
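
To make the two options concrete, here is a minimal dry-run sketch that only prints the two candidate CNNScoreVariants commands and does not run GATK. The file names (`sample.bqsr.bam`, `sample.hc.bamout.bam`, and so on) and the helper `build_cnn_cmd` are hypothetical placeholders, not part of the CNN pipeline; only the flags already used in this thread appear.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the CNNScoreVariants command instead of running it.
# All file names below are hypothetical placeholders.

build_cnn_cmd() {
    local bam="$1" ref="$2" vcf="$3" out="$4"
    local -a cmd=(
        gatk CNNScoreVariants
        -I "$bam"
        -R "$ref"
        -V "$vcf"
        -O "$out"
        --tensor-type read_tensor
    )
    # Join the array with single spaces and print it on one line.
    echo "${cmd[*]}"
}

# Option A: the post-BQSR BAM that was given to HaplotypeCaller.
cmd_a=$(build_cnn_cmd sample.bqsr.bam ref.fasta sample.hc.vcf.gz sample.cnn.vcf.gz)
# Option B: the BAM written by HaplotypeCaller's -bamout option.
cmd_b=$(build_cnn_cmd sample.hc.bamout.bam ref.fasta sample.hc.vcf.gz sample.cnn.vcf.gz)

echo "$cmd_a"
echo "$cmd_b"
```

The two commands differ only in the `-I` argument, so switching between the inputs is a one-argument change.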

  • manolis Member ✭✭
    edited February 16

    OK. Are there any criteria for choosing between the BAM used for HC and the bamout of HC as input to CNNScoreVariants? Those two BAM files usually differ ...

    Moreover, I get this error:

    /share/apps/bio/gatk-4.1.0.0/gatk --java-options "-Xmx20g -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -XX:ConcGCThreads=1 -XX:ParallelGCThreads=2" CNNScoreVariants \
    -I ${HC.bamout.bam} \
    -R ${hg38} \
    -V ${HC.vcf.gz} \
    -O ${CNN.vcf.gz} \
    -L ${intervals} \
    --inference-batch-size 8 \
    --transfer-batch-size 32 \
    --tensor-type read_tensor \
    --tmp-dir ${tmp} \
    --disable-avx-check
    
    org.broadinstitute.hellbender.exceptions.GATKException: Exception waiting for ack from Python: org.broadinstitute.hellbender.exceptions.GATKException: Expected message of length 3 but only found 0 bytes
        at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:233)
        at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:207)
        at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:174)
        at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:307)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:964)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
        at org.broadinstitute.hellbender.Main.main(Main.java:291)
    Caused by: java.util.concurrent.ExecutionException: org.broadinstitute.hellbender.exceptions.GATKException: Expected message of length 3 but only found 0 bytes
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:228)
        ... 10 more
    Caused by: org.broadinstitute.hellbender.exceptions.GATKException: Expected message of length 3 but only found 0 bytes
        at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.getBytesFromStream(StreamingProcessController.java:261)
        at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.lambda$waitForAck$0(StreamingProcessController.java:208)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    

    What does "Expected message of length 3 but only found 0 bytes" mean? What should I do?

    Many thanks!

  • manolis Member ✭✭

    Hi, I fixed the "Expected message of length 3 but only found 0 bytes" error.

    I would still appreciate any explanation of how to choose the BAM input for CNNScoreVariants.

    Many thanks!

  • manolis Member ✭✭

    Hi @samwell

    thanks a lot! I will use the bamout. Is the CNN WDL pipeline still valid as it is, or do you need to make some updates?

    All the best!

  • bshifaw moon Member, Broadie, Moderator admin

    samwell can correct me if I'm wrong, but there are no plans to update the CNN WDL pipeline anytime soon.

  • manolis Member ✭✭

    OK, I will keep an eye on the CNN WDL pipeline. Many thanks!

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    @manolis

    Would you please elaborate on how you fixed the "Expected message of length 3 but only found 0 bytes" error? This will help other users who face a similar error.

    Thank you in advance!

  • manolis Member ✭✭
    edited March 13

    Hi @bhanuGandham, in my case it was simple: I was testing CNNScoreVariants as an interactive job when I got the error reported above. I then submitted the same command as a batch job and everything worked.

    I suspect that the host used for interactive jobs is not fully updated and does not support some "gatk" options, whereas I am sure that the hosts that run the batch jobs are up to date. I guess that was the underlying problem...

    Unfortunately, I do not have more details or specific instructions; I was just lucky.

    Best!
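
Since the only observed difference was the host, one way to follow up on that suspicion is to print the same short environment summary on the interactive node and on a batch node, then compare the two. The sketch below assumes a Linux host with `/proc/cpuinfo`; the function names are hypothetical, and this thread does not confirm whether AVX support or the Python setup was actually the cause.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise parts of a host that a Python-backed
# GATK tool depends on, so two hosts can be compared line by line.

# Report whether the CPU flags in a cpuinfo-style file include AVX.
has_avx() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw avx "$cpuinfo" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}

# Print the first line of a tool's --version output,
# or "missing" if the tool is not on PATH.
tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --version 2>&1 | head -n 1
    else
        echo missing
    fi
}

host_summary() {
    echo "host:   $(uname -n)"
    echo "avx:    $(has_avx)"
    echo "python: $(tool_version python)"
}

host_summary
```

Saving the output on each host and running `diff` on the two files shows at a glance whether the nodes differ in any of these respects.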

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Thank you for the update @manolis
