
Get Error when using CreateReadCountPanelOfNormals in Calling Somatic Copy Number Variation

**Error information:**

Using GATK jar /home/yangyuan/Desktop/Tool/gatk-4.0.5.2/gatk-package-4.0.5.2-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6500m -jar /home/yangyuan/Desktop/Tool/gatk-4.0.5.2/gatk-package-4.0.5.2-local.jar CreateReadCountPanelOfNormals -I 1_19_0427_S18.counts.hdf5 -I 1_20_0427_S19.counts.hdf5 -I 1_21_0427_S20.counts.hdf5 -I 1_22_0427_S21.counts.hdf5 -I 1_23_0427_S22.counts.hdf5 -I 1_24_0427_S23.counts.hdf5 -I 1_25_0427_S24.counts.hdf5 -I 1_26_0427_S25.counts.hdf5 -I 1_50_0427_S48.counts.hdf5 -I 1_51_0427_S49.counts.hdf5 -I ......
......
18/07/29 19:46:57 INFO Executor: Running task 32.0 in stage 1.0 (TID 33)
18/07/29 19:46:57 INFO Executor: Running task 33.0 in stage 1.0 (TID 34)
18/07/29 19:46:57 INFO Executor: Running task 34.0 in stage 1.0 (TID 35)
18/07/29 19:46:57 INFO Executor: Running task 35.0 in stage 1.0 (TID 36)
18/07/29 19:46:57 INFO Executor: Running task 36.0 in stage 1.0 (TID 37)
18/07/29 19:46:57 INFO Executor: Running task 37.0 in stage 1.0 (TID 38)
18/07/29 19:46:57 INFO Executor: Running task 38.0 in stage 1.0 (TID 39)
18/07/29 19:46:57 INFO Executor: Running task 39.0 in stage 1.0 (TID 40)
Jul 29, 2018 7:46:57 PM com.github.fommil.jni.JniLoader liberalLoad
INFO: successfully loaded /tmp/yangyuan/jniloader7881962181404460704netlib-native_system-linux-x86_64.so
java: symbol lookup error: /tmp/yangyuan/jniloader7881962181404460704netlib-native_system-linux-x86_64.so: undefined symbol: cblas_dspr
(this last error line is printed many times, interleaved across threads)

Hi, when I run CreateReadCountPanelOfNormals in the Calling Somatic Copy Number Variation workflow, I get the error above.
When I searched for this error on Google, I found someone who had hit the same problem (https://gatkforums.broadinstitute.org/gatk/discussion/8810/something-about-create-pon-workflow).

However, their solution did not work for me. This is the solution given in that thread:

I met this problem too. It was running very well with one sample as input, but this bug appeared when I input multiple samples... BTW, my version is 4.0.3.0.
It seems related to Spark, and I just solved it:
1. Install libblas.so, liblapacke.so, and libopenblas.so (which I lacked).
2. Add it to the environment: export LD_PRELOAD=/path/to/libopenblas.so
Then everything works as expected.
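
To make step 2 concrete, here is a minimal sketch, assuming an OpenBLAS build is already installed somewhere on the system (the search paths and package name below are illustrative and will differ per distribution):

```shell
# Locate an installed OpenBLAS shared library (on Debian/Ubuntu it is
# provided by the libopenblas-dev package; paths vary by distribution).
BLAS_LIB=$(find /usr/lib /usr/lib64 -name 'libopenblas*.so*' 2>/dev/null | head -n 1)

# LD_PRELOAD must point at the .so file itself, NOT at its directory,
# so that the JVM's netlib JNI loader can resolve cblas_dspr from it.
if [ -n "$BLAS_LIB" ]; then
    export LD_PRELOAD="$BLAS_LIB"
    echo "Preloading: $LD_PRELOAD"
else
    echo "No OpenBLAS library found; install one first." >&2
fi
```

After this, run the GATK command in the same shell so the exported variable is inherited by the java process.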

The command I ran was:
gatk --java-options "-Xmx6500m" CreateReadCountPanelOfNormals \
-I 1_19_0427_S18.counts.hdf5 \
-I 1_20_0427_S19.counts.hdf5 \
-I 1_21_0427_S20.counts.hdf5 \
-I 1_22_0427_S21.counts.hdf5 \
-I 1_23_0427_S22.counts.hdf5 \
-I 1_24_0427_S23.counts.hdf5 \
-I 1_25_0427_S24.counts.hdf5 \
-I 1_26_0427_S25.counts.hdf5 \
-I 1_50_0427_S48.counts.hdf5 \
-I 1_51_0427_S49.counts.hdf5 \
-I 1_52_0427_S50.counts.hdf5 \
-I 1_53_0427_S51.counts.hdf5 \
-I 1_54_0427_S52.counts.hdf5 \
-I 1_55_0427_S53.counts.hdf5 \
-I 1_56_0427_S54.counts.hdf5 \
-I 1_57_0427_S55.counts.hdf5 \
-I 1_58_0427_S56.counts.hdf5 \
-I 1_59_0427_S57.counts.hdf5 \
--minimum-interval-median-percentile 55.0 \
-O cnvponC.pon.hdf5

Answers

  • MuyiyuanMuyiyuan Member

    Sorry, I forgot to give the GATK version; I am using GATK 4.0.5.0.

  • SheilaSheila Broad InstituteMember, Broadie admin

    @Muyiyuan
    Hi,

    Let me ask someone from the team to get back to you.

    -Sheila

  • sleeslee Member, Broadie, Dev ✭✭✭

    @Muyiyuan,

    Can you give us some more information about your runtime environment and how you installed the BLAS libraries? Are you sure you pointed to the appropriate path using the export LD_PRELOAD=... statement?

    Also, on an unrelated topic: looking at your command line, that is a rather extreme value of minimum-interval-median-percentile; you will be throwing away 55% of your bins. Just make sure that is indeed what you want to do!

    Thanks,
    Samuel

  • Hi all,
    I am getting exactly the same error when trying to make a panel of normals using CreateReadCountPanelOfNormals (the error screenshot I attached is transcribed as text in my comment below).

    I started with 5 samples, running this command for each one alone, and it was successful:
    gatk CollectReadCounts \
        -I /home/projects/BAM/Sample1.bam \
        -L /home/projects/targets_C.preprocessed.interval_list \
        --interval-merging-rule OVERLAPPING_ONLY \
        -O /home/projects/CNV_GATK/Sample1.counts.hdf5

    Then I used this command where I got the error:
    gatk CreateReadCountPanelOfNormals \
        -I /home/projects/CNV_GATK/Sample1.counts.hdf5 \
        -I /home/projects/CNV_GATK/Sample2.counts.hdf5 \
        -I /home/projects/CNV_GATK/Sample3.counts.hdf5 \
        -I /home/projects/CNV_GATK/Sample4.counts.hdf5 \
        -I /home/projects/CNV_GATK/Sample5.counts.hdf5 \
        --minimum-interval-median-percentile 5.0 \
        -O /home/projects/CNV_GATK/cnvponC.pon.hdf5

    I read this thread as well as another one posted previously on this forum (https://gatkforums.broadinstitute.org/gatk/discussion/8810/something-about-create-pon-workflow), and I really have no clue about the solution. I am working with GATK 4.1.0.0 installed on an HPC. I tried working with the GATK environment both activated and deactivated (just in case it has something to do with Python). I searched our HPC provider's libraries, and they mention that they have the required libraries: https://hpc.dtu.dk/?page_id=335

    I am really stuck here and would appreciate any suggestion/advice.

    Thanks in advance

  • NawarDalilaNawarDalila Member
    edited February 13

    Just in case the error screenshot was not clear, here it is as text:

    Feb 13, 2019 11:47:49 AM com.github.fommil.jni.JniLoader liberalLoad
    INFO: successfully loaded /tmp/jniloader7233171560396596450netlib-native_system-linux-x86_64.so
    java: symbol lookup error: /tmp/jniloader7233171560396596450netlib-native_system-linux-x86_64.so: undefined symbol: cblas_dspr
    (this error line is printed repeatedly, interleaved across threads)

  • AdelaideRAdelaideR Member admin

    Hello @NawarDalila

    This error still seems to be related to the suggestion provided by @slee: there is a conflict in the libraries on your VM. You can read more about this type of error here.

    Try working through the suggestions above and take a look at the Stack Overflow article to get an idea of how to set up the libraries on your VM and overcome this problem.

  • Thank you all very much, and especially @AdelaideR and @slee.

    Following all the discussions and tips again, I found that my problem could indeed be solved by this command:

    export LD_PRELOAD=/services/tools/openblas/0.2.20/lib64/libopenblas.so:/services/tools/lapack/3.8.0/lib64/libblas.so

    I hope it might be helpful for others (non-programmers) to know that one MUST refer to the library file itself in the path and NOT to its directory.
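
    For other readers, Nawar's point can be checked with a small sketch like the following (the helper name check_ld_preload is made up, and the example path is hypothetical):

```shell
# Verify that every ':'-separated entry in an LD_PRELOAD-style string
# is an existing regular file, not a directory.
check_ld_preload() {
    status=0
    old_ifs=$IFS
    IFS=':'
    for entry in $1; do
        if [ -f "$entry" ]; then
            echo "OK: $entry"
        else
            echo "BAD: $entry is not a regular file" >&2
            status=1
        fi
    done
    IFS=$old_ifs
    return $status
}

# Example with a hypothetical path:
# check_ld_preload "/services/tools/openblas/0.2.20/lib64/libopenblas.so"
```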

    Best regards
    Nawar

  • sleeslee Member, Broadie, Dev ✭✭✭

    @NawarDalila glad you were able to resolve the issue and thanks for sharing your findings!

  • jejacobs23jejacobs23 Portland, ORMember

    Hello. I am having the same issue with CreateReadCountPanelOfNormals on GATK 4.1.3.0. This is the command I enter:

    COMMON_DIR="/home/exacloud/lustre1/jjacobs"
    GATK=$COMMON_DIR"/programs/gatk-4.1.3.0"
    
    INPUT_DIR=$COMMON_DIR"/data/osteo"
    OUTPUT_DIR=$COMMON_DIR"/data/osteo"
    
    srun $GATK/gatk --java-options "-Xmx6500m" CreateReadCountPanelOfNormals \
        -I $COMMON_DIR/data/osteo/SJOS001101_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001105_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001108_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001109_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001111_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001120_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001124_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001125_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001126_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS001128_G1/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS002_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS004_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS005_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS008_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS012_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS013_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS015_G/sample.counts.hdf5 \
        -I $COMMON_DIR/data/osteo/SJOS019_G/sample.counts.hdf5 \
        --minimum-interval-median-percentile 5.0 \
        -O $OUTPUT_DIR/cnvponC_F.pon.hdf5
    

    I get the following error:

    java: symbol lookup error: /tmp/jniloader1283220320052883201netlib-native_system-linux-x86_64.so: undefined symbol: cblas_dspr

    I'm afraid I don't understand the solution and the discussion above. Is there a program that I need to have in my cluster-computing environment that is missing? Can I simply copy and paste @NawarDalila's solution into my GATK command? And if so, what would that line of code actually do?

  • NawarDalilaNawarDalila Member

    @jejacobs23 I think your error is different from the one I got previously, because, as you can see above, I had the message

    INFO: successfully loaded /tmp/jniloader7233171560396596450netlib-native_system-linux-x86_64.so

    But to my limited understanding of the Linux system, its libraries, and working with HPC, I think it has something to do with loading the correct Java before running the tools. Maybe this hint will help you find the solution.

  • jejacobs23jejacobs23 Portland, ORMember

    @NawarDalila, the error looks to be the same. The entire stderr output is quite long, but here are the last few lines:

    19/10/09 09:49:10 INFO SparkContext: Starting job: first at RowMatrix.scala:61
    19/10/09 09:49:10 INFO DAGScheduler: Got job 0 (first at RowMatrix.scala:61) with 1 output partitions
    19/10/09 09:49:10 INFO DAGScheduler: Final stage: ResultStage 0 (first at RowMatrix.scala:61)
    19/10/09 09:49:10 INFO DAGScheduler: Parents of final stage: List()
    19/10/09 09:49:10 INFO DAGScheduler: Missing parents: List()
    19/10/09 09:49:10 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at SparkConverter.java:47), which has no missing parents
    19/10/09 09:49:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1704.0 B, free 3.5 GB)
    19/10/09 09:49:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1067.0 B, free 3.5 GB)
    19/10/09 09:49:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on exanode-3-4.local:34527 (size: 1067.0 B, free: 3.5 GB)
    19/10/09 09:49:13 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161
    19/10/09 09:49:13 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at SparkConverter.java:47) (first 15 tasks are for partitions Vector(0))
    19/10/09 09:49:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
    19/10/09 09:49:13 WARN TaskSetManager: Stage 0 contains a task of very large size (4168 KB). The maximum recommended task size is 100 KB.
    19/10/09 09:49:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 4268519 bytes)
    19/10/09 09:49:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    19/10/09 09:49:14 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 860 bytes result sent to driver
    19/10/09 09:49:14 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 763 ms on localhost (executor driver) (1/1)
    19/10/09 09:49:14 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
    19/10/09 09:49:14 INFO DAGScheduler: ResultStage 0 (first at RowMatrix.scala:61) finished in 4.078 s
    19/10/09 09:49:14 INFO DAGScheduler: Job 0 finished: first at RowMatrix.scala:61, took 4.267622 s
    19/10/09 09:49:17 INFO SparkContext: Starting job: treeAggregate at RowMatrix.scala:122
    19/10/09 09:49:17 INFO DAGScheduler: Registering RDD 2 (treeAggregate at RowMatrix.scala:122)
    19/10/09 09:49:17 INFO DAGScheduler: Got job 1 (treeAggregate at RowMatrix.scala:122) with 10 output partitions
    19/10/09 09:49:17 INFO DAGScheduler: Final stage: ResultStage 2 (treeAggregate at RowMatrix.scala:122)
    19/10/09 09:49:17 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
    19/10/09 09:49:17 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 1)
    19/10/09 09:49:17 INFO DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[2] at treeAggregate at RowMatrix.scala:122), which has no missing parents
    19/10/09 09:49:17 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 5.4 KB, free 3.5 GB)
    19/10/09 09:49:17 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.2 KB, free 3.5 GB)
    19/10/09 09:49:17 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on exanode-3-4.local:34527 (size: 3.2 KB, free: 3.5 GB)
    19/10/09 09:49:17 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
    19/10/09 09:49:17 INFO DAGScheduler: Submitting 100 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[2] at treeAggregate at RowMatrix.scala:122) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1$
    19/10/09 09:49:17 INFO TaskSchedulerImpl: Adding task set 1.0 with 100 tasks
    19/10/09 09:49:17 WARN TaskSetManager: Stage 1 contains a task of very large size (4168 KB). The maximum recommended task size is 100 KB.
    19/10/09 09:49:17 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, executor driver, partition 0, PROCESS_LOCAL, 4268508 bytes)
    19/10/09 09:49:17 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
    Oct 09, 2019 9:49:22 AM com.github.fommil.jni.JniLoader liberalLoad
    INFO: successfully loaded /tmp/jniloader1283220320052883201netlib-native_system-linux-x86_64.so
    java: symbol lookup error: /tmp/jniloader1283220320052883201netlib-native_system-linux-x86_64.so: undefined symbol: cblas_dspr
    srun: error: exanode-3-4: task 0: Exited with exit code 127
    
  • sleeslee Member, Broadie, Dev ✭✭✭

    @jejacobs23 The issue is that the Spark MLlib package the tool uses to perform SVD relies on a native linear-algebra library (BLAS), and this native library is loaded by the Java com.github.fommil.jni.JniLoader package. So you need to make sure a suitable BLAS package is 1) installed and 2) linked in your environment by running the appropriate export LD_PRELOAD=... statement before running the tool. See https://stackoverflow.com/questions/38133885/symbol-lookup-error-with-netlib-java and the links there (which include a link from a previous response above) for more information.
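
    One way to check whether a candidate library actually exports the missing symbol is a sketch like this (using nm from binutils; the helper name and example path are hypothetical):

```shell
# Return success if the given shared library exports the dynamic
# symbol cblas_dspr that netlib-java fails to resolve.
has_cblas_dspr() {
    nm -D "$1" 2>/dev/null | grep -q ' cblas_dspr$'
}

# Example with a hypothetical path:
# if has_cblas_dspr /usr/lib64/libopenblas.so; then
#     export LD_PRELOAD=/usr/lib64/libopenblas.so
# fi
```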
