Description and examples of the steps in the CNV case and CNV PoN creation workflows

LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭
edited March 2017 in GATK 4 Beta

The CNV case and PoN workflows (description and examples) for earlier releases of GATK4.

For a newer tutorial using GATK4's v1.0.0.0-alpha1.2.3 release (Version:0288cff-SNAPSHOT from September 2016), see Article#9143 and this data bundle. If you have a question on the Somatic_CNV_handson tutorial, please post it as a new question using this form.


Requirements

  1. Java 1.8
  2. A functioning GATK4-protected jar (hellbender-protected.jar or gatk-protected.jar)
  3. HDF5 1.8.13
  4. The location of the HDF5-Java JNI Libraries Release 2.9 (2.11 for Macs).
    Typical locations:
    Ubuntu: /usr/lib/jni/
    Mac: /Applications/HDFView.app/Contents/Resources/lib/
    Broad internal servers: /broad/software/free/Linux/redhat_6_x86_64/pkgs/hdfview_2.9/HDFView/lib/linux/
  5. Reference genome (fasta files) with fai and dict files. This can be downloaded as part of the GATK resource bundle: http://www.broadinstitute.org/gatk/guide/article?id=1213
  6. PoN file (when running case samples only). This file should be created using the Create PoN workflow (see below).
  7. Target BED file that was used to create the PoN file. Format details can be found here. NOTE: For the CNV tools, you will need a fourth column for target name, which must be unique across rows (a sketch for adding this column follows the example rows below).
1       12200   12275   target1
1       13505   13600   target2
1       31000   31500   target3
1       35138   35174   target4
....snip....
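If your capture file has only the three standard BED columns, one way to add the required unique name column is sketched below. This is only an illustration: the input and output file names are placeholders, and newer codebases provide the ConvertBedToTargetFile tool for this purpose (see the comments below).
# Append a unique name (target1, target2, ...) as a fourth, tab-separated column to a 3-column BED.
awk 'BEGIN{OFS="\t"} {print $1, $2, $3, "target"NR}' initial_3col.bed > initial_target_file.bed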

Before running the workflows, we recommend padding the target file by 250 bases with the PadTargets tool. Example: java -jar gatk-protected.jar PadTargets --targets initial_target_file.bed --output initial_target_file.padded.bed --padding 250
This allows some off-target reads to be factored into the copy ratio estimates. Our internal evaluations have shown that this improves results.
If you are using the premade Queue scripts (see below), you can specify the padding there and the workflow will generate the padded targets automatically (i.e. there is no reason to run PadTargets explicitly if you are using the premade Queue scripts).

Case sample workflow

This workflow requires a PoN file generated by the Create PoN workflow.

If you do not have a PoN, please skip to the Create PoN workflow, below ....

Overview of steps
  • Step 0. (recommended) Pad Targets (see example above)
  • Step 1. Collect proportional coverage
  • Step 2. Create coverage profile
  • Step 3. Segment coverage profile
  • Step 4. Plot coverage profile
  • Step 5. Call segments
Step 1. Collect proportional coverage
Inputs
  • bam file
  • target bed file -- must be the same that was used for the PoN
  • reference_sequence (required by GATK) -- fasta file with b37 reference.
Outputs
  • Proportional coverage tsv file -- Mx5 matrix of proportional coverage, where M is the number of targets. The fifth column will be named for the sample in the bam file (found in the bam file SM tag). If the file exists, it will be overwritten.
##fileFormat  = tsv
##commandLine = org.broadinstitute.hellbender.tools.exome.ExomeReadCounts  ...snip...
##title       = Read counts per target and sample
CONTIG  START   END     NAME    SAMPLE1
1       12200   12275   target1    1.150e-05
1       13505   13600   target2    1.500e-05
1       31000   31500   target3    7.000e-05
....snip....
Invocation
 java -Xmx8g -jar <path_to_hellbender_protected_jar> CalculateTargetCoverage -I <input_bam_file> -O <pcov_output_file_path>  --targets <target_BED> -R <ref_genome> \ 
       -transform PCOV --targetInformationColumns FULL -groupBy SAMPLE -keepdups
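As a concrete illustration (all file names below are placeholders, not part of the workflow itself):
 java -Xmx8g -jar gatk-protected.jar CalculateTargetCoverage -I SAMPLE1.bam -O SAMPLE1.pcov.tsv \
       --targets targets.padded.bed -R human_g1k_v37.fasta \
       -transform PCOV --targetInformationColumns FULL -groupBy SAMPLE -keepdups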
Step 2. Create coverage profile
Inputs
  • proportional coverage file from Step 1
  • target BED file -- must be the same that was used for the PoN
  • PoN file
  • directory containing the HDF5 JNI native libraries
Outputs
  • normalized coverage file (tsv) -- details each target with chromosome, start, end, and log copy ratio estimate
#fileFormat = tsv
#commandLine = ....snip....
#title = ....snip....
name    contig  start   stop    SAMPLE1
target1    1       12200   12275   -0.5958351605220968
target2    1       13505   13600   -0.2855054918109098
target3    1       31000   31500   -0.11450116047248263
....snip....
  • pre-tangent-normalization coverage file (tsv) -- same format as the normalized coverage file above, but the copy ratio estimates are taken before the noise reduction (tangent normalization) step.
  • fnt file (tsv) -- proportional coverage divided by the target factors contained in the PoN. The file format is the same as the proportional coverage in step 1.
  • betaHats (tsv) -- typically used by developers and evaluators, but an output location must be specified. These are the coefficients used in the projection of the case sample into the (reduced) PoN. This will be an Mx1 matrix, where M is the number of targets.
Invocation
java -Djava.library.path=<hdf_jni_native_dir> -Xmx8g -jar <path_to_hellbender_protected_jar> NormalizeSomaticReadCounts -I <pcov_input_file_path> -T <target_BED> -pon <pon_file> \
 -O <output_target_cr_file> -FNO <output_target_fnt_file> -BHO <output_beta_hats_file> -PTNO <output_pre_tangent_normalization_cr_file>
Step 3. Segment coverage profile
Inputs
  • normalized coverage file (tsv) -- from step 2.
  • sample name
Outputs
  • seg file (tsv) -- segment file (tsv) detailing contig, start, end, and copy ratio (segment_mean) for each detected segment. Note that this is a different format than python recapseg, since the segment mean no longer has log2 applied.
Sample  Chromosome      Start   End     Num_Probes      Segment_Mean
SAMPLE1        1       12200   70000   18       0.841235
SAMPLE1        1       300600  1630000 337     1.23232323
....snip....
Invocation
java -Xmx8g -jar <path_to_hellbender_protected_jar>  PerformSegmentation  -S <sample_name> -T <normalized_coverage_file> -O <output_seg_file> -log
Step 4. Plot coverage profile
Inputs
  • normalized coverage file (tsv) -- from step 2.
  • pre-tangent-normalization (pre-normalized) coverage file (tsv) -- from step 2.
  • segmented coverage file (seg) -- from step 3.
  • sample name, see above
Outputs
  • beforeAfterTangentLimPlot (png) -- Output before/after tangent normalization plot up to copy-ratio 4
  • beforeAfterTangentPlot (png) -- Output before/after tangent normalization plot
  • fullGenomePlot (png) -- Full genome plot after tangent normalization
  • preQc (txt) -- Median absolute differences of targets before normalization
  • postQc (txt) -- Median absolute differences of targets after normalization
  • dQc (txt) -- Difference in median absolute differences of targets before and after normalization
Invocation
java -Xmx8g -jar <path_to_hellbender_protected_jar>  PlotSegmentedCopyRatio  -S <sample_name> -T <normalized_coverage_file> -P <pre_normalized_coverage_file> -seg <segmented_coverage_file> -O <output_seg_file> -log
Step 5. Call segments
Inputs
  • normalized coverage file (tsv) -- from step 2.
  • seg file (tsv) -- from step 3.
  • sample name
Outputs
  • called file (tsv) -- output is exactly the same as the seg file (step 3), except that a Segment_Call column is added. Calls are either "+", "0", or "-" (no quotes).
Sample  Chromosome      Start   End     Num_Probes      Segment_Mean      Segment_Call
SAMPLE1        1       12200   70000   18       0.841235      -
SAMPLE1        1       300600  1630000 337     1.23232323     0 
....snip....
Invocation
java -Xmx8g -jar <path_to_hellbender_protected_jar> CallSegments -T <normalized_coverage_file> -S <seg_file> -O <output_called_seg_file> -sample <sample_name> 
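For convenience, here is a minimal shell sketch that chains Steps 1-5 for a single case sample. It simply strings together the invocation templates above; all file names, the sample name, and the HDF5 JNI directory are illustrative placeholders and should be replaced with your own paths.
# Placeholders throughout: adjust paths, the sample name, and -Djava.library.path to your environment.
GATK_JAR=gatk-protected.jar
JNI_DIR=/usr/lib/jni/            # HDF5-Java JNI libraries (see requirement 4)
SAMPLE=SAMPLE1

# Step 1: collect proportional coverage
java -Xmx8g -jar $GATK_JAR CalculateTargetCoverage -I ${SAMPLE}.bam -O ${SAMPLE}.pcov.tsv \
     --targets targets.padded.bed -R ref.fasta \
     -transform PCOV --targetInformationColumns FULL -groupBy SAMPLE -keepdups

# Step 2: create coverage profile (tangent normalization against the PoN)
java -Djava.library.path=$JNI_DIR -Xmx8g -jar $GATK_JAR NormalizeSomaticReadCounts \
     -I ${SAMPLE}.pcov.tsv -T targets.padded.bed -pon cnv.pon \
     -O ${SAMPLE}.tn.tsv -FNO ${SAMPLE}.fnt.tsv -BHO ${SAMPLE}.betahats.tsv -PTNO ${SAMPLE}.ptn.tsv

# Step 3: segment the coverage profile
java -Xmx8g -jar $GATK_JAR PerformSegmentation -S $SAMPLE -T ${SAMPLE}.tn.tsv -O ${SAMPLE}.seg -log

# Step 4: plots and QC (output location per the template above; adjust to where the plot files should go)
java -Xmx8g -jar $GATK_JAR PlotSegmentedCopyRatio -S $SAMPLE -T ${SAMPLE}.tn.tsv \
     -P ${SAMPLE}.ptn.tsv -seg ${SAMPLE}.seg -O ${SAMPLE}_plots -log

# Step 5: call segments
java -Xmx8g -jar $GATK_JAR CallSegments -T ${SAMPLE}.tn.tsv -S ${SAMPLE}.seg \
     -O ${SAMPLE}.called.seg -sample $SAMPLE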

Create PoN workflow

This workflow can take some time to run depending on how many samples are going into your PoN and the number of targets you are covering. Basic time estimates are found in the Overview of Steps.

Additional requirements
  • Normal sample bam files to be used in the PoN. The index files (.bai) must reside alongside their associated bam files.

Overview of steps
  • Step 1. Collect proportional coverage. (~20 minutes for mean 150x coverage and 150k targets, per sample)
  • Step 2. Combine proportional coverage files (< 5 minutes for 150k targets and 300 samples)
  • Step 3. Create the PoN file (~1.75 hours for 150k targets and 300 samples)

All time estimates are using the internal Broad infrastructure.

Step 1. Collect proportional coverage on each bam file

This is exactly the same as the case sample workflow, except that this needs to be run once for each input bam file, each with a different output file name. Otherwise, the inputs should be the same for each bam file.

Please see documentation above.

IMPORTANT NOTE: You must create a list of the proportional coverage files (i.e. the output files) produced in this step, with one output file path per line in a text file (see step 2).
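A minimal sketch of one way to do this is shown below; all paths, the jar name, and the list file name are placeholders for illustration.
# For each normal bam, collect proportional coverage (Step 1) and record the
# output path in pcov_files.txt, one path per line, for use in Step 2.
> pcov_files.txt
for bam in /path/to/normals/*.bam; do
    out="$(basename "${bam%.bam}").pcov.tsv"
    java -Xmx8g -jar gatk-protected.jar CalculateTargetCoverage -I "$bam" -O "$out" \
         --targets targets.padded.bed -R ref.fasta \
         -transform PCOV --targetInformationColumns FULL -groupBy SAMPLE -keepdups
    echo "$out" >> pcov_files.txt
done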

Step 2. Merge proportional coverage files

This step merges the proportional coverage files into one large file with a separate column for each sample.

Inputs
  • list of proportional coverage files generated (possibly manually) in step 1. This is a text file.
/path/to/pcov_file1.txt
/path/to/pcov_file2.txt
/path/to/pcov_file3.txt
....snip....
Outputs
  • merged tsv of proportional coverage
CONTIG  START   END     NAME    SAMPLE1    SAMPLE2 SAMPLE3 ....snip....
1       12191   12227   target1    8.835E-6  1.451E-5     1.221E-5    ....snip....
1       12596   12721   target2    1.602E-5  1.534E-5     1.318E-5   ....snip....
....snip....
Invocation
java -Xmx8g -jar  <path_to_hellbender_protected_jar> CombineReadCounts --inputList <text_file_list_of_proportional_coverage_files> \
    -O <output_merged_file> -MOF 200 
Step 3. Create the PoN file
Inputs
  • merged tsv of proportional coverage -- generated in step 2.
Outputs
  • PoN file -- HDF5 format. This file can be used for running case samples sequenced with the same process.
Invocation
java -Xmx16g -Djava.library.path=<hdf_jni_native_dir> -jar <path_to_hellbender_protected_jar> CreatePanelOfNormals -I <merged_pcov_file> \
       -O <output_pon_file_full_path>
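As a concrete illustration on an Ubuntu machine, assuming the typical JNI location listed in requirement 4 and placeholder file names:
java -Xmx16g -Djava.library.path=/usr/lib/jni/ -jar gatk-protected.jar CreatePanelOfNormals \
       -I merged_pcov.tsv -O cnv.pon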
Post edited by shlee.

Comments

  • noa · Boston area · Member

    Will you support the CNV workflow while it's part of GATK4 alpha? thanks

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie, admin

    Hi @noa, the short answer is yes. The longer answer is that right now we're still figuring out how to tackle supporting GATK4 tools in parallel to GATK3, but our support team is gearing up to start supporting the CNV tools very soon -- so don't hesitate to get started with the tools, and let us know if you run into any problems. We will do our best to help you apply the CNV tools to your work.

  • sdubayan · Boston · Member, Broadie

    Can the CNV workflow be used for germline WES data?

  • LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭

    @sdubayan Not this workflow. We will be releasing a GATK4 Germline CNV Capture (WES) workflow soon. We are aiming for the end of July or end of August 2016, though this is still tentative.

  • Haiying7 · Heidelberg, Germany · Member

    How do I find item 4, the location of the HDF5-Java JNI Libraries Release 2.9 (2.11 for Macs) ("see platform instructions above for typical locations")?
    Could you please give me more detailed information?

  • LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭

    @Haiying7 I added the typical locations. However, these may not apply to your environment.

  • Haiying7 · Heidelberg, Germany · Member

    Where did you add the typical locations? How can I find the right location for my environment?

  • LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭

    @Haiying7 I edited the post for #4

    You can try locate libjhdf5.so:

    # On Ubuntu:
    $ locate libjhdf5.so
    /usr/lib/jni/libjhdf5.so
    
  • Haiying7 · Heidelberg, Germany · Member

    I do not get any output:

    [[email protected] lib]$ locate libjhdf5.so
    [[email protected] lib]$ uname -a
    Linux hpc85 2.6.18-371.9.1.el5 #1 SMP Tue Jun 10 17:49:56 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

    do I need root?

  • wagnerk · Member

    Using your target file with the PadTargets tool, I get an error message: what is the missing header for a BED file?

    [[email protected] targets]$ java -jar ~/bin/hellbender-protected.jar PadTargets --targets test.bed --output targets/test_250padded.bed --padding 250
    [07. Juli 2016 09:39:34 MESZ] org.broadinstitute.hellbender.tools.exome.PadTargets --targets test.bed --output targets/test_250padded.bed --padding 250 --help false --version false --verbosity INFO --QUIET false
    [07. Juli 2016 09:39:34 MESZ] Executing as [email protected] at on Linux 4.4.9-300.fc23.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_92-b14; Version: Version:351addc-SNAPSHOT
    09:39:34.292 INFO PadTargets - Defaults.BUFFER_SIZE : 131072
    09:39:34.292 INFO PadTargets - Defaults.COMPRESSION_LEVEL : 5
    09:39:34.292 INFO PadTargets - Defaults.CREATE_INDEX : false
    09:39:34.292 INFO PadTargets - Defaults.CREATE_MD5 : false
    09:39:34.292 INFO PadTargets - Defaults.CUSTOM_READER_FACTORY :
    09:39:34.292 INFO PadTargets - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    09:39:34.293 INFO PadTargets - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    09:39:34.293 INFO PadTargets - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    09:39:34.293 INFO PadTargets - Defaults.REFERENCE_FASTA : null
    09:39:34.293 INFO PadTargets - Defaults.TRY_USE_INTEL_DEFLATER : true
    09:39:34.293 INFO PadTargets - Defaults.USE_ASYNC_IO : false
    09:39:34.293 INFO PadTargets - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    09:39:34.293 INFO PadTargets - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    09:39:34.293 INFO PadTargets - Defaults.USE_CRAM_REF_DOWNLOAD : false
    09:39:34.294 INFO PadTargets - Deflater JdkDeflater
    09:39:34.294 INFO PadTargets - Initializing engine
    09:39:34.294 INFO PadTargets - Done initializing engine
    09:39:34.307 INFO TargetTableReader - Reading targets from file '/home/klaus/CNV/targets/test.bed' ...
    09:39:34.325 INFO PadTargets - Shutting down engine
    [07. Juli 2016 09:39:34 MESZ] org.broadinstitute.hellbender.tools.exome.PadTargets done. Elapsed time: 0,00 minutes.
    Runtime.totalMemory()=246415360


    A USER ERROR has occurred: Bad input: format error in 'test.bed' at line 1: Bad header in file. Not all mandatory columns are present. Missing: 1 12200 12275 target1

  • LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭

    @Haiying7 Has anyone installed hdfview?

  • LeeTL1220 · Arlington, MA · Member, Broadie, Dev ✭✭

    @wagnerk I am assuming you are using the codebase in master, not the latest release (1.0.0.0-alpha1.2.1). If I am correct, just run the tool ConvertBedToTargetFile and use the output of that to go into PadTargets.

  • wagnerk · Member

    @LeeTL1220 said:
    @wagnerk I am assuming you are using the codebase in master, not the latest release (1.0.0.0-alpha1.2.1). If I am correct, just run the tool ConvertBedToTargetFile and use the output of that to go into PadTargets.

    ConvertBedToTargetFile worked and the short test file was padded successfully with 250 bp.

    When I tried my bed file I got an error:
    12:31:29.271 INFO PadTargets - Done initializing engine
    12:31:29.283 INFO TargetTableReader - Reading targets from file '/home/user/CNV/targets/hg38_TruSight_One_v1.1.bed.target' ...
    12:31:29.692 INFO PadTargets - Shutting down engine
    [08. Juli 2016 12:31:29 MESZ] org.broadinstitute.hellbender.tools.exome.PadTargets done. Elapsed time: 0,01 minutes.
    Runtime.totalMemory()=253231104
    java.lang.IllegalArgumentException: Invalid interval. Contig:chr1 start:146019875 end:146019703
    at org.broadinstitute.hellbender.utils.SimpleInterval.validatePositions(SimpleInterval.java:61)
    at org.broadinstitute.hellbender.utils.SimpleInterval.(SimpleInterval.java:36)
    at org.broadinstitute.hellbender.tools.exome.TargetPadder.padTargets(TargetPadder.java:44)
    at org.broadinstitute.hellbender.tools.exome.PadTargets.doWork(PadTargets.java:60)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:70)
    at org.broadinstitute.hellbender.Main.main(Main.java:85)

    When I look into my file at the position flagged by the IllegalArgumentException (invalid interval), the start position is calculated correctly but the stop position is wrong (first line):
    chr1 146020125 146020242 HFE2.chr1.145414781.145414878
    chr1 146019165 146019745 HFE2.chr1.145415278.145415838
    chr1 146018067 146018711 HFE2.chr1.145416312.145416936

  • Sheila · Broad Institute · Member, Broadie, Moderator, admin

    @wagnerk
    Hi,

    Can you please submit some test files so we can debug locally? Instructions are here.

    Thanks,
    Sheila

  • wagnerk · Member

    @Sheila said:
    @wagnerk
    Hi,

    Can you please submit some test files so we can debug locally? Instructions are here.

    Thanks,
    Sheila

    Hi Sheila,

    I have uploaded the file hg38_TruSight_One_v1.1.bed.target to your server.

    Klaus

    Issue · GitHub (created by Sheila): #1058, state closed, closed by chandrans
  • Haiying7 · Heidelberg, Germany · Member

    @LeeTL1220 said:
    @Haiying7 Has anyone installed hdfview?

    I downloaded hdfview and ran it again. I suppose hdfview is properly installed and I am giving the correct directory, but I got an error message, and I don't understand what it means. Could you please help me with this?

    the files under the directory are:
    [[email protected] lib]$ pwd
    /home/kong/Haiying/lib/hdf-java-2.9/hdfview/HDFView/lib
    [[email protected] lib]$ ls
    ext fits.jar jhdf4obj.jar jhdf5.jar jhdf5obj.jar jhdf.jar jhdfobj.jar jhdfview.jar junit.jar linux netcdf.jar

    I still cannot get output:
    [[email protected] lib]$ locate libjhdf5.so
    [[email protected] lib]$

    [[email protected] temp]$ java -Xmx16g -Djava.library.path=/home/kong/Haiying/lib/hdf-java-2.9/hdfview/HDFView/lib -jar /home/kong/Haiying/lib/GATK4/gatk-protected.jar CreatePanelOfNormals -I merged.txt -O PoN.txt
    [July 10, 2016 6:15:48 PM CEST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals --input merged.txt --output PoN.txt --minimumTargetFactorPercentileThreshold 25.0 --maximumColumnZerosPercentage 2.0 --maximumTargetZerosPercentage 5.0 --extremeColumnMedianCountPercentileThreshold 2.5 --truncatePercentileThreshold 0.1 --numberOfEigenSamples auto --noQC false --dryRun false --disableSpark false --sparkMaster local[*] --help false --version false --verbosity INFO --QUIET false
    [July 10, 2016 6:15:48 PM CEST] Executing as [email protected] on Linux 2.6.18-371.9.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27; Version: Version:version-unknown-SNAPSHOT
    18:15:48.996 INFO CreatePanelOfNormals - Defaults.BUFFER_SIZE : 131072
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.COMPRESSION_LEVEL : 5
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.CREATE_INDEX : false
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.CREATE_MD5 : false
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.CUSTOM_READER_FACTORY :
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    18:15:48.998 INFO CreatePanelOfNormals - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.REFERENCE_FASTA : null
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.TRY_USE_INTEL_DEFLATER : true
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO : false
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    18:15:48.999 INFO CreatePanelOfNormals - Defaults.USE_CRAM_REF_DOWNLOAD : false
    18:15:49.007 INFO CreatePanelOfNormals - Deflater JdkDeflater
    18:15:49.007 INFO CreatePanelOfNormals - Initializing engine
    18:15:49.007 INFO CreatePanelOfNormals - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/07/10 18:15:49 INFO SparkContext: Running Spark version 1.5.0
    16/07/10 18:15:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/07/10 18:15:50 INFO SecurityManager: Changing view acls to: kong
    16/07/10 18:15:50 INFO SecurityManager: Changing modify acls to: kong
    16/07/10 18:15:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(kong); users with modify permissions: Set(kong)
    16/07/10 18:15:51 INFO Slf4jLogger: Slf4jLogger started
    16/07/10 18:15:51 INFO Remoting: Starting remoting
    16/07/10 18:15:51 ERROR NettyTransport: failed to bind to /193.174.53.104:0, shutting down Netty transport
    16/07/10 18:15:51 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
    16/07/10 18:15:51 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    16/07/10 18:15:51 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    16/07/10 18:15:51 ERROR Remoting: Remoting system has been terminated abrubtly. Attempting to shut down transports
    16/07/10 18:15:51 INFO Slf4jLogger: Slf4jLogger started
    16/07/10 18:15:51 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
    16/07/10 18:15:51 INFO Remoting: Starting remoting
    16/07/10 18:15:51 ERROR NettyTransport: failed to bind to /193.174.53.104:0, shutting down Netty transport
    ........

  • fanghu0104 · China · Member

    If I have paired tumor samples (normal and tumor), would I still need to create a PoN file? And how would I handle this?

  • fanghu0104 · China · Member

    @Haiying7 said:
    ....snip.... (full CreatePanelOfNormals command and NettyTransport error log quoted from the post above)

    I wonder:
    1. Are you missing the "/" at the end of "-Djava.library.path=/home/kong/Haiying/lib/hdf-java-2.9/hdfview/HDFView/lib"?
    2. Does the merged.txt file need to contain more than 2 samples?

  • fanghu0104 · China · Member

    Hi @LeeTL1220
    When I ran Step 3 (Segment coverage profile), it always fails with the error below; do you know what happened?

    Command Line: Rscript -e tempLibDir = '/tmp/fanghu/Rlib.8000983193739407403';source('/tmp/fanghu/CBS.5651025200789749973.R'); --args --sample_name=J1-A --targets_file=/Step2/normalized_coverage.tsv --output_file=/Step3/segment.tsv --log2_input=TRUE --min_width=2 --alpha=0.01 --nperm=10000 --pmethod=hybrid --kmax=25 --nmin=200 --eta=0.05 --trim=0.025 --undosplits=none --undoprune=0.05 --undoSD=3
    Stdout:
    Stderr: Error in getopt(spec = spec, opt = args) : long flag "args" is invalid
    Calls: source ... withVisible -> eval -> eval -> parse_args -> getopt
    Execution halted

  • Haiying7 · Heidelberg, Germany · Member

    I still get an error. I think hdfview is not correctly installed. The software is referred to by different names in different places, which is very confusing. Can anyone please tell me the correct website and the correct file to download for the hdfview installation?

    The error I got is:
    [[email protected] temp]$ java -Xmx16g -Djava.library.path=/home/kong/Haiying/lib/hdf-java-2.9/hdfview/HDFView/lib -jar $GATK CreatePanelOfNormals \

     -I ${GATK_CNV_dir}Merged.tsv -O ${GATK_CNV_dir}PoN.tsv
    

    [July 15, 2016 2:21:02 PM CEST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals --input /home/kong/Haiying/Projects/Nevi/ILSE1059_C9/Lock/CNV/GATK_CNV/Merged.tsv --output /home/kong/Haiying/Projects/Nevi/ILSE1059_C9/Lock/CNV/GATK_CNV/PoN.tsv --minimumTargetFactorPercentileThreshold 25.0 --maximumColumnZerosPercentage 2.0 --maximumTargetZerosPercentage 5.0 --extremeColumnMedianCountPercentileThreshold 2.5 --truncatePercentileThreshold 0.1 --numberOfEigenSamples auto --noQC false --dryRun false --disableSpark false --sparkMaster local[*] --help false --version false --verbosity INFO --QUIET false
    [July 15, 2016 2:21:02 PM CEST] Executing as [email protected] on Linux 2.6.18-371.9.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27; Version: Version:version-unknown-SNAPSHOT
    14:21:02.369 INFO CreatePanelOfNormals - Defaults.BUFFER_SIZE : 131072
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.COMPRESSION_LEVEL : 5
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.CREATE_INDEX : false
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.CREATE_MD5 : false
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.CUSTOM_READER_FACTORY :
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    14:21:02.371 INFO CreatePanelOfNormals - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.REFERENCE_FASTA : null
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.TRY_USE_INTEL_DEFLATER : true
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO : false
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    14:21:02.372 INFO CreatePanelOfNormals - Defaults.USE_CRAM_REF_DOWNLOAD : false
    14:21:02.376 INFO CreatePanelOfNormals - Deflater JdkDeflater
    14:21:02.376 INFO CreatePanelOfNormals - Initializing engine
    14:21:02.377 INFO CreatePanelOfNormals - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/07/15 14:21:04 INFO SparkContext: Running Spark version 1.5.0
    16/07/15 14:21:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/07/15 14:21:06 INFO SecurityManager: Changing view acls to: kong
    16/07/15 14:21:06 INFO SecurityManager: Changing modify acls to: kong
    16/07/15 14:21:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(kong); users with modify permissions: Set(kong)
    16/07/15 14:21:08 INFO Slf4jLogger: Slf4jLogger started
    16/07/15 14:21:09 INFO Remoting: Starting remoting
    16/07/15 14:21:09 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:48887]
    16/07/15 14:21:09 INFO Utils: Successfully started service 'sparkDriver' on port 48887.
    16/07/15 14:21:09 INFO SparkEnv: Registering MapOutputTracker
    16/07/15 14:21:09 INFO SparkEnv: Registering BlockManagerMaster
    16/07/15 14:21:09 INFO DiskBlockManager: Created local directory at /tmp/kong/blockmgr-867cba1e-d3f3-46b8-b4a0-75b18ef3bd6c
    16/07/15 14:21:09 INFO MemoryStore: MemoryStore started with capacity 7.7 GB
    16/07/15 14:21:09 INFO HttpFileServer: HTTP File server directory is /tmp/kong/spark-31ce8e17-404c-4369-9f8a-c6c753bb3841/httpd-6615556c-bd10-4998-880b-3ea1f29e37de
    16/07/15 14:21:09 INFO HttpServer: Starting HTTP Server
    16/07/15 14:21:10 INFO Utils: Successfully started service 'HTTP file server' on port 39623.
    16/07/15 14:21:10 INFO SparkEnv: Registering OutputCommitCoordinator
    16/07/15 14:21:10 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    16/07/15 14:21:10 INFO SparkUI: Started SparkUI at http://193.174.53.248:4040
    16/07/15 14:21:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
    16/07/15 14:21:11 INFO Executor: Starting executor ID driver on host localhost
    16/07/15 14:21:11 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 48075.
    16/07/15 14:21:11 INFO NettyBlockTransferService: Server created on 48075
    16/07/15 14:21:11 INFO BlockManagerMaster: Trying to register BlockManager
    16/07/15 14:21:11 INFO BlockManagerMasterEndpoint: Registering block manager localhost:48075 with 7.7 GB RAM, BlockManagerId(driver, localhost, 48075)
    16/07/15 14:21:11 INFO BlockManagerMaster: Registered BlockManager
    14:21:13.209 INFO CreatePanelOfNormals - QC: Beginning creation of QC PoN...
    14:21:15.129 INFO HDF5PoNCreator - Discarded 71654 target(s) out of 286754 with factors below 1.4e-06 (25.00 percentile)
    java.lang.UnsatisfiedLinkError: no jhdf5 in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1864)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at ncsa.hdf.hdf5lib.H5.loadH5Lib(H5.java:347)
    at ncsa.hdf.hdf5lib.H5.(H5.java:274)
    at ncsa.hdf.hdf5lib.HDF5Constants.(HDF5Constants.java:28)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5File$OpenMode.(HDF5File.java:505)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.writeTargetFactorNormalizeReadCountsAndTargetFactors(HDF5PoNCreator.java:185)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoNGivenReadCountCollection(HDF5PoNCreator.java:118)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoN(HDF5PoNCreator.java:88)
    at org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals.runPipeline(CreatePanelOfNormals.java:244)
    at org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram.doWork(SparkToggleCommandLineProgram.java:39)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:69)
    at org.broadinstitute.hellbender.Main.main(Main.java:84)
    16/07/15 14:21:15 INFO SparkUI: Stopped Spark web UI at http://193.174.53.248:4040
    16/07/15 14:21:15 INFO DAGScheduler: Stopping DAGScheduler
    16/07/15 14:21:15 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    16/07/15 14:21:15 INFO MemoryStore: MemoryStore cleared
    16/07/15 14:21:15 INFO BlockManager: BlockManager stopped
    16/07/15 14:21:15 INFO BlockManagerMaster: BlockManagerMaster stopped
    16/07/15 14:21:15 INFO SparkContext: Successfully stopped SparkContext
    14:21:15.672 INFO CreatePanelOfNormals - Shutting down engine
    [July 15, 2016 2:21:15 PM CEST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals done. Elapsed time: 0.22 minutes.
    Runtime.totalMemory()=1236271104
    Exception in thread "main" 16/07/15 14:21:15 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I
    at ncsa.hdf.hdf5lib.H5.H5dont_atexit(Native Method)
    at ncsa.hdf.hdf5lib.H5.loadH5Lib(H5.java:365)
    at ncsa.hdf.hdf5lib.H5.(H5.java:274)
    at ncsa.hdf.hdf5lib.HDF5Constants.(HDF5Constants.java:28)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5File$OpenMode.(HDF5File.java:505)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.writeTargetFactorNormalizeReadCountsAndTargetFactors(HDF5PoNCreator.java:185)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoNGivenReadCountCollection(HDF5PoNCreator.java:118)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoN(HDF5PoNCreator.java:88)
    at org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals.runPipeline(CreatePanelOfNormals.java:244)
    at org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram.doWork(SparkToggleCommandLineProgram.java:39)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:69)
    at org.broadinstitute.hellbender.Main.main(Main.java:84)
    16/07/15 14:21:15 INFO ShutdownHookManager: Shutdown hook called
    16/07/15 14:21:15 INFO ShutdownHookManager: Deleting directory /tmp/kong/spark-31ce8e17-404c-4369-9f8a-c6c753bb3841

  • fanghu0104 · China · Member

    @Haiying7 said:
    ....snip.... (full CreatePanelOfNormals command and UnsatisfiedLinkError log quoted from the post above)

    I think this may be helpful:
    https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.17/bin/linux-centos7-x86_64-gcc485/
    hdf5-1.8.17-linux-centos7-x86_64-gcc485-shared.tar.gz

  • Haiying7 · Heidelberg, Germany · Member

    I do have hdf5; that part is clear.
    I am talking about item 4 in the requirements list: HDFView.

    @fanghu0104 said:

    I think this maybe helpful
    https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.17/bin/linux-centos7-x86_64-gcc485/
    hdf5-1.8.17-linux-centos7-x86_64-gcc485-shared.tar.gz

  • fanghu0104 · China · Member

    @Haiying7 said:
    I do have hdf5, this one is very clear.
    I am talking about item 4 in the requirement list. hdfView.

    @fanghu0104 said:

    I think this maybe helpful
    https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.17/bin/linux-centos7-x86_64-gcc485/
    hdf5-1.8.17-linux-centos7-x86_64-gcc485-shared.tar.gz

    Sorry, I sent the wrong one.
    The one below should work:
    https://www.hdfgroup.org/ftp/HDF5/hdf-java/current/bin/
    HDFView-2.11-centos6-x64.tar

  • aaronc · Cambridge, MA · Member

    @fanghu0104 said:

    Hi @LeeTL1220
    when i ran Step 3. Segment coverage profile, there always with the error bellow, do you know what happened?

    Command Line: Rscript -e tempLibDir = '/tmp/fanghu/Rlib.8000983193739407403';source('/tmp/fanghu/CBS.5651025200789749973.R'); --args --sample_name=J1-A --targets_file=/Step2/normalized_coverage.tsv --output_file=/Step3/segment.tsv --log2_input=TRUE --min_width=2 --alpha=0.01 --nperm=10000 --pmethod=hybrid --kmax=25 --nmin=200 --eta=0.05 --trim=0.025 --undosplits=none --undoprune=0.05 --undoSD=3
    Stdout:
    Stderr: Error in getopt(spec = spec, opt = args) : long flag "args" is invalid
    Calls: source ... withVisible -> eval -> eval -> parse_args -> getopt
    Execution halted

    I believe your error is due to an incorrect version of the optparse and/or getopt packages. Please use the "install_R_packages.R" Rscript to install these. The code quoted below is the portion relevant to your error:

    getoptUrl="http://cran.r-project.org/src/contrib/getopt_1.20.0.tar.gz"
    if (!("getopt" %in% rownames(installed.packages()))) {
        install.packages(getoptUrl, repos=NULL, type="source")
    }
    optparseUrl="http://cran.r-project.org/src/contrib/optparse_1.3.2.tar.gz"
    if (!("optparse" %in% rownames(installed.packages()))) {
        install.packages(optparseUrl, repos=NULL, type="source")
    }

  • fanghu0104 · China · Member

    @aaronc said:
    ....snip.... (getopt/optparse advice and install_R_packages.R snippet quoted from the post above)

    Thank you, I changed the R version, R-3.1.3 or higher is OK.

  • Haiying7 · Heidelberg, Germany · Member

    I thought we were supposed to use version 2.9 for Linux:
    The location of the HDF5-Java JNI Libraries Release 2.9 (2.11 for Macs).

    But I will try with 2.11.

    @fanghu0104 said:
    ....snip.... (HDFView-2.11 download link quoted from the post above)

  • Haiying7 · Heidelberg, Germany · Member
    edited July 2016

    Lee and aaronc, please help me.

    I am feeling very frustrated. I cannot understand why I am getting this error message:

    [[email protected] temp]$ java -Xmx16g -Djava.library.path=${HDFView} -jar $GATK CreatePanelOfNormals \

     -I ${GATK_CNV_dir}Merged.tsv -O PoN.tsv
    

    [July 20, 2016 12:31:52 PM CEST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals --input /home/kong/Haiying/Projects/Nevi/ILSE1059_C9/Lock/CNV/GATK_CNV/Merged.tsv --output PoN.tsv --minimumTargetFactorPercentileThreshold 25.0 --maximumColumnZerosPercentage 2.0 --maximumTargetZerosPercentage 5.0 --extremeColumnMedianCountPercentileThreshold 2.5 --truncatePercentileThreshold 0.1 --numberOfEigenSamples auto --noQC false --dryRun false --disableSpark false --sparkMaster local[*] --help false --version false --verbosity INFO --QUIET false
    [July 20, 2016 12:31:52 PM CEST] Executing as [email protected] on Linux 2.6.18-371.9.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27; Version: Version:version-unknown-SNAPSHOT
    12:31:52.360 INFO CreatePanelOfNormals - Defaults.BUFFER_SIZE : 131072
    12:31:52.379 INFO CreatePanelOfNormals - Defaults.COMPRESSION_LEVEL : 5
    12:31:52.379 INFO CreatePanelOfNormals - Defaults.CREATE_INDEX : false
    12:31:52.379 INFO CreatePanelOfNormals - Defaults.CREATE_MD5 : false
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.CUSTOM_READER_FACTORY :
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.REFERENCE_FASTA : null
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.TRY_USE_INTEL_DEFLATER : true
    12:31:52.384 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO : false
    12:31:52.386 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    12:31:52.386 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    12:31:52.386 INFO CreatePanelOfNormals - Defaults.USE_CRAM_REF_DOWNLOAD : false
    12:31:52.442 INFO CreatePanelOfNormals - Deflater JdkDeflater
    12:31:52.442 INFO CreatePanelOfNormals - Initializing engine
    12:31:52.442 INFO CreatePanelOfNormals - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/07/20 12:31:56 INFO SparkContext: Running Spark version 1.5.0
    16/07/20 12:31:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/07/20 12:32:01 INFO SecurityManager: Changing view acls to: kong
    16/07/20 12:32:01 INFO SecurityManager: Changing modify acls to: kong
    16/07/20 12:32:01 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(kong); users with modify permissions: Set(kong)
    16/07/20 12:32:08 INFO Slf4jLogger: Slf4jLogger started
    16/07/20 12:32:08 INFO Remoting: Starting remoting
    16/07/20 12:32:09 ERROR NettyTransport: failed to bind to /193.174.53.104:0, shutting down Netty transport
    16/07/20 12:32:10 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
    16/07/20 12:32:10 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    16/07/20 12:32:10 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    ....snip (the Slf4jLogger/Remoting startup, NettyTransport bind failure, and remote-daemon shutdown messages above repeat for each of Spark's 16 'sparkDriver' port retries)....
    16/07/20 12:32:12 ERROR SparkContext: Error initializing SparkContext.
    java.net.BindException: Failed to bind to: /193.174.53.104:0: Service 'sparkDriver' failed after 16 retries!
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
    at scala.util.Try$.apply(Try.scala:161)
    at scala.util.Success.map(Try.scala:206)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    16/07/20 12:32:12 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
    16/07/20 12:32:12 INFO SparkContext: Successfully stopped SparkContext
    12:32:12.327 INFO CreatePanelOfNormals - Shutting down engine
    [July 20, 2016 12:32:12 PM CEST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals done. Elapsed time: 0.35 minutes.
    Runtime.totalMemory()=336068608
    java.net.BindException: Failed to bind to: /193.174.53.104:0: Service 'sparkDriver' failed after 16 retries!
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
    at scala.util.Try$.apply(Try.scala:161)
    at scala.util.Success.map(Try.scala:206)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

  • aaroncaaronc Cambridge, MAMember

    @Haiying7 You can try running with

    --disableSpark true

    To see if that fixes it.
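
    For example, a minimal sketch of the same CreatePanelOfNormals invocation with Spark disabled (the jar path, HDF5 JNI library path, and file names below are placeholders):

    # Same PoN creation command, but with the local Spark context turned off
    java -Xmx16g -Djava.library.path=<path_to_HDF5_JNI_libs> -jar <path_to_gatk_protected_jar> CreatePanelOfNormals \
        -I <merged_pcov_tsv> \
        -O <pon_output_file> \
        --disableSpark true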

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @aaronc said:
    @Haiying7 You can try running with

    --disableSpark true

    To see if that fixes it.

    Thank you so much. It works now, and I am so happy that I could get the CNV list before the end of the week.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    Dear aaronc,

    The step 1 in:
    Case sample workflow
    This workflow requires a PoN file generated by the Create PoN workflow.
    If you do not have a PoN, please skip to the Create PoN workflow, below ....
    Overview of steps
    Step 0. (recommended) Pad Targets (see example above)
    Step 1. Collect proportional coverage
    Step 2. Create coverage profile
    Step 3. Segment coverage profile
    Step 4. Plot coverage profile
    Step 5. Call segments

    and the step1 in:
    Create PoN workflow
    This workflow can take some time to run depending on how many samples are going into your PoN and the number of targets you are covering. Basic time estimates are found in the Overview of Steps.
    Additional requirements
    Normal sample bam files to be used in the PoN. The index files (.bai) must be local to all of the associated bam files.
    Overview of steps
    Step 1. Collect proportional coverage. (~20 minutes for mean 150x coverage and 150k targets, per sample)
    Step 2. Combine proportional coverage files (< 5 minutes for 150k targets and 300 samples)
    Step 3. Create the PoN file (~1.75 hours for 150k targets and 300 samples)

    Are the two Step 1s the same? So if I ran it to create the PoN, I do not need to run it again in the main workflow, right?

    At which step can I tell the software how the samples are matched, i.e. which normal and tumor sample IDs belong together?

    Thank you so much.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @fanghu0104 said:
    If I have paired samples (normal and tumor), would I need to create a PoN file? And how do I handle this?

    Hi,
    At which point and how do we tell the software which samples are paired?

  • Haiying7Haiying7 Heidelberg, GermanyMember

    At the step 2 of the workflow:
    [[email protected] temp]$ java -jar $GATK NormalizeSomaticReadCounts --help
    USAGE: NormalizeSomaticReadCounts [arguments]

    Normalizes PCOV read counts using a panel of normals
    Version:version-unknown-SNAPSHOT

    Required Arguments:

    --input,-I:File read counts input file. This can only contain one sample. Required.

    --panelOfNormals,-PON:File panel of normals HDF5 file Required.

    --tangentNormalized,-TN:File Tangent normalized counts output Required.

    What is -TN here?
    Could anyone please update the document in the original post?

  • aaroncaaronc Cambridge, MAMember

    @Haiying7 The -TN parameter is simply the path to the file you would like to be created by this step of the workflow.
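    Based on the required arguments listed in the --help output you quoted, a sketch of that invocation would look like this (file names are placeholders):

    java -Xmx8g -jar <path_to_gatk_protected_jar> NormalizeSomaticReadCounts \
        -I <case_sample_pcov_tsv> \
        -PON <pon_file_from_create_pon_workflow> \
        -TN <tangent_normalized_output_tsv>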

    Step 1 is the same for both of these workflows.

    I believe Step 2 of the Create PoN workflow is where you specify the pairs.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @aaronc said:
    @Haiying7 The -TN parameter is simply the path to the file you would like to be created by this step of the workflow.

    Step 1 is the same for both of these workflows.

    I believe Step 2 of the Create PoN workflow is where you specify the pairs.

    Dear aaronc,
    Thanks for your reply.
    Do you mean I need to merge 2 tsv files from step 1 for each pair?
    So for each patient, the input list should be
    /path/to/pcov_normal.txt
    /path/to/pcov_tumor.txt
    and this will produce 1 merged.tsv; step 3 of the Create PoN workflow will then produce 1 PoN file. Am I correct?

    Thanks again.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @fanghu0104 said:

    @aaronc said:
    @fanghu0104 > @fanghu0104 said:

    Hi @LeeTL1220
    when i ran Step 3. Segment coverage profile, there always with the error bellow, do you know what happened?

    Command Line: Rscript -e tempLibDir = '/tmp/fanghu/Rlib.8000983193739407403';source('/tmp/fanghu/CBS.5651025200789749973.R'); --args --sample_name=J1-A --targets_file=/Step2/normalized_coverage.tsv --output_file=/Step3/segment.tsv --log2_input=TRUE --min_width=2 --alpha=0.01 --nperm=10000 --pmethod=hybrid --kmax=25 --nmin=200 --eta=0.05 --trim=0.025 --undosplits=none --undoprune=0.05 --undoSD=3
    Stdout:
    Stderr: Error in getopt(spec = spec, opt = args) : long flag "args" is invalid
    Calls: source ... withVisible -> eval -> eval -> parse_args -> getopt
    Execution halted

    I believe your error is due to an incorrect version of the optparse and/or getopt packages. Please use the "install_R_packages.R" Rscript to install these. The quoted code below is the relevant portions to your error:

    getoptUrl="http://cran.r-project.org/src/contrib/getopt_1.20.0.tar.gz"
    if (!("getopt" %in% rownames(installed.packages()))) {
      install.packages(getoptUrl, repos=NULL, type="source")
    }
    optparseUrl="http://cran.r-project.org/src/contrib/optparse_1.3.2.tar.gz"
    if (!("optparse" %in% rownames(installed.packages()))) {
      install.packages(optparseUrl, repos=NULL, type="source")
    }

    Thank you, I changed the R version, R-3.1.3 or higher is OK.

    Hi fanghu0104,

    Did you run the Create PoN workflow Steps 2 and 3 for each sample pair?
    Could you please explain to me how you handled the pairing of samples? It seems to me we have to run the workflow once for each sample pair.

  • aaroncaaronc Cambridge, MAMember

    @Haiying7 Apologies. You do not do anything with pairs for the Create PoN workflow. This is on normals only.

  • fanghu0104fanghu0104 chinaMember

    @Haiying7 said:

    Hi fanghu0104,

    Did you run the Create PoN workflow Steps 2 and 3 for each sample pair?
    Could you please explain to me how you handled the pairing of samples? It seems to me we have to run the workflow once for each sample pair.

    I think this workflow is not for paired samples. The PoN (panel of normals) is created once from all of the normal samples.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    I had one sample removed at Step 3 of the Create PoN workflow.
    Could anyone please tell me what this means? Why was the sample removed?

    [[email protected] CreatePoN]$ more PoN.removed_samples.txt
    B82772

  • sleeslee Member, Broadie, Dev

    Hi @Haiying7,

    The CreatePanelOfNormals tool performs a simple quality-control check of the input normal samples and removes those that fail the check (specifically, those that appear to contain large, arm-level events). By default, only those samples that pass the check are included in the final panel of normals that is output.

    If you'd like to turn off this quality-control check and include these samples in the panel, you can use the -noQC option. However, this may affect the quality of the tangent normalization of case samples downstream.
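
    For example, a sketch of the PoN creation command with the QC check disabled (the jar path, HDF5 JNI library path, and file names are placeholders):

    # Build the PoN without the arm-level-event QC filter
    java -Xmx16g -Djava.library.path=<path_to_HDF5_JNI_libs> -jar <path_to_gatk_protected_jar> CreatePanelOfNormals \
        -I <merged_pcov_tsv> \
        -O <pon_output_file> \
        --noQC true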

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @slee said:
    Hi @Haiying7,

    The CreatePanelOfNormals tool performs a simple quality-control check of the input normal samples and removes those that fail the check (specifically, those that appear to contain large, arm-level events). By default, only those samples that pass the check are included in the final panel of normals that is output.

    If you'd like to turn off this quality-control check and include these samples in the panel, you can use the -noQC option. However, this may affect the quality of the tangent normalization of case samples downstream.

    Dear slee,

    Thank you very much for your reply.
    Is the sample quality control specific to CNV?
    When I create the PoN, only normal samples that pass this quality control will be used by default.
    Using this PoN, how bad would it be if I also ran the main workflow on the failed samples, to identify somatic CNVs by comparing the called-segments output from the failed normal sample and its matched tumor sample? Would you recommend that I simply exclude the failed normal sample and its matched tumor sample from the CNV study?

    Thank you very much.

  • aaroncaaronc Cambridge, MAMember
    edited July 2016

    @Haiying7
    If you are potentially interested in differentiating somatic from germline CNVs then you can run the (PoN) excluded normal sample as a case sample and compare to the matched tumor sample. Keep in mind you should not run any normal samples against a PoN that includes those same normal samples.

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @aaronc said:
    @Haiying7
    If you are potentially interested in differentiating somatic from germline CNVs then you can run the (PoN) excluded normal sample as a case sample and compare to the matched tumor sample. Keep in mind you should not run any normal samples against a PoN that includes those same normal samples.

    Yes, I am interested in finding somatic CNVs.

    I did it in a very similar way, except that I included all normal samples in the PoN and ran every sample, tumor and normal, against the PoN. Then I compared the called segments from matched normal and tumor. To my surprise, all normal samples have about 10-15% more calls than their matched tumor samples.

    If I am understanding your suggestion correctly, I think you meant
    if I have matched samples N1 -- T1, N2 -- T2, N3 -- T3, N4 -- T4, N5 -- T5.
    For N1-T1 pair: get PoN from N2, N3, N4, N5, run main workflow for N1 and T1, and compare the segment calls from N1 and T1.
    For N2-T2 pair: get PoN from N1, N3, N4, N5, run main workflow for N2 and T2, and compare the segment calls from N2 and T2.

    Is this approach correct?

  • Haiying7Haiying7 Heidelberg, GermanyMember

    I thought I would get fewer segment calls if I included the normal in the PoN, but I got more segment calls.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin
    edited August 2016

    @wagnerk
    Hi Klaus,

    Sorry for the delay. I am not able to reproduce your issue. Can you please send me your original file that you ran ConvertBedToTargetFile on? I am also confused about why you are running PadTargets on your file if you already ran ConvertBedToTargetFile and got a padded file.

    Thanks,
    Sheila

  • aaroncaaronc Cambridge, MAMember

    @Haiying7 Normals run against a PoN that includes those same samples will have meaningless results. You are correct in your description:

    If I am understanding your suggestion correctly, I think you meant
    if I have matched samples N1 -- T1, N2 -- T2, N3 -- T3, N4 -- T4, N5 -- T5.
    For N1-T1 pair: get PoN from N2, N3, N4, N5, run main workflow for N1 and T1, and compare the segment calls from N1 and T1.
    For N2-T2 pair: get PoN from N1, N3, N4, N5, run main workflow for N2 and T2, and compare the segment calls from N2 and T2.
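
    As a rough shell sketch of that leave-one-out scheme (an outline only; the per-sample proportional-coverage file names are hypothetical, and the bracketed comments stand in for the Create PoN Steps 2-3 and the case workflow steps described above):

    # For each pair i, build a PoN from all normals except N_i, then run the
    # case workflow for N_i and T_i against that PoN and compare their segments.
    for i in 1 2 3 4 5; do
        # Proportional-coverage files of every normal except N_i (input to Create PoN Step 2)
        ls pcov_N*.tsv | grep -v "pcov_N${i}.tsv" > normals_without_N${i}.list
        # <Create PoN Steps 2-3 on normals_without_N${i}.list -> pon_without_N${i}>
        # <Case workflow Steps 1-5 for N${i} and T${i} against pon_without_N${i}>
        # <Compare the called segments of N${i} and T${i}>
    done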

  • LizzLizz ChinaMember

    When I ran Step 0, I always got the error below. Do you know what happened? Thank you!
    $java -Xmx8G -jar /home/lizhenzhong/software/gatk4-latest/gatk-protected.jar PadTargets --targets /data/database/hg38_anno/hg38.refGene.merg.sort.filter.bed --output hg38.refGene.merg.sort.filter.padded.bed --padding 250
    [August 17, 2016 11:42:19 AM CST] org.broadinstitute.hellbender.tools.exome.PadTargets --targets /data/database/hg38_anno/hg38.refGene.merg.sort.filter.bed --output hg38.refGene.merg.sort.filter.padded.bed --padding 250 --help false --version false --verbosity INFO --QUIET false
    [August 17, 2016 11:42:19 AM CST] Executing as [email protected] on Linux 2.6.32-358.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_65-b17; Version: Version:version-unknown-SNAPSHOT
    11:42:19.604 INFO PadTargets - Defaults.BUFFER_SIZE : 131072
    11:42:19.606 INFO PadTargets - Defaults.COMPRESSION_LEVEL : 5
    11:42:19.606 INFO PadTargets - Defaults.CREATE_INDEX : false
    11:42:19.606 INFO PadTargets - Defaults.CREATE_MD5 : false
    11:42:19.606 INFO PadTargets - Defaults.CUSTOM_READER_FACTORY :
    11:42:19.606 INFO PadTargets - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    11:42:19.606 INFO PadTargets - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    11:42:19.606 INFO PadTargets - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    11:42:19.606 INFO PadTargets - Defaults.REFERENCE_FASTA : null
    11:42:19.606 INFO PadTargets - Defaults.TRY_USE_INTEL_DEFLATER : true
    11:42:19.606 INFO PadTargets - Defaults.USE_ASYNC_IO : false
    11:42:19.606 INFO PadTargets - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    11:42:19.606 INFO PadTargets - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    11:42:19.607 INFO PadTargets - Defaults.USE_CRAM_REF_DOWNLOAD : false
    11:42:19.615 INFO PadTargets - Deflater JdkDeflater
    11:42:19.615 INFO PadTargets - Initializing engine
    11:42:19.615 INFO PadTargets - Done initializing engine
    11:42:20.086 INFO FeatureManager - Using codec BEDCodec to read file /data/database/hg38_anno/hg38.refGene.merg.sort.filter.bed
    11:42:20.086 INFO TargetUtils - Reading target intervals from exome file '/data/database/hg38_anno/hg38.refGene.merg.sort.filter.bed' ...
    11:42:20.884 INFO PadTargets - Shutting down engine
    [August 17, 2016 11:42:20 AM CST] org.broadinstitute.hellbender.tools.exome.PadTargets done. Elapsed time: 0.02 minutes.
    Runtime.totalMemory()=444071936
    java.lang.IllegalArgumentException: input intervals contain at least two overlapping intervals: [email protected] and [email protected]
    at org.broadinstitute.hellbender.tools.exome.HashedListTargetCollection.checkForOverlaps(HashedListTargetCollection.java:79)
    at org.broadinstitute.hellbender.tools.exome.HashedListTargetCollection.(HashedListTargetCollection.java:63)
    at org.broadinstitute.hellbender.tools.exome.TargetCollectionUtils$2.(TargetCollectionUtils.java:66)
    at org.broadinstitute.hellbender.tools.exome.TargetCollectionUtils.fromBEDFeatureList(TargetCollectionUtils.java:66)
    at org.broadinstitute.hellbender.tools.exome.TargetCollectionUtils.fromBEDFeatureFile(TargetCollectionUtils.java:95)
    at org.broadinstitute.hellbender.tools.exome.TargetUtils.readTargetFile(TargetUtils.java:42)
    at org.broadinstitute.hellbender.tools.exome.PadTargets.doWork(PadTargets.java:54)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:69)
    at org.broadinstitute.hellbender.Main.main(Main.java:84)

  • aaroncaaronc Cambridge, MAMember

    @Lizz It looks like your input bed file has overlapping intervals. This is not allowed as it will double-count reads. Please inspect your bed file and either remove overlapping targets or fix the start/end positions so that they do not overlap.
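
    If it helps, here is a minimal sketch for spotting the offending rows (it assumes a tab-separated, coordinate-sorted BED with contig/start/end in the first three columns; the file name is a placeholder):

    # Print any interval that starts before the previous interval on the same contig ends
    awk 'BEGIN{OFS="\t"}
         $1 == prev_chrom && $2 < prev_end { print "overlap:", prev_line; print "         ", $0 }
         { prev_chrom = $1; prev_end = $3; prev_line = $0 }' targets.bed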

  • LizzLizz ChinaMember

    @aaronc, thank you! I got the BED file from Gencode v24 (exon regions) for my WXS data, after filtering out the overlaps, but it still errors!
    1. The error message looks like:
    [August 22, 2016 11:00:07 AM CST] org.broadinstitute.hellbender.tools.exome.PadTargets done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=660078592
    java.lang.IllegalStateException: more than one interval in the input list results in the same name (SAMD11); perhaps repeated: '[email protected]' and '[email protected]'.
    at org.broadinstitute.hellbender.tools.exome.HashedListTargetCollection.composeIntervalsByName(HashedListTargetCollection.java:104)
    at org.broadinstitute.hellbender.tools.exome.HashedListTargetCollection.(HashedListTargetCollection.java:64)

    2. My BED file looks like this:
      chr1 924879 924879 SAMD11 . +
      chr1 925149 925149 SAMD11 . +
      chr1 925737 925737 SAMD11 . +
      chr1 925921 925921 SAMD11 . +

    This error seems to mean that one gene name can match only one region, so how should I prepare my exon BED file? Thank you!

  • LizzLizz ChinaMember

    Sorry about the wrongly formatted BED file in my previous post; it actually looks like this (and still errors):
    chr1 924879 924899 SAMD11 . +
    chr1 925129 925149 SAMD11 . +

  • aaroncaaronc Cambridge, MAMember

    @Lizz Hi Lizz! To fix your BED file, simply collapse any repeated region, taking the first start position and the last end position:

    chr1 924879 924879 SAMD11 . +
    chr1 925149 925149 SAMD11 . +
    chr1 925737 925737 SAMD11 . +
    chr1 925921 925921 SAMD11 . +
    

    becomes

    chr1 924879 925921 SAMD11 . +
    

    Also, you can drop the score and strand columns, so the BED file becomes

    chr1 924879 925921 SAMD11
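
    A minimal sketch of that collapse for a whole file (it assumes the BED is coordinate-sorted so that each gene's rows are contiguous; input/output names are placeholders):

    # Collapse consecutive rows sharing a gene name into one row spanning the
    # first start to the last end, and keep only the four required columns.
    awk 'BEGIN{OFS="\t"}
         $4 != name { if (name != "") print chrom, start, end, name
                      chrom = $1; start = $2; end = $3; name = $4; next }
         { if ($3 > end) end = $3 }
         END { if (name != "") print chrom, start, end, name }' exons.bed > targets.collapsed.bed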
    
  • shilinshilin NashvilleMember

    Hi,

    Would you please help with a problem in making the PoN? I have made the "merged tsv of proportional coverage" successfully for 10 normal samples. It looks good, with no outlier values (sample values sum to 1, min value 0, Q1 5.6e-05, Q3 1.02e-04, max value 2.60e-03). Then I am going to make the PoN by:
    java -Xmx32g -Djava.library.path=/HDFView/lib/linux/ -jar gatk-protected.jar CreatePanelOfNormals -I Normal.bed.All.coverage.tsv -O Normal.bed.All.coverage.tsv.PON

    Part of the message:
    16/09/26 16:09:56 INFO Executor: Running task 2.0 in stage 18.0 (TID 588)
    16:09:56.300 INFO CoveragePoNQCUtils - Suspicious contig: Normal1 chr1 (1278.033311668069 -- 0)
    16:09:56.300 INFO CoveragePoNQCUtils - Suspicious contig: Normal7 chr1 (-55.040334768011704 -- 0)
    16:09:56.300 INFO CoveragePoNQCUtils - Suspicious contig: Normal5 chr1 (-34.0587811785343 -- 0)
    16:09:56.300 INFO CoveragePoNQCUtils - Suspicious contig: Normal2 chr1 (-57.683738382363025 -- 0)
    16:09:56.303 INFO CoveragePoNQCUtils - Suspicious contig: Normal8 chr1 (-54.96817475110036 -- 0)
    16:09:56.303 INFO CoveragePoNQCUtils - Suspicious contig: Normal10 chr1 (1844.685903112766 -- 0)
    16:09:56.304 INFO CoveragePoNQCUtils - Suspicious contig: Normal6 chr1 (-59.99297062144822 -- 0)
    16:09:56.304 INFO CoveragePoNQCUtils - Suspicious contig: Normal3 chr1 (239.89156882761088 -- 0)
    16/09/26 16:09:56 INFO Executor: Finished task 0.0 in stage 18.0 (TID 586). 913 bytes result sent to driver
    16/09/26 16:09:56 INFO Executor: Finished task 2.0 in stage 18.0 (TID 588). 912 bytes result sent to driver
    16/09/26 16:09:56 INFO TaskSetManager: Finished task 0.0 in stage 18.0 (TID 586) in 30 ms on localhost (1/4)
    16:09:56.306 INFO CoveragePoNQCUtils - Suspicious contig: Normal9 chr1 (-32.831939726923004 -- 0)
    16/09/26 16:09:56 INFO Executor: Finished task 3.0 in stage 18.0 (TID 589). 921 bytes result sent to driver
    16/09/26 16:09:56 INFO TaskSetManager: Finished task 2.0 in stage 18.0 (TID 588) in 29 ms on localhost (2/4)
    16:09:56.307 INFO CoveragePoNQCUtils - Suspicious contig: Normal4 chr1 (1520.5663308083222 -- 0)
    16/09/26 16:09:56 INFO Executor: Finished task 1.0 in stage 18.0 (TID 587). 921 bytes result sent to driver
    16/09/26 16:09:56 INFO TaskSetManager: Finished task 3.0 in stage 18.0 (TID 589) in 30 ms on localhost (3/4)
    16/09/26 16:09:56 INFO TaskSetManager: Finished task 1.0 in stage 18.0 (TID 587) in 32 ms on localhost (4/4)
    16/09/26 16:09:56 INFO DAGScheduler: ResultStage 18 (collect at CoveragePoNQCUtils.java:111) finished in 0.033 s
    16/09/26 16:09:56 INFO TaskSchedulerImpl: Removed TaskSet 18.0, whose tasks have all completed, from pool
    16/09/26 16:09:56 INFO DAGScheduler: Job 16 finished: collect at CoveragePoNQCUtils.java:111, took 0.036466 s
    16:09:56.311 INFO CreatePanelOfNormals - QC: Suspicious sample list created...
    16:09:56.311 INFO CreatePanelOfNormals - Creating final PoN with 10 suspicious samples removed...

    It seems that it removes all 10 of the normal samples as having a "Suspicious contig". I can only get it to run by adding --noQC. Does anyone know what's wrong?
    Thanks!

  • shilinshilin NashvilleMember

    Another question: in Collect proportional coverage, I saw the parameter "-keepdups". Does it mean we should keep duplicate reads in the bam? In the Best Practices, it seems we need to remove duplicates (https://software.broadinstitute.org/gatk/best-practices/CNV.php). Thanks!

  • aaroncaaronc Cambridge, MAMember

    @shilin I would run with --noQC and -keepdups. I believe the PoN QC has trouble with a PoN of this (small) size. By default -keepdups is enabled because it improves results.

  • noanoa Boston areaMember

    Hi,
    I am trying to generate the PoN and getting an error similar to those reported above, related to hdf5.
    I saw some fixes in the repository made in the last few weeks, so I pulled the most recent version of gatk-protected and built the jar now, but I still get the same error.
    I'd appreciate any help, thanks.

    This is the command line:

    java -Xmx16g -Djava.library.path=/home/HDFView-2.13.0-Linux/HDFView/2.13.0/lib -jar /home/gatk4/gatk-protected.jar CreatePanelOfNormals -I /home/Noa/gatk4/mergedPcovFiles.output -O /home/Noa/gatk4/PoN.output

    And this is the output with the error:

    [September 29, 2016 9:31:59 AM EDT] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals --input /home/Noa/gatk4/mergedPcovFiles.output --output /home/Noa/gatk4/PoN.output --minimumTargetFactorPercentileThreshold 25.0 --maximumColumnZerosPercentage 2.0 --maximumTargetZerosPercentage 5.0 --extremeColumnMedianCountPercentileThreshold 2.5 --truncatePercentileThreshold 0.1 --numberOfEigenSamples auto --noQC false --dryRun false --disableSpark false --sparkMaster local[*] --help false --version false --verbosity INFO --QUIET false
    [September 29, 2016 9:31:59 AM EDT] Executing as [email protected] on Linux 2.6.32-279.14.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_51-b16; Version: Version:version-unknown-SNAPSHOT
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    16/09/29 09:32:00 INFO SparkContext: Running Spark version 1.5.0
    16/09/29 09:32:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/09/29 09:32:02 INFO SecurityManager: Changing view acls to: henig
    16/09/29 09:32:02 INFO SecurityManager: Changing modify acls to: henig
    16/09/29 09:32:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(henig); users with modify permissions: Set(henig)
    16/09/29 09:32:08 INFO Slf4jLogger: Slf4jLogger started
    16/09/29 09:32:08 INFO Remoting: Starting remoting
    16/09/29 09:32:08 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:47226]
    16/09/29 09:32:08 INFO Utils: Successfully started service 'sparkDriver' on port 47226.
    16/09/29 09:32:08 INFO SparkEnv: Registering MapOutputTracker
    16/09/29 09:32:08 INFO SparkEnv: Registering BlockManagerMaster
    16/09/29 09:32:08 INFO DiskBlockManager: Created local directory at /home/scratch/henig/blockmgr-6d77c841-821f-4198-9162-43b8a67a726d
    16/09/29 09:32:08 INFO MemoryStore: MemoryStore started with capacity 7.7 GB
    16/09/29 09:32:09 INFO HttpFileServer: HTTP File server directory is /home/scratch/henig/spark-518eec1a-b7f3-43ab-bee4-175f55fa02a9/httpd-e8228d6f-c0e8-4b40-a2a7-cac65e3f131d
    16/09/29 09:32:09 INFO HttpServer: Starting HTTP Server
    16/09/29 09:32:09 INFO Utils: Successfully started service 'HTTP file server' on port 46331.
    16/09/29 09:32:09 INFO SparkEnv: Registering OutputCommitCoordinator
    16/09/29 09:32:09 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    16/09/29 09:32:09 INFO SparkUI: Started SparkUI at http://10.1.255.226:4040
    16/09/29 09:32:09 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
    16/09/29 09:32:09 INFO Executor: Starting executor ID driver on host localhost
    16/09/29 09:32:09 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56081.
    16/09/29 09:32:09 INFO NettyBlockTransferService: Server created on 56081
    16/09/29 09:32:09 INFO BlockManagerMaster: Trying to register BlockManager
    16/09/29 09:32:09 INFO BlockManagerMasterEndpoint: Registering block manager localhost:56081 with 7.7 GB RAM, BlockManagerId(driver, localhost, 56081)
    16/09/29 09:32:09 INFO BlockManagerMaster: Registered BlockManager
    16/09/29 09:32:13 INFO SparkUI: Stopped Spark web UI at http://10.1.255.226:4040
    16/09/29 09:32:13 INFO DAGScheduler: Stopping DAGScheduler
    16/09/29 09:32:14 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    16/09/29 09:32:14 INFO MemoryStore: MemoryStore cleared
    16/09/29 09:32:14 INFO BlockManager: BlockManager stopped
    16/09/29 09:32:21 INFO BlockManagerMaster: BlockManagerMaster stopped
    16/09/29 09:32:21 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    16/09/29 09:32:21 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    16/09/29 09:32:21 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    16/09/29 09:32:21 INFO SparkContext: Successfully stopped SparkContext
    [September 29, 2016 9:32:21 AM EDT] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals done. Elapsed time: 0.37 minutes.
    Runtime.totalMemory()=1192755200
    Exception in thread "main" java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I
    at ncsa.hdf.hdf5lib.H5.H5dont_atexit(Native Method)
    at ncsa.hdf.hdf5lib.H5.loadH5Lib(H5.java:365)
    at ncsa.hdf.hdf5lib.H5.(H5.java:274)
    at ncsa.hdf.hdf5lib.HDF5Constants.(HDF5Constants.java:28)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5File$OpenMode.(HDF5File.java:505)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.writeTargetFactorNormalizeReadCountsAndTargetFactors(HDF5PoNCreator.java:185)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoNGivenReadCountCollection(HDF5PoNCreator.java:118)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoN(HDF5PoNCreator.java:88)
    at org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals.runPipeline(CreatePanelOfNormals.java:244)
    at org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram.doWork(SparkToggleCommandLineProgram.java:39)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:69)
    at org.broadinstitute.hellbender.Main.main(Main.java:84)
    16/09/29 09:32:22 INFO ShutdownHookManager: Shutdown hook called
    16/09/29 09:32:22 INFO ShutdownHookManager: Deleting directory /home/scratch/henig/spark-518eec1a-b7f3-43ab-bee4-175f55fa02a9
    16/09/29 09:32:22 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

  • shilinshilin NashvilleMember

    @aaronc said:
    @shilin I would run with --noQC and -keepdups. I believe the PoN QC is an issue with creating a PoN of this size. By default -keepdups is enabled because it improves results.

    Thanks for your reply! Another question: I have successfully generated the results and there are some called segments. But in Step 4, the plot background (X and Y axes, titles, colored background, dashed lines for each chromosome, ...) was generated successfully, yet there are no dots/lines for the CNVs in the figure. Is it a bug in the program?
    Thanks!

  • noanoa Boston areaMember

    Update for my previous message:
    The java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit is solved but I have another exception coming up now:
    These are the last lines out of 3000, the same command line as above:

    java -Xmx16g -Djava.library.path=/home/HDFView-2.13.0-Linux/HDFView/2.13.0/lib -jar /home/gatk4/gatk-protected.jar CreatePanelOfNormals -I /home/Noa/gatk4/mergedPcovFiles.output -O /home/Noa/gatk4/PoN.output

    End of output:
    16/09/29 13:38:43 INFO DAGScheduler: Job 18 finished: collect at CoveragePoNQCUtils.java:111, took 0.049464 s
    16/09/29 13:38:43 INFO SparkUI: Stopped Spark web UI at http://10.1.255.220:4040
    16/09/29 13:38:43 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    16/09/29 13:38:44 INFO MemoryStore: MemoryStore cleared
    16/09/29 13:38:44 INFO BlockManager: BlockManager stopped
    16/09/29 13:38:44 INFO BlockManagerMaster: BlockManagerMaster stopped
    16/09/29 13:38:44 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    16/09/29 13:38:44 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    16/09/29 13:38:44 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    16/09/29 13:38:44 INFO SparkContext: Successfully stopped SparkContext
    [September 29, 2016 1:38:44 PM EDT] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals done. Elapsed time: 0.25 minutes.
    Runtime.totalMemory()=2780823552
    java.lang.IllegalArgumentException: the number of columns to keep must be greater than 0
    at org.broadinstitute.hellbender.tools.exome.ReadCountCollection.subsetColumns(ReadCountCollection.java:266)
    at org.broadinstitute.hellbender.tools.pon.coverage.pca.HDF5PCACoveragePoNCreationUtils.create(HDF5PCACoveragePoNCreationUtils.java:91)
    at org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals.runPipeline(CreatePanelOfNormals.java:264)
    at org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram.doWork(SparkToggleCommandLineProgram.java:39)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:108)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:166)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:185)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:76)
    at org.broadinstitute.hellbender.Main.main(Main.java:92)
    16/09/29 13:38:44 INFO ShutdownHookManager: Shutdown hook called
    16/09/29 13:38:44 INFO ShutdownHookManager: Deleting directory /home/scratch/henig/spark-998ca17d-cb96-4fe2-aa82-cab1ad0b666f
    16/09/29 13:38:44 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

    @aaronc or @Sheila I would be happy if you could help, thanks.
    Noa

  • sleeslee Member, Broadie, Dev

    @shilin Can you confirm which version/release you are running? The latest release (alpha1.2.3) may resolve the plotting issue you are seeing. Also, note that the plotting only supports hg19.

  • sleeslee Member, Broadie, Dev

    @noa How many samples are in your PoN? In the error output, do you see any warnings about samples being dropped (see the post by @shilin)? It's possible that creating a PoN with a smallish number of samples (~tens) can give poor results for the quality-control checking routine, causing all samples to be dropped. Can you try running with -noQC?

    Also note that with the latest release (alpha1.2.3), including "-Djava.library.path=/home/HDFView-2.13.0-Linux/HDFView/2.13.0/lib" should not be necessary. Also note that an improved method for QC checking will be released soon---the current method is relatively naive and tries to perform a heuristic check for large, arm-level events.

  • RashRash Member

    Hi, can I cite the CNV caller from GATK4 for a publication? If not, what do you recommend for calling somatic CNVs? Is XHMM fine?

    Many thanks, Rahel

  • sleeslee Member, Broadie, Dev

    Other than this discussion post, there is also a technical whitepaper at https://github.com/broadinstitute/gatk-protected/blob/master/docs/CNVs/CNV-methods.pdf. However, note that only some sections are relevant for GATK CNV and that this document will continue to be updated. So if you'd like to cite it in some regard, you may want to link to the specific GitHub commit for the release you are using (e.g., https://github.com/broadinstitute/gatk-protected/blob/1.0.0.0-alpha1.2.3/docs/CNVs/CNV-methods.pdf would be the appropriate link if you were using alpha1.2.3).

    There will be publications forthcoming for both CNV and ACNV, but these will not be ready for a few months and will be based on significantly updated versions of the tools.

  • amjadamjad HelsinkiMember

    PlotSegmentedCopyRatio plots have only the axes and chromosomes, without any points. Any idea why?

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin

    @amjad,

    You'll have to give us more information for us to begin answering your question.

  • amjadamjad HelsinkiMember

    @shlee It is the same problem seen by @shilin and answered by @slee
    I tried the latest version (alpha1.2.3) without success and my reference genome is hg19.

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin
    edited January 2017

    Hi @amjad,

    Thanks for the clarification. If this version of the tool supports only one reference genome, then that reference is GRCh37 and not hg19. One major difference is that GRCh37 contigs do NOT start with chr whereas hg19 contigs do start with chr. The original documentation by LeeTL1220 shows example data from GRCh37. Some folks call this reference hg37 and others use it interchangeably with hg19 given it is based on the same assembly. Only plotting is constrained in this way and all of your other results should be fine. I believe a newer version of the plotting tool will accommodate any reference.

    In the meanwhile, I see two quick workarounds. One is to remove the chr prefixes from the data you are trying to plot. If you do this, then be sure the data is sorted by the contig order that GRCh37 would be sorted by and only represents contigs present in GRCh37. That is, you'll have to remove data for extraneous contigs. Second, you could try to visualize your proportional coverage using other means, e.g. R (or RStudio) or IGV. If you try IGV, then I believe you will have to center your normal coverage to either 0 or 1, whichever the CNV data is not centered upon. BTW, IGV's visualization is by heatmap coloring.
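
    A minimal sketch of the first workaround, assuming the file you are plotting is a tab-separated table whose first column is the contig (file names are placeholders; you would still need to drop rows for contigs not present in GRCh37 and keep the remaining rows in GRCh37 contig order):

    # Strip the leading "chr" from the contig column before plotting;
    # header/comment lines (starting with # or @) are left untouched.
    sed '/^[#@]/!s/^chr//' tangent_normalized.tsv > tangent_normalized.nochr.tsv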

  • amjadamjad HelsinkiMember

    Thank you @shlee. It worked after removing the chr characters and sorting the chromosomes :)

  • YujianYujian Member

    Hi @LeeTL1220! I need your help. While running 'CreatePanelOfNormals', some errors occurred. Here is the command I ran: 'java -Xmx16g -Djava.library.path=/Workspace/Software/HDFView/HDFView-2.13.0-Linux/HDFView/2.13.0/lib/ -jar /Workspace/Software/gatk4-latest/gatk-protected.jar CreatePanelOfNormals -I merge.txt -O PoN.tsv'. The log file is as follows:

    [January 9, 2017 6:11:58 PM CST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals --input merge.txt --output PoN.tsv --minimumTargetFactorPercentileThreshold 25.0 --maximumColumnZerosPercentage 2.0 --maximumTargetZerosPercentage 5.0 --extremeColumnMedianCountPercentileThreshold 2.5 --truncatePercentileThreshold 0.1 --numberOfEigenSamples auto --noQC false --dryRun false --disableSpark false --sparkMaster local[*] --help false --version false --verbosity INFO --QUIET false
    [January 9, 2017 6:11:58 PM CST] Executing as [email protected] on Linux 3.10.0-514.2.2.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_111-b15; Version: Version:version-unknown-SNAPSHOT
    18:11:58.749 INFO CreatePanelOfNormals - Defaults.BUFFER_SIZE : 131072
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.COMPRESSION_LEVEL : 5
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.CREATE_INDEX : false
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.CREATE_MD5 : false
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.CUSTOM_READER_FACTORY :
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.EBI_REFERENCE_SEVICE_URL_MASK : http://www.ebi.ac.uk/ena/cram/md5/%s
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.INTEL_DEFLATER_SHARED_LIBRARY_PATH : null
    18:11:58.750 INFO CreatePanelOfNormals - Defaults.NON_ZERO_BUFFER_SIZE : 131072
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.REFERENCE_FASTA : null
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.TRY_USE_INTEL_DEFLATER : true
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO : false
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_SAMTOOLS : false
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.USE_ASYNC_IO_FOR_TRIBBLE : false
    18:11:58.751 INFO CreatePanelOfNormals - Defaults.USE_CRAM_REF_DOWNLOAD : false
    18:11:58.752 INFO CreatePanelOfNormals - Deflater JdkDeflater
    18:11:58.752 INFO CreatePanelOfNormals - Initializing engine
    18:11:58.753 INFO CreatePanelOfNormals - Done initializing engine
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    17/01/09 18:11:59 INFO SparkContext: Running Spark version 1.5.0
    17/01/09 18:11:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/01/09 18:11:59 INFO SecurityManager: Changing view acls to: yangjiatao
    17/01/09 18:11:59 INFO SecurityManager: Changing modify acls to: yangjiatao
    17/01/09 18:11:59 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yangjiatao); users with modify permissions: Set(yangjiatao)
    17/01/09 18:11:59 INFO Slf4jLogger: Slf4jLogger started
    17/01/09 18:11:59 INFO Remoting: Starting remoting
    17/01/09 18:11:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:46229]
    17/01/09 18:11:59 INFO Utils: Successfully started service 'sparkDriver' on port 46229.
    17/01/09 18:11:59 INFO SparkEnv: Registering MapOutputTracker
    17/01/09 18:11:59 INFO SparkEnv: Registering BlockManagerMaster
    17/01/09 18:12:00 INFO DiskBlockManager: Created local directory at /tmp/yangjiatao/blockmgr-60a33e19-35f0-4969-8ba4-eb844485d298
    17/01/09 18:12:00 INFO MemoryStore: MemoryStore started with capacity 7.7 GB
    17/01/09 18:12:00 INFO HttpFileServer: HTTP File server directory is /tmp/yangjiatao/spark-9dd82bfd-6256-43a9-a5c0-b8b831b0bc21/httpd-927e1211-e7c3-42e8-b528-786f806ed19b
    17/01/09 18:12:00 INFO HttpServer: Starting HTTP Server
    17/01/09 18:12:00 INFO Utils: Successfully started service 'HTTP file server' on port 35578.
    17/01/09 18:12:00 INFO SparkEnv: Registering OutputCommitCoordinator
    17/01/09 18:12:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    17/01/09 18:12:00 INFO SparkUI: Started SparkUI at http://192.168.0.131:4040
    17/01/09 18:12:00 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
    17/01/09 18:12:00 INFO Executor: Starting executor ID driver on host localhost
    17/01/09 18:12:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37751.
    17/01/09 18:12:00 INFO NettyBlockTransferService: Server created on 37751
    17/01/09 18:12:00 INFO BlockManagerMaster: Trying to register BlockManager
    17/01/09 18:12:00 INFO BlockManagerMasterEndpoint: Registering block manager localhost:37751 with 7.7 GB RAM, BlockManagerId(driver, localhost, 37751)
    17/01/09 18:12:00 INFO BlockManagerMaster: Registered BlockManager
    18:12:01.177 INFO CreatePanelOfNormals - QC: Beginning creation of QC PoN...
    18:12:01.235 INFO HDF5PoNCreator - All 283 targets are kept
    17/01/09 18:12:01 INFO SparkUI: Stopped Spark web UI at http://192.168.0.131:4040
    17/01/09 18:12:01 INFO DAGScheduler: Stopping DAGScheduler
    17/01/09 18:12:01 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    17/01/09 18:12:01 INFO MemoryStore: MemoryStore cleared
    17/01/09 18:12:01 INFO BlockManager: BlockManager stopped
    17/01/09 18:12:01 INFO BlockManagerMaster: BlockManagerMaster stopped
    17/01/09 18:12:01 INFO SparkContext: Successfully stopped SparkContext
    18:12:01.673 INFO CreatePanelOfNormals - Shutting down engine
    17/01/09 18:12:01 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    [January 9, 2017 6:12:01 PM CST] org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals done. Elapsed time: 0.05 minutes.
    Runtime.totalMemory()=2264399872
    Exception in thread "main" java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I
    at ncsa.hdf.hdf5lib.H5.H5dont_atexit(Native Method)
    at ncsa.hdf.hdf5lib.H5.loadH5Lib(H5.java:365)
    at ncsa.hdf.hdf5lib.H5.(H5.java:274)
    at ncsa.hdf.hdf5lib.HDF5Constants.(HDF5Constants.java:28)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5File$OpenMode.(HDF5File.java:505)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.writeTargetFactorNormalizeReadCountsAndTargetFactors(HDF5PoNCreator.java:185)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoNGivenReadCountCollection(HDF5PoNCreator.java:118)
    at org.broadinstitute.hellbender.utils.hdf5.HDF5PoNCreator.createPoN(HDF5PoNCreator.java:88)
    at org.broadinstitute.hellbender.tools.exome.CreatePanelOfNormals.runPipeline(CreatePanelOfNormals.java:244)
    at org.broadinstitute.hellbender.utils.SparkToggleCommandLineProgram.doWork(SparkToggleCommandLineProgram.java:39)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:102)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:155)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:174)
    at org.broadinstitute.hellbender.Main.instanceMain(Main.java:69)
    at org.broadinstitute.hellbender.Main.main(Main.java:84)
    17/01/09 18:12:01 INFO ShutdownHookManager: Shutdown hook called
    17/01/09 18:12:01 INFO ShutdownHookManager: Deleting directory /tmp/yangjiatao/spark-9dd82bfd-6256-43a9-a5c0-b8b831b0bc21

    @LeeTL1220 , I would be happy if you could help, thanks.

  • sleeslee Member, Broadie, Dev

    Hi @Yujian,

    Could I ask which version of gatk-protected (i.e., commit hash or release number) you are using? For jars built from either the latest commit (d9fa681) or the latest release (alpha1.2.3), you no longer have to specify "-Djava.library.path=/Workspace/Software/HDFView/HDFView-2.13.0-Linux/HDFView/2.13.0/lib/". Using these jars, I was unable to reproduce the exception you encountered when running CreatePanelOfNormals.
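
    For reference, a minimal sketch of the simplified invocation with one of those jars (paths are placeholders), dropping the -Djava.library.path argument as described above:

    java -Xmx16g -jar gatk-protected.jar CreatePanelOfNormals \
        -I merge.txt \
        -O PoN.tsv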

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin
    edited March 2017

    For a newer tutorial using GATK4's v1.0.0.0-alpha1.2.3 release (Version:0288cff-SNAPSHOT from September 2016), see this Somatic_CNV worksheet and this data bundle. If you have a question on the Somatic_CNV_handson tutorial, please post it as a new question using this form.

    As of March 14, 2017, I've made the tutorial worksheet a forum article. It is Article#9143.

  • RosieQuezadaRosieQuezada MexicoMember

    @amjad said:
    Thank you @shlee. It worked after removing the chr characters and sorting the chromosomes :)

    Hi @amjad, I just ran into the same issue, but I am having trouble sorting the chromosomes. Can I ask how you did it? It would really help me a lot.
    Thank you!

  • llaullau Member

    Hi, I was wondering if you had a workflow for germline CNV/aCNV calling on WES? Thanks so much!

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin

    Hi @llau,

    Germline CNV is under active development currently. Please stay tuned.

  • zhipanzhipan San FranciscoMember

    Hi there... I have a question about the number of eigensamples. By default, a Jolliffe's factor of 0.7 is used. Can someone tell me why this is necessary? Is it purely a computational issue, or is there an underlying reason behind this reduction step? Thanks in advance.

  • aaroncaaronc Cambridge, MAMember

    @zhipan Hi Zhipan,

    A factor of 0.7 is somewhat arbitrary, but we cut the eigensamples that contribute a low amount of variance because they are more likely to contain rare germline CNVs, which can silence real events in case samples.
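
    If you prefer to control this cutoff yourself rather than rely on the automatic (Jolliffe-based) choice, CreatePanelOfNormals exposes a --numberOfEigenSamples argument (visible with its default value "auto" in the log earlier in this thread). A hedged sketch with placeholder file names:

    java -Xmx16g -jar gatk-protected.jar CreatePanelOfNormals \
        -I combined_normals.tsv \
        --numberOfEigenSamples 20 \
        -O normals.pon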

  • xuejiaxuejia CrownbioMember

    This problem has been solved, thanks to @EADG!
    But now I have another problem while plotting the results. I can get all the output files from the computation steps, and they all look fine. However, the data points are missing from the resulting PNG files; I can only see the axes and labels.
    This is the command I used:
    java -jar /home/jxue/softwares/GATK4_CNV/gatk4.jar PlotSegmentedCopyRatio -TN E08055T.tn.tsv -PTN E08055T.ptn.tsv -S E08055T.seg -O . -pre E08055 -LOG -schr

    Thanks a lot in advance!

    Jia

  • sleeslee Member, Broadie, Dev
    edited February 2017

    Hi @xuejia,

    Please see some of the previous posts in this thread. Empty plots may be produced by previous releases if you are not using hg19---is this the case for your data? The latest release (alpha1.2.4) resolves this issue by taking a sequence-dictionary (.dict) file to determine the regions to be plotted, so you may want to try using that release instead.

  • JieChenJieChen shanghaiMember

    Hi @Haiying7,

    How did you solve the problem with "Exception in thread "main" java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I"?

  • Haiying7Haiying7 Heidelberg, GermanyMember

    @JieChen said:
    Hi @Haiying7,

    How did you solve the problem with "Exception in thread "main" java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I"?

    which one is this?
    Could you please send me the link?

  • jcorominasjcorominas RadboudumcMember

    Hi! Is it possible to run the CNV workflow for germline CNV calling on whole-genome data? If so, is there a minimum number of samples needed to run it?
    Many thanks!

  • sleeslee Member, Broadie, Dev

    There will be a separate workflow/tool (to be released shortly, timescale on the order of weeks) for calling germline CNVs from both WES and WGS. If you are interested in the details, a poster that the primary developer of the tool presented at AACR can be seen at http://genomicinfo.broadinstitute.org/acton/attachment/13431/f-0186/1/-/-/-/-/AACRPoster_MB.pdf?sid=TV2:isKk4hPeO; a recent talk can be viewed at https://www.broadinstitute.org/videos/scalable-bayesian-model-copy-number-variation-bayesian-pca.

  • maelygauthiermaelygauthier AdelaideMember

    Hi Slee, is there an approximate release date for this germline CNV tool? And is there a development version available to test in the meantime?

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin

    Hi @maelygauthier, this should be available shortly. Alternatively, you can build your own jar from the gatk-protected repo. The tool is experimental and called GermlineCNVCaller.

  • maelygauthiermaelygauthier AdelaideMember

    Thanks @shlee,
    I built my own jar as recommended and am testing GermlineCNVCaller now. I found the annotation table and the transition prior table on your resource bundle page (ftp://ftp.broadinstitute.org/bundle/beta/GermlineCNVCaller/). I was just unsure how to derive the required input (i.e. the combined read count collection URI). Do I need to use the CalculateTargetCoverage tool and feed it the interval BED file for all the targeted regions of my whole-exome panel, or derive read counts at the chromosome level here?

    ./gatk-launch GermlineCNVCaller --contigAnnotationsTable ./grch37_contig_annotations.tsv --copyNumberTransitionPriorTable ./grch37_germline_CN_priors.tsv --jobType LEARN_AND_CALL --outputPath ./trial --input ?

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin
    edited June 2017

    Hi @maelygauthier,

    We've recently merged the gatk-protected repo with the GATK4 repo and I have updated the documentation for GermlineCNVCaller in the newly merged repo with example commands: https://github.com/broadinstitute/gatk/blob/c58d750be88f2fddc3272a45bce447c477f68cbb/src/main/java/org/broadinstitute/hellbender/tools/coveragemodel/germline/GermlineCNVCaller.java

    Be sure to use the latest jar from the gatk repo (in beta status), especially for experimental (alpha status) tools like GermlineCNVCaller that are being actively developed.

    The input is described as:

    Combined read count collection URI. Combined raw or GC corrected (but not proportional) read counts table. Can be for a cohort or for a single sample. Required.

    The input can be the results of CorrectGCBias, SparkGenomeReadCounts or CalculateTargetCoverage. For CalculateTargetCoverage, you may have run across some example commands that use the --transform PCOV option for proportional coverage. Remember that the other option (the default) is to output RAW counts.
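
    For example, here is a hedged sketch of collecting raw (non-proportional) counts with CalculateTargetCoverage for use as GermlineCNVCaller input. File names are placeholders, and since RAW is the default you can likely omit --transform entirely:

    ./gatk-launch CalculateTargetCoverage \
        -I sample.bam \
        --targets targets.bed \
        -R human_g1k_v37.fasta \
        --transform RAW \
        -O sample.raw_cov.tsv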

    We'll be writing up workflows for the new tools on an ongoing basis going forward, first in WDL format and then for the forum.

  • maelygauthiermaelygauthier AdelaideMember

    Thanks @shlee for the update,
    I used the latest jar from the gatk4 repo as recommended and managed to derive the read count input file and the sex genotype table. I just wanted to confirm whether Nd4j also needs to be installed if I am not using Spark.

    script run

    ./gatk-launch GermlineCNVCaller --contigAnnotationsTable ../gatk4_Hellbender/grch37_contig_annotations.tsv --copyNumberTransitionPriorTable ../gatk4_Hellbender/grch37_germline_CN_priors.tsv --jobType LEARN_AND_CALL --outputPath ./TS1 --input ../gatk4_Hellbender/target_cov.tsv --targets ../gatk4_Hellbender/targets.txt --disableSpark true --sexGenotypeTable ../gatk4_Hellbender/TS1_genotype --rddCheckpointing false --biasCovariateSolverType LOCAL

    I am getting the following error, which seems to be linked to Nd4j:

    Using GATK jar ~/localwork/playground/programs/gatk-protected/build/libs/gatk-protected-package-b4390fb-SNAPSHOT-local.jar
    102-b14; Version: 4.alpha.2-1136-gc18e780-SNAPSHOT
    16:55:21.931 INFO GermlineCNVCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    16:55:21.932 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    16:55:21.932 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    16:55:21.932 INFO GermlineCNVCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    16:55:21.932 INFO GermlineCNVCaller - Deflater: IntelDeflater
    16:55:21.932 INFO GermlineCNVCaller - Inflater: IntelInflater
    16:55:21.932 INFO GermlineCNVCaller - Initializing engine
    16:55:21.932 INFO GermlineCNVCaller - Done initializing engine
    16:55:21.933 INFO GermlineCNVCaller - Spark disabled. sparkMaster option (local[*]) ignored.
    16:55:23.448 INFO GermlineCNVCaller - Parsing the read counts table...
    16:55:24.876 INFO GermlineCNVCaller - Parsing the sample sex genotypes table...
    16:55:24.896 INFO GermlineCNVCaller - Parsing the germline contig ploidy annotation table...
    16:55:24.906 INFO ContigGermlinePloidyAnnotationTableReader - Ploidy tags: SEX_XX, SEX_XY
    16:55:25.056 INFO GermlineCNVCaller - Parsing the copy number transition prior table and initializing the caches...
    16:55:28.634 INFO GermlineCNVCaller - Initializing the EM algorithm workspace...
    16:55:32.861 INFO GermlineCNVCaller - Shutting down engine
    [June 12, 2017 4:55:32 PM ACST] org.broadinstitute.hellbender.tools.coveragemodel.germline.GermlineCNVCaller done. Elapsed time: 0.18 minutes.
    Runtime.totalMemory()=1364721664
    org.broadinstitute.hellbender.exceptions.GATKException: Nd4j data type must be set to double for coverage modeller routines to function properly. This can be done by setting JVM system property "dtype" to "double". Can not continue.

    Thanks

    (GitHub issue #3098 filed by shlee; closed by samuelklee.)
  • MehrtashMehrtash DSDEMember, Broadie, Dev

    Hi Maely,

    Nd4j is the linear algebra backend that we use in GATK4 and it is already included in the jar file. You need to set its data type to double-precision floating point for Nd4j to behave properly. This is done by passing an additional JVM argument via gatk-launch's --javaOptions '-Ddtype=double'. For your case:

    ./gatk-launch GermlineCNVCaller \
    --javaOptions '-Ddtype=double' \
    --contigAnnotationsTable ../gatk4_Hellbender/grch37_contig_annotations.tsv \
    --copyNumberTransitionPriorTable ../gatk4_Hellbender/grch37_germline_CN_priors.tsv \
    --jobType LEARN_AND_CALL \
    --outputPath ./TS1 \
    --input ../gatk4_Hellbender/target_cov.tsv \
    --targets ../gatk4_Hellbender/targets.txt \
    --disableSpark true \
    --sexGenotypeTable ../gatk4_Hellbender/TS1_genotype \
    --rddCheckpointing false \
    --biasCovariateSolverType LOCAL
    

    We will be releasing WDL scripts for running GermlineCNVCaller with the official GATK4 beta release. The scripts will provide further example use cases of GermlineCNVCaller.

    Best,
    Mehrtash

  • shleeshlee CambridgeMember, Administrator, Broadie, Moderator admin

    Hi @maelygauthier,

    I think you only need to set the Nd4j data type based on the following message from the stacktrace:

    Nd4j data type must be set to double for coverage modeller routines to function properly. This can be done by setting JVM system property "dtype" to "double". Can not continue.

    I think you can set this with the java options, e.g.:

    gatk-launch --javaOptions "-Ddtype=double"
    

    Let me know if this works or not. I'll see if we can have this option set automatically for the tool going forward.

  • maelygauthiermaelygauthier AdelaideMember

    @shlee and @Mehrtash, thanks both for your speedy feedback, this fixed the issue. Thanks a lot.

  • JieChenJieChen shanghaiMember

    @Haiying7 said:

    @JieChen said:
    Hi @Haiying7,

    How did you solve the problem with "Exception in thread "main" java.lang.UnsatisfiedLinkError: ncsa.hdf.hdf5lib.H5.H5dont_atexit()I"?

    which one is this?
    Could you please send me the link?

    It was a version problem; it has already been solved.
    Thanks!
