Picard IlluminaBasecallsToSam: LIBRARY_PARAM file does not have column BARCODE (but it does)

Hi,

I am following the GATK pipeline. After running Picard CheckIlluminaDirectory and ExtractIlluminaBarcodes successfully, I run into a problem generating the BAM files with IlluminaBasecallsToSam.

  • Program versions:
    java version "1.8.0_77"
    picard-tools-2.2.1

  • My command:
    java -jar $HOME/bin/picard-tools-2.2.1/picard.jar IlluminaBasecallsToSam \
    B=/home/user31888/exome_run_1/Data/Intensities/BaseCalls/ \
    BARCODES_DIR=/home/user31888/exome_run_1/Results/ExtractIlluminaBarcodes/ \
    L=1 \
    RS=76T6B76T \
    LIBRARY_PARAMS=/home/user31888/exome_run_1/library_param.tab \
    RUN_BARCODE=exome_run_1

  • 'library_param.tab':
    BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
    CGATGT sample_1.bam sample_1 exome_run_1
    ACAGTG sample_2.bam sample_2 exome_run_1
    GCCAAT sample_3.bam sample_3 exome_run_1
    CAGATC sample_4.bam sample_4 exome_run_1
    CTTGTA sample_5.bam sample_5 exome_run_1
    GTGAAA sample_6.bam sample_6 exome_run_1

  • Output:
    picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/home/user31888/exome_run_1/Data/Intensities/BaseCalls/ BARCODES_DIR=/home/user31888/exome_run_1/Results/ExtractIlluminaBarcodes/ LANE=1 RUN_BARCODE=exome_run_1 READ_STRUCTURE=76T6B76T LIBRARY_PARAMS=/home/user31888/exome_run_1/library_param.tab SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
    Executing as user31888@n6 on Linux 2.6.32-504.12.2.el6.664g0000.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03; Picard version: 2.2.1(0256353d7b82ebcf56297abbc510da47a2ddfc0c_1459895860) IntelDeflater
    picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
    Runtime.totalMemory()=995098624
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/user31888/exome_run_1/library_param.tab does not have column BARCODE or BARCODE_1.
    at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
    at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
    at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

The program does not see the first column, BARCODE, in the 'library_param.tab' file (I get the same error message when I place the BARCODE column at the end of the file).

Is the format of 'library_param.tab' correct?
I tried changing the line endings of the file just in case, but the problem persists.
Could the tab character be faulty?
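
One way to answer the question above is to make the invisible characters in the header line visible. The following is a minimal Python sketch, not part of Picard; it assumes the tool looks for a column named exactly BARCODE or BARCODE_1 in a tab-delimited header, which is what the error message suggests. The function name `inspect_header` and the sample header strings are hypothetical.

```python
def inspect_header(header_line):
    """Report how a LIBRARY_PARAMS header line splits on tab characters."""
    line = header_line.rstrip("\r\n")
    columns = line.split("\t")  # split on tabs only, never on spaces
    return {
        # repr() makes \t, \r and plain spaces visible
        "raw": repr(header_line),
        "columns": columns,
        # Assumption: the column name must match exactly
        "has_barcode_column": any(c in ("BARCODE", "BARCODE_1") for c in columns),
        "uses_spaces": " " in line,
    }


# Hypothetical header lines: one tab-separated, one padded with spaces
tab_header = "BARCODE\tOUTPUT\tSAMPLE_ALIAS\tLIBRARY_NAME\n"
space_header = "BARCODE  OUTPUT  SAMPLE_ALIAS  LIBRARY_NAME\n"

for header in (tab_header, space_header):
    print(inspect_header(header))
```

For a real file, read the first line with `open(path, newline="").readline()` and pass it to `inspect_header`. If the whole header comes back as a single column, the separators are not tabs.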

Answers

  • fortunatobianconifortunatobianconi PerugiaMember

    I still have the same problem. This is my library file:
    BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
    ATGCCTAA sample_1.bam sample_1 160504_D00793_0015_BH3YY5ADXX
    GAATCTGA sample_2.bam sample_2 160504_D00793_0015_BH3YY5ADXX
    AACGTGAT sample_3.bam sample_3 160504_D00793_0015_BH3YY5ADXX
    CACTTCGA sample_4.bam sample_4 160504_D00793_0015_BH3YY5ADXX
    GCCAAGCA sample_5.bam sample_5 160504_D00793_0015_BH3YY5ADXX
    N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

    This is the error I get:
    [Thu May 26 19:31:29 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
    [Thu May 26 19:31:29 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
    Runtime.totalMemory()=504889344
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_2.tab does not have column BARCODE or BARCODE_1.
    at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
    at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
    at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

    Do you have any suggestions, please?
    Thanks

  • SheilaSheila Broad InstituteMember, Broadie, Moderator

    @fortunatobianconi
    Hi,

    Sorry for the delay. Can you please check that your values are all tab-separated and not space-separated?

    Thanks,
    Sheila

  • Hello, yes, I checked and the values are tab-separated. I also generated a new file using the command column -t file.
    The problem is still the same.
    Thanks

    BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
    ATGCCTAA sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX
    GAATCTGA sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX
    AACGTGAT sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX
    CACTTCGA sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX
    GCCAAGCA sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX
    N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX
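
A note on the `column -t` step mentioned above: as far as I know, the util-linux `column -t` command aligns columns by padding them with spaces by default, so its output is not tab-separated even though it looks tidy. The sketch below shows one way to turn whitespace-aligned text into tab-separated text. It assumes no field contains a space, and the function name `to_tab_separated` is hypothetical.

```python
def to_tab_separated(text):
    """Rewrite whitespace-aligned rows as tab-separated rows.

    Assumption: no individual field contains spaces, so splitting on
    runs of whitespace recovers the original fields.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()  # split on any run of spaces or tabs
        if fields:             # skip blank lines
            rows.append("\t".join(fields))
    return "\n".join(rows) + "\n"


# Hypothetical space-aligned input, similar to what `column -t` prints
aligned = (
    "BARCODE   OUTPUT           SAMPLE_ALIAS\n"
    "ATGCCTAA  sample_1.bam     sample_1\n"
    "N         non_indexed.bam  non_indexed\n"
)
print(repr(to_tab_separated(aligned)))
```

Writing the result back with `open(path, "w", newline="\n")` keeps Unix line endings.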

  • deklingdekling Broad InstituteMember

    Hello @fortunatobianconi:
    I think your LIBRARY_PARAMS file is incorrectly formatted, since IlluminaBasecallsToSam will not accept the file as is. Try to reformat the file according to the required specs: OUTPUT, SAMPLE_ALIAS, LIBRARY_NAME, BARCODE_1, BARCODE_2, etc., all tab-separated. Make sure you use the header 'BARCODE_1' and not 'BARCODE'. Let us know if this helps.

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: I also seem to recall some issues when people created files in MS Excel and then exported the results as a .txt file. If you do this, make sure you run the file through a plain-text editor, e.g. TextEdit, before passing it to the tool. This will remove Excel-specific tags.
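
The post above does not say which Excel-specific content is meant. Two common candidates in exported text files are a leading UTF-8 byte-order mark and Windows (CRLF) line endings, and the sketch below removes both. This is an assumption about the likely culprits, not a confirmed diagnosis, and `normalize_params_text` is a hypothetical helper name.

```python
def normalize_params_text(raw_bytes):
    """Decode file bytes, dropping a UTF-8 BOM and normalizing line endings."""
    # "utf-8-sig" strips a leading byte-order mark if one is present
    text = raw_bytes.decode("utf-8-sig")
    # Convert Windows (\r\n) and old Mac (\r) line endings to \n
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Hypothetical bytes as a spreadsheet export might produce them
exported = b"\xef\xbb\xbfBARCODE\tOUTPUT\r\nN\tnon_indexed.bam\r\n"
print(repr(normalize_params_text(exported)))
```

To apply it to a file, read the file in binary mode (`open(path, "rb").read()`), normalize, and write the text back with `newline="\n"`.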

  • OK, I checked again and the specs seem to be fine. I do not use Excel; I use the Linux command line (column -t file_name) to get tabs.
    Do you have an example file I could test?

  • deklingdekling Broad InstituteMember

    Unfortunately I do not. I was hoping you could send me your files so I can test it on my machine.

  • OK, I attached a .rar because .tab is not allowed.
    Thanks

    [Attachment: library_par_2.rar, 299B]
  • deklingdekling Broad InstituteMember

    @fortunatobianconi: I do not know if the formatting of the file was altered when I opened it with my text editor. In any case, none of the columns were tab-separated; they were separated by spaces. Try this file and let me know if it works.

    [Attachment: library_par_2.txt, 438B]
  • I used this file but it did not work.
    I attached screenshots of both the error and the file opened in a text editor (it shows the same format on both Linux and Windows).

    [Attachments: Capture2.PNG (1693 x 184, 24K); Capture.PNG (998 x 216, 17K)]
  • deklingdekling Broad InstituteMember

    Hi @fortunatobianconi:
    Can you write out your complete command line, program versions, output, etc., as the user at the top of the page did? This would help us diagnose the problem. Thanks in advance.

  • Program versions:
    java version "1.8.0_92"
    picard-tools-2.2.4

    My command:
    java -jar /home/userOnco/picard-tools-2.2.4/picard.jar IlluminaBasecallsToSam BASECALLS_DIR=/data/user1i/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls/ LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T7B76T NUM_PROCESSORS=10 FORCE_GC=false LIBRARY_PARAMS=/home/userOnco/library_par_2.txt

    'library_par_2.txt':
    OUTPUT SAMPLE_ALIAS LIBRARY_NAME BARCODE_1
    sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX ATGCCTAA
    sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX GAATCTGA
    sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX AACGTGAT
    sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX CACTTCGA
    sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX GCCAAGCA
    non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX N

    Output:
    [Fri Jun 10 15:52:15 CEST 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/data/user1/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T7B76T LIBRARY_PARAMS=/home/userOnco/library_par_2.txt NUM_PROCESSORS=10 FORCE_GC=false SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
    [Fri Jun 10 15:52:15 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
    [Fri Jun 10 15:52:15 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
    Runtime.totalMemory()=504889344
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_2.txt does not have column BARCODE or BARCODE_1.
    at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
    at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
    at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: OK. I think I got it. Your readstructure is 76T7B76T which means that you have 7 bases in your barcode. However your barcodes have 8 bases. Change the read structure to 76T8B76T. Let me know if that helps.
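
The check described above (barcode length versus the B segment of the read structure) can be sketched in Python. This is an illustration only, not Picard's own parser; the function names are hypothetical, and the single "N" entry is treated as the non-indexed placeholder used in the library files in this thread.

```python
import re


def parse_read_structure(read_structure):
    """Split a read structure such as '76T8B76T' into (length, type) pairs."""
    segments = re.findall(r"(\d+)([A-Z])", read_structure)
    # Reject strings with anything left over after the <number><letter> pairs
    if "".join(n + t for n, t in segments) != read_structure:
        raise ValueError("Unrecognized read structure: %r" % read_structure)
    return [(int(n), t) for n, t in segments]


def barcode_length(read_structure):
    """Total number of cycles assigned to barcode (B) segments."""
    return sum(n for n, t in parse_read_structure(read_structure) if t == "B")


def mismatched_barcodes(read_structure, barcodes):
    """Return barcodes whose length differs from the B segment length."""
    expected = barcode_length(read_structure)
    return [b for b in barcodes if b != "N" and len(b) != expected]


barcodes = ["ATGCCTAA", "GAATCTGA", "AACGTGAT", "CACTTCGA", "GCCAAGCA", "N"]
print(mismatched_barcodes("76T7B76T", barcodes))  # all five 8-base barcodes
print(mismatched_barcodes("76T8B76T", barcodes))  # empty list
```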

  • Thanks, I tried that, but I still have the same problem:

    java -jar /home/userOnco/picard-tools-2.2.4/picard.jar IlluminaBasecallsToSam BASECALLS_DIR=/data/user1/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls/ LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T8B76T NUM_PROCESSORS=10 FORCE_GC=false LIBRARY_PARAMS=/home/userOnco/library_par_4.tab
    [Fri Jun 10 23:39:38 CEST 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/data/Tiacci/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T8B76T LIBRARY_PARAMS=/home/userOnco/library_par_4.tab NUM_PROCESSORS=10 FORCE_GC=false SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
    [Fri Jun 10 23:39:38 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
    [Fri Jun 10 23:39:38 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
    Runtime.totalMemory()=504889344
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_4.tab does not have column BARCODE or BARCODE_1.
    at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
    at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
    at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

  • library_par_4.tab
    BARCODE_1 OUTPUT SAMPLE_ALIAS LIBRARY_NAME
    ATGCCTAA sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX
    GAATCTGA sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX
    AACGTGAT sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX
    CACTTCGA sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX
    GCCAAGCA sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX
    N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

  • deklingdekling Broad InstituteMember

    Hi @fortunatobianconi:
    I managed to get the program to work. Here is my command line:
    $ java -jar picard.jar IlluminaBasecallsToSam \
    BASECALLS_DIR=/Users/basecallDirectory/Intensities/BaseCalls \
    LANE=1 \
    RUN_BARCODE=AACTTGAC \
    READ_STRUCTURE=25T8B25T \
    LIBRARY_PARAMS=/Users/barcodes/library.params \
    IGNORE_UNEXPECTED_BARCODES=true

    I am attaching the contents of the library.params file that I used. I don't know why it is not working for you but it might be the last line in the command set.
    OUTPUT SAMPLE_ALIAS LIBRARY_NAME BARCODE
    SA_AAAAAAAA.bam SA_AAAAAAAA LN_AAAAAAAA AAAAAAAA
    SA_AAAAGAAG.bam SA_AAAAGAAG LN_AAAAGAAG AAAAGAAG
    SA_AACAATGG.bam SA_AACAATGG LN_AACAATGG AACAATGG
    SA_AACGCATT.bam SA_AACGCATT LN_AACGCATT AACGCATT
    SA_ACAAAATT.bam SA_ACAAAATT LN_ACAAAATT ACAAAATT
    SA_ACAGGTAT.bam SA_ACAGGTAT LN_ACAGGTAT ACAGGTAT
    SA_ACAGTTGA.bam SA_ACAGTTGA LN_ACAGTTGA ACAGTTGA
    SA_ACCAGTTG.bam SA_ACCAGTTG LN_ACCAGTTG ACCAGTTG
    SA_ACGAAATC.bam SA_ACGAAATC LN_ACGAAATC ACGAAATC
    SA_ACTAAGAC.bam SA_ACTAAGAC LN_ACTAAGAC ACTAAGAC
    SA_ACTGTACC.bam SA_ACTGTACC LN_ACTGTACC ACTGTACC
    SA_ACTGTATC.bam SA_ACTGTATC LN_ACTGTATC ACTGTATC
    SA_AGAAAAGA.bam SA_AGAAAAGA LN_AGAAAAGA AGAAAAGA
    SA_AGCATGGA.bam SA_AGCATGGA LN_AGCATGGA AGCATGGA
    SA_AGGTAAGG.bam SA_AGGTAAGG LN_AGGTAAGG AGGTAAGG
    SA_AGGTCGCA.bam SA_AGGTCGCA LN_AGGTCGCA AGGTCGCA
    SA_ATTATCAA.bam SA_ATTATCAA LN_ATTATCAA ATTATCAA
    SA_ATTCCTCT.bam SA_ATTCCTCT LN_ATTCCTCT ATTCCTCT
    SA_non_indexed.bam SA_non_indexed LN_NNNNNNNN N

    Let me know if this works for you or not.

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: Have you managed to get the tool to work?

  • I did not solve the problem, even using your command line. It could be a JVM version problem. Now I'm thinking of generating FASTQ files and then SAM from the FASTQ. IlluminaBasecallsToFastq apparently works. I still can't accept that IlluminaBasecallsToSam is not working. All suggestions are welcome.

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: The format of your library.params file seems to be fine. I used it successfully on my machine. Are you sure about your read structure? If it is not too much trouble, you can send me the BaseCall file and I can try it here. Otherwise, let me know if there is anything else I can do. BTW, your JVM is 1.8.0_92-b14, which should be fine.

  • fortunatobianconifortunatobianconi PerugiaMember
    edited June 2016

    These are the run parameters:
    <Reads>
    <Read Number="1" NumCycles="76" IsIndexedRead="N" />
    <Read Number="2" NumCycles="7" IsIndexedRead="Y" />
    <Read Number="3" NumCycles="76" IsIndexedRead="N" />
    </Reads>
    Do you have an FTP site where I can upload the basecalls data?

  • I also ran CheckIlluminaDirectory:
    [Mon Jun 20 18:54:11 CEST 2016] picard.illumina.CheckIlluminaDirectory BASECALLS_DIR=/data//user1/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls READ_STRUCTURE=76T7B76T LANES=[2] FAKE_FILES=false LINK_LOCS=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
    [Mon Jun 20 18:54:11 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
    INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Checking lanes(2 in basecalls directory (/data/Tiacci/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls)

    INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Expected cycles: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159
    INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Checking lane 2
    INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Expected tiles: 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 2101, 2102, 2103, 2104, 2105, 2106, 2107, 2108, 2109, 2110, 2111, 2112, 2113, 2114, 2115, 2116, 2201, 2202, 2203, 2204, 2205, 2206, 2207, 2208, 2209, 2210, 2211, 2212, 2213, 2214, 2215, 2216
    INFO 2016-06-20 18:54:12 CheckIlluminaDirectory Lane 2 SUCCEEDED
    INFO 2016-06-20 18:54:12 CheckIlluminaDirectory SUCCEEDED! All required files are present and non-empty

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: Can you please run ExtractIlluminaBarcodes? There seems to be a problem with the read structure.

  • deklingdekling Broad InstituteMember

    Since your barcode is 8 bases, you might try adjusting your read structure to 75T8B76T or 76T8B75T.

  • fortunatobianconifortunatobianconi PerugiaMember
    edited June 2016

    This is the command line and output of ExtractIlluminaBarcodes.
    We have a barcode length of 8, but the run parameters were set to 76, 7, 76.

    htsjdk.samtools.metrics.StringHeader
    picard.illumina.ExtractIlluminaBarcodes BASECALLS_DIR=BaseCalls LANE=2 READ_STRUCTURE=76T7B76T BARCODE=[ATGCCTA$
    htsjdk.samtools.metrics.StringHeader
    Started on: Tue Jun 21 15:14:20 CEST 2016

    METRICS CLASS picard.illumina.ExtractIlluminaBarcodes$BarcodeMetric
    BARCODE BARCODE_NAME LIBRARY_NAME READS PF_READS PERFECT_MATCHES PF_PERFECT_MATCHES ONE_MISMATCH_MATCHES PF_ONE_MISMATCH_MATCHES PCT_MATCHES RATIO_THIS_BARCODE_T$
    ATGCCTAA 12958564 12723883 12845842 12627716 112722 96167 0.12588 0.296903 0.126415 0.296823 0.649996
    GAATCTGA 16051217 15756501 15892890 15620810 158327 135691 0.155922 0.367761 0.156545 0.367567 0.804916
    AACGTGAT 43645769 42866973 43371041 42649692 274728 217281 0.423977 1 0.425896 1 2.189846
    CACTTCGA 14409680 14143684 14271779 14021304 137901 122380 0.139976 0.330151 0.140522 0.329944 0.722526
    GCCAAGCA 12606522 12385646 4842 4664 12601680 12380982 0.12246 0.288837 0.123055 0.288932 0.632717
    NNNNNNN 3272050 2774658 0 0 0 0 0.031785 0.074968 0.027567 0.064727 0

  • deklingdekling Broad InstituteMember

    Hello @fortunatobianconi: Were you able to successfully use IlluminaBasecallsToFastq and convert the FASTQ to SAM? If you are still having issues, you can submit the entire run folder so we can try it here. This link describes the protocol for uploading your folder. https://www.broadinstitute.org/gatk/guide/article?id=1894

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: One other thing, I managed to get the tool to work with your library.params file as well. Here it is:

    BARCODE_1 OUTPUT SAMPLE_ALIAS LIBRARY_NAME
    ATGCCTAA sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX
    GAATCTGA sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX
    AACGTGAT sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX
    CACTTCGA sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX
    GCCAAGCA sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX
    N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

  • Thanks for the suggestion. IlluminaBasecallsToFastq worked when I did not use the MULTIPLEX_PARAMS parameter. So there is probably a problem in the tab file I'm generating from the command line in Linux. Could you please send me the library.params file as an attachment?

  • deklingdekling Broad InstituteMember

    @fortunatobianconi: Happy to do so. However, you can simply copy the data above and paste into a text editor.

    [Attachment: test_1_params.txt, 437B]
  • Yes, I did copy and paste from your post, but I had the same problem. Using your file works. I was using nano on the command line. I want to be sure that the problem is nano. Thanks for the file.
    I have a question. My barcode is 8 bases, but the run was performed with the index read set to 7 cycles. When I extract the barcodes, is it OK to use 76T7B75T as the read structure, or do I need to keep the read structure at 76T7B76T, as in the run parameters (changing the library file as well, of course)?

  • deklingdekling Broad InstituteMember

    In my experience, your read structure cannot exceed the total number of cycles in the run, e.g. if the run's read structure is 76T7B76T, the total number of cycles is 159. The program will crash if your read structure encodes 160 cycles or more. However, you can use a read structure whose total is less than 159 without causing a failure.
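
The arithmetic in the post above can be written out as a small Python sketch. It only restates the rule as described in this thread (the read structure must not cover more cycles than the run produced); it is not taken from Picard's source, and the function names are hypothetical.

```python
import re


def total_cycles(read_structure):
    """Sum the segment lengths of a read structure such as '76T7B76T'."""
    return sum(int(n) for n in re.findall(r"(\d+)[A-Z]", read_structure))


def fits_run(read_structure, run_cycles):
    """True if the read structure needs no more cycles than the run has."""
    return total_cycles(read_structure) <= run_cycles


run_cycles = total_cycles("76T7B76T")  # cycles configured for this run
for candidate in ("76T8B76T", "75T8B76T", "76T8B75T"):
    print(candidate, total_cycles(candidate), fits_run(candidate, run_cycles))
```

Under this rule, 76T8B76T asks for one cycle more than the run contains, while 75T8B76T and 76T8B75T fit exactly.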

  • OK, thanks. When you produced your tab file, were you using Windows or Linux? What program were you using? Thanks again.

  • deklingdekling Broad InstituteMember

    Neither. Used a Mac.

  • deklingdekling Broad InstituteMember

    To be more specific, I just simply created a text file using TextEdit.app on a Mac.

  • SumuduSumudu Sri LankaMember

    Hi,

    I ran CheckIlluminaDirectory and it was successful, but IlluminaBasecallsToSam didn't work for me. Is this step really necessary, or would I get a fair result (even though it is not the best practice described in the GATK pipeline) if I proceed with the BAM file I got from running BWA-MEM?

    Thanks
    Sumudu

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie

    @Sumudu The main reason why we use the workflow you're referring to, with an unmapped BAM intermediate, is because it offers more robust data management opportunities. The output is qualitatively the same as starting from a FASTQ (which I assume is what you provided to BWA), except that the Picard tools that deal with unmapped BAMs have additional built-in cleanup options that can eliminate some sources of invalid records.

  • SumuduSumudu Sri LankaMember

    Thank you for your answer. I just want to clarify whether running CleanSam in Picard would do the same cleaning as running IlluminaBasecallsToSam and then MergeBamAlignment.

    I'm confused by the documentation, which says "CleanSam Cleans the provided SAM/BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads", whereas "MergeBamAlignment Merges alignment data from a SAM or BAM with data in an unmapped BAM file. The purpose of this tool is to use information from the unmapped BAM to fix up aligner output".

    Are these two supposed to do the same cleaning?

    Thanks
    Regards
    Sumudu

  • SumuduSumudu Sri LankaMember

    Hi,

    I managed to get my library_params.txt to work, and IlluminaBasecallsToSam is now running, though it seems slow. I have 8 samples and my BaseCalls directory has 16 FASTQ files, R1 (forward) and R2 (reverse) for each sample. Does IlluminaBasecallsToSam take this into account? Does it create a single uBAM per sample from the R1 and R2 FASTQ files? Please clarify.

    Thank you

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie

    Hi @Sumudu, sorry for the late response. Here is the answer from @Sheila:

    CleanSam is not doing the same cleaning as MergeBamAlignment. CleanSam does the exact two things stated in the documentation. IlluminaBaseCallsToSam converts BCL data (not FASTQ) to an unmapped BAM/SAM file. You need to run BWA (or aligner of your choice) on the unmapped data; we do this by running SamToFastq and piping the output to BWA because BWA only accepts FASTQ input. Then, MergeBamAlignment will add the information from the unmapped reads to the mapped BAM file. Have a look at this article for more information on MergeBamAlignment.

    I believe the tool will produce a single uBAM per library containing both forward and reverse reads.
