# Picard IlluminaBasecallsToSam: LIBRARY_PARAM file does not have column BARCODE (but it does)

Member

Hi,

I am following the GATK pipeline. After running Picard CheckIlluminaDirectory and ExtractIlluminaBarcodes successfully, I run into a problem when generating the BAM files with IlluminaBasecallsToSam.

• Program versions:
java version "1.8.0_77"
picard-tools-2.2.1

• My command:
java -jar $HOME/bin/picard-tools-2.2.1/picard.jar IlluminaBasecallsToSam \
B=/home/user31888/exome_run_1/Data/Intensities/BaseCalls/ \
BARCODES_DIR=/home/user31888/exome_run_1/Results/ExtractIlluminaBarcodes/ \
L=1 \
RS=76T6B76T \
LIBRARY_PARAMS=/home/user31888/exome_run_1/library_param.tab \
RUN_BARCODE=exome_run_1

• 'library_param.tab':
BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
CGATGT sample_1.bam sample_1 exome_run_1
ACAGTG sample_2.bam sample_2 exome_run_1
GCCAAT sample_3.bam sample_3 exome_run_1
CAGATC sample_4.bam sample_4 exome_run_1
CTTGTA sample_5.bam sample_5 exome_run_1
GTGAAA sample_6.bam sample_6 exome_run_1

• Output:
picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/home/user31888/exome_run_1/Data/Intensities/BaseCalls/ BARCODES_DIR=/home/user31888/exome_run_1/Results/ExtractIlluminaBarcodes/ LANE=1 RUN_BARCODE=exome_run_1 READ_STRUCTURE=76T6B76T LIBRARY_PARAMS=/home/user31888/exome_run_1/library_param.tab SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
Executing as user31888@n6 on Linux 2.6.32-504.12.2.el6.664g0000.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03; Picard version: 2.2.1(0256353d7b82ebcf56297abbc510da47a2ddfc0c_1459895860) IntelDeflater
picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=995098624
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/user31888/exome_run_1/library_param.tab does not have column BARCODE or BARCODE_1.
at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

The program does not see the first column, BARCODE, in the 'library_param.tab' file (I get the same error message when I place the BARCODE column at the end of the file).

Is the format of 'library_param.tab' correct? I tried changing the line endings of the file just in case, but the problem remains. Is it possible that the tab characters are faulty?

#### Issue · Github April 2016 by Sheila

Issue Number
802
State
closed
Closed By
chandrans

## Answers

• PerugiaMember

I still have the same problem.
This is my library file:
BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
ATGCCTAA sample_1.bam sample_1 160504_D00793_0015_BH3YY5ADXX
GAATCTGA sample_2.bam sample_2 160504_D00793_0015_BH3YY5ADXX
AACGTGAT sample_3.bam sample_3 160504_D00793_0015_BH3YY5ADXX
CACTTCGA sample_4.bam sample_4 160504_D00793_0015_BH3YY5ADXX
GCCAAGCA sample_5.bam sample_5 160504_D00793_0015_BH3YY5ADXX
N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

This is the error I get:
[Thu May 26 19:31:29 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
[Thu May 26 19:31:29 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_2.tab does not have column BARCODE or BARCODE_1.
at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

Do you have any suggestions, please?
Thanks

• Broad InstituteMember, Broadie, Moderator

Sorry for the delay. Can you please check that your values are all tab-separated and not space-separated?

Thanks,
Sheila

• PerugiaMember

Hello, yes, I checked and the values are tab-separated. I also generated a new file using the command column -t file. The problem is still the same.
Thanks

BARCODE OUTPUT SAMPLE_ALIAS LIBRARY_NAME
ATGCCTAA sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX
GAATCTGA sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX
AACGTGAT sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX
CACTTCGA sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX
GCCAAGCA sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX
N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

#### Issue · Github June 2016 by Sheila

Issue Number
971
State
closed
Closed By
dekling

• Broad InstituteMember

Hello @fortunatobianconi: I think your LIBRARY_PARAMS file is incorrectly formatted, since IlluminaBasecallsToSam will not accept the file as is. Try to reformat the file according to the required specs: OUTPUT, SAMPLE_ALIAS, LIBRARY_NAME, BARCODE_1, BARCODE_2, etc., all tab-separated. Make sure you use the header 'BARCODE_1' and not 'BARCODE'. Let us know if this helps.

• Broad InstituteMember

@fortunatobianconi: I also seem to recall some issues when people created files in MS Excel and then saved the results as a .txt file. If you do this, make sure you run the file through a text editor, e.g. TextEdit, before giving it to the tool. This will remove Excel-specific tags.

• PerugiaMember

OK, I checked again and the specs seem to be fine. I do not use Excel; I use the Linux command line, column -t file_name, to get tabs. Do you have an example to test?

• Broad InstituteMember

Unfortunately I do not. I was hoping you could send me your files so I can test them on my machine.

• PerugiaMember

OK, I attached a .rar because .tab is not allowed. Thanks

• Broad InstituteMember

@fortunatobianconi: I do not know if the formatting of the file was altered when I opened it with my text editor. In any case, none of the columns were tab-separated; they were separated by spaces. Try this and let me know if it works.

• PerugiaMember

I used this file but it did not work.
I attached screenshots of both the error and the file opened in a text editor (both the Linux and Windows OS show the same format).

• Broad InstituteMember

Hi @fortunatobianconi: Can you write out your complete set of command lines, program version, output, etc., as the user at the top of the page did? This would help us in diagnosing the problem. Thanks in advance.

• PerugiaMember

Program versions:
java version "1.8.0_92"
picard-tools-2.2.4

My command:
java -jar /home/userOnco/picard-tools-2.2.4/picard.jar IlluminaBasecallsToSam BASECALLS_DIR=/data/user1i/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls/ LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T7B76T NUM_PROCESSORS=10 FORCE_GC=false LIBRARY_PARAMS=/home/userOnco/library_par_2.txt

'library_par_2.txt':
OUTPUT SAMPLE_ALIAS LIBRARY_NAME BARCODE_1
sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX ATGCCTAA
sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX GAATCTGA
sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX AACGTGAT
sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX CACTTCGA
sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX GCCAAGCA
non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX N

Output:
[Fri Jun 10 15:52:15 CEST 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/data/user1/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T7B76T LIBRARY_PARAMS=/home/userOnco/library_par_2.txt NUM_PROCESSORS=10 FORCE_GC=false SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Fri Jun 10 15:52:15 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
[Fri Jun 10 15:52:15 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_2.txt does not have column BARCODE or BARCODE_1.
at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

• Broad InstituteMember

@fortunatobianconi: OK. I think I got it. Your read structure is 76T7B76T, which means that you have 7 bases in your barcode. However, your barcodes have 8 bases. Change the read structure to 76T8B76T. Let me know if that helps.
• PerugiaMember

Thanks, I tried but still the same problem

java -jar /home/userOnco/picard-tools-2.2.4/picard.jar IlluminaBasecallsToSam BASECALLS_DIR=/data/user1/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls/ LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T8B76T NUM_PROCESSORS=10 FORCE_GC=false LIBRARY_PARAMS=/home/userOnco/library_par_4.tab

[Fri Jun 10 23:39:38 CEST 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=/data/Tiacci/run_05_05_2016/160504_D00793_0015_BH3YY5ADXX/Data/Intensities/BaseCalls LANE=2 RUN_BARCODE=160504_D00793_0015_BH3YY5ADXX READ_STRUCTURE=76T8B76T LIBRARY_PARAMS=/home/userOnco/library_par_4.tab NUM_PROCESSORS=10 FORCE_GC=false SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Fri Jun 10 23:39:38 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater
[Fri Jun 10 23:39:38 CEST 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: LIBRARY_PARAMS(BARCODE_PARAMS) file /home/userOnco/library_par_4.tab does not have column BARCODE or BARCODE_1.
at picard.illumina.IlluminaBasecallsToSam.populateWritersFromLibraryParams(IlluminaBasecallsToSam.java:337)
at picard.illumina.IlluminaBasecallsToSam.initialize(IlluminaBasecallsToSam.java:251)
at picard.illumina.IlluminaBasecallsToSam.doWork(IlluminaBasecallsToSam.java:229)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

• PerugiaMember

library_par_4.tab
BARCODE_1 OUTPUT SAMPLE_ALIAS LIBRARY_NAME
ATGCCTAA sample_DIAGNOSIM6.bam sample_1 160504_D00793_0015_BH3YY5ADXX
GAATCTGA sample_CRM6.bam sample_2 160504_D00793_0015_BH3YY5ADXX
AACGTGAT sample_RELAPSEM6.bam sample_3 160504_D00793_0015_BH3YY5ADXX
CACTTCGA sample_NCABRA.bam sample_4 160504_D00793_0015_BH3YY5ADXX
GCCAAGCA sample_SAKBRA.bam sample_5 160504_D00793_0015_BH3YY5ADXX
N non_indexed.bam non_indexed 160504_D00793_0015_BH3YY5ADXX

• Broad InstituteMember

Hi @fortunatobianconi: I managed to get the program to work. Here is my command line:
$ java -jar picard.jar IlluminaBasecallsToSam \
BASECALLS_DIR=/Users/basecallDirectory/Intensities/BaseCalls \
LANE=1 \
RUN_BARCODE=AACTTGAC \
LIBRARY_PARAMS=/Users/barcodes/library.params \
IGNORE_UNEXPECTED_BARCODES=true

I am attaching the contents of the library.params file that I used. I don't know why it is not working for you but it might be the last line in the command set.
OUTPUT SAMPLE_ALIAS LIBRARY_NAME BARCODE
SA_AAAAAAAA.bam SA_AAAAAAAA LN_AAAAAAAA AAAAAAAA
SA_AAAAGAAG.bam SA_AAAAGAAG LN_AAAAGAAG AAAAGAAG
SA_AACAATGG.bam SA_AACAATGG LN_AACAATGG AACAATGG
SA_AACGCATT.bam SA_AACGCATT LN_AACGCATT AACGCATT
SA_ACAAAATT.bam SA_ACAAAATT LN_ACAAAATT ACAAAATT
SA_ACAGGTAT.bam SA_ACAGGTAT LN_ACAGGTAT ACAGGTAT
SA_ACAGTTGA.bam SA_ACAGTTGA LN_ACAGTTGA ACAGTTGA
SA_ACCAGTTG.bam SA_ACCAGTTG LN_ACCAGTTG ACCAGTTG
SA_ACGAAATC.bam SA_ACGAAATC LN_ACGAAATC ACGAAATC
SA_ACTAAGAC.bam SA_ACTAAGAC LN_ACTAAGAC ACTAAGAC
SA_ACTGTACC.bam SA_ACTGTACC LN_ACTGTACC ACTGTACC
SA_ACTGTATC.bam SA_ACTGTATC LN_ACTGTATC ACTGTATC
SA_AGAAAAGA.bam SA_AGAAAAGA LN_AGAAAAGA AGAAAAGA
SA_AGCATGGA.bam SA_AGCATGGA LN_AGCATGGA AGCATGGA
SA_AGGTAAGG.bam SA_AGGTAAGG LN_AGGTAAGG AGGTAAGG
SA_AGGTCGCA.bam SA_AGGTCGCA LN_AGGTCGCA AGGTCGCA
SA_ATTATCAA.bam SA_ATTATCAA LN_ATTATCAA ATTATCAA
SA_ATTCCTCT.bam SA_ATTCCTCT LN_ATTCCTCT ATTCCTCT
SA_non_indexed.bam SA_non_indexed LN_NNNNNNNN N

Let me know if this works for you or not.
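
One way to rule out delimiter problems in a LIBRARY_PARAMS file is to inspect its header line directly. The sketch below is an illustration only and does not reproduce Picard's own parser; the function name `header_problems` and its messages are made up for this example. It assumes that column names must be separated by single tab characters, as requested earlier in this thread. Note that `column -t` pads columns with spaces by default, so its output looks aligned but contains no tabs.

```python
# Hypothetical helper, not part of Picard: sanity-check a LIBRARY_PARAMS header.
BARCODE_COLUMNS = {"BARCODE", "BARCODE_1"}  # names taken from the error message


def header_problems(header_line):
    """Return a list of human-readable problems found in the header line."""
    problems = []
    line = header_line.rstrip("\r\n")
    if line.startswith("\ufeff"):
        # Some editors prepend an invisible byte-order mark to the file.
        problems.append("byte-order mark before the first column name")
        line = line[1:]
    columns = line.split("\t")
    if len(columns) == 1 and len(line.split()) > 1:
        problems.append("columns are separated by spaces, not tabs")
    if any(name != name.strip() for name in columns):
        problems.append("stray whitespace around a column name")
    if not BARCODE_COLUMNS & set(columns):
        problems.append("no BARCODE or BARCODE_1 column found")
    return problems


if __name__ == "__main__":
    tabbed = "BARCODE_1\tOUTPUT\tSAMPLE_ALIAS\tLIBRARY_NAME\n"
    spaced = "BARCODE_1  OUTPUT  SAMPLE_ALIAS  LIBRARY_NAME\n"
    print(header_problems(tabbed))  # no problems expected for a tabbed header
    print(header_problems(spaced))  # space-padded header, as from column -t
```

To check a real file, pass its first line to the function, for example `header_problems(open(path, encoding="utf-8").readline())`, where `path` is whatever file you give to LIBRARY_PARAMS.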

@fortunatobianconi: Have you managed to get the tool to work?

• PerugiaMember

I did not solve the problem, even using your command line. It could be a JVM version problem. Now I'm thinking of generating FASTQ files and then SAM files from the FASTQ. IlluminaBasecallsToFastq is apparently working. I still can't accept that IlluminaBasecallsToSam is not working. All suggestions are welcome.

@fortunatobianconi: The format of your library.params file seems to be fine. I used it successfully on my machine. Are you sure about your read structure? If it is not too much trouble, you can send me the BaseCall file and I can try it here. Otherwise, let me know if there is anything else I can do. BTW, your JVM is 1.8.0_92-b14, which should be fine.

• PerugiaMember
edited June 2016

Those are the run parameters:
Do you have an FTP site where I can upload the basecalls data?

• PerugiaMember

I also ran CheckIlluminaDirectory:
[Mon Jun 20 18:54:11 CEST 2016] Executing as root@fe on Linux 2.6.32-573.7.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_92-b14; Picard version: 2.2.4(920e3247c340720b009f2398c1b93cce132c9bed_1461793281) IntelDeflater

INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Expected cycles: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159
INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Checking lane 2
INFO 2016-06-20 18:54:11 CheckIlluminaDirectory Expected tiles: 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 2101, 2102, 2103, 2104, 2105, 2106, 2107, 2108, 2109, 2110, 2111, 2112, 2113, 2114, 2115, 2116, 2201, 2202, 2203, 2204, 2205, 2206, 2207, 2208, 2209, 2210, 2211, 2212, 2213, 2214, 2215, 2216
INFO 2016-06-20 18:54:12 CheckIlluminaDirectory Lane 2 SUCCEEDED
INFO 2016-06-20 18:54:12 CheckIlluminaDirectory SUCCEEDED! All required files are present and non-empty

@fortunatobianconi: Can you please run ExtractIlluminaBarcodes. There seems to be a problem with the read structure.

• PerugiaMember
edited June 2016

This is the ExtractIlluminaBarcodes command line.
We have a barcode length of 8, but the run parameters were set to 76, 7, 76:

picard.illumina.ExtractIlluminaBarcodes BASECALLS_DIR=BaseCalls LANE=2 READ_STRUCTURE=76T7B76T BARCODE=[ATGCCTA$htsjdk.samtools.metrics.StringHeader Started on: Tue Jun 21 15:14:20 CEST 2016 METRICS CLASS picard.illumina.ExtractIlluminaBarcodes$BarcodeMetric
ATGCCTAA 12958564 12723883 12845842 12627716 112722 96167 0.12588 0.296903 0.126415 0.296823 0.649996
GAATCTGA 16051217 15756501 15892890 15620810 158327 135691 0.155922 0.367761 0.156545 0.367567 0.804916
AACGTGAT 43645769 42866973 43371041 42649692 274728 217281 0.423977 1 0.425896 1 2.189846
CACTTCGA 14409680 14143684 14271779 14021304 137901 122380 0.139976 0.330151 0.140522 0.329944 0.722526
GCCAAGCA 12606522 12385646 4842 4664 12601680 12380982 0.12246 0.288837 0.123055 0.288932 0.632717
NNNNNNN 3272050 2774658 0 0 0 0 0.031785 0.074968 0.027567 0.064727 0

Hello @fortunatobianconi: Were you able to successfully use IlluminaBasecallsToFastq and convert the FASTQ to SAM? If you are still having issues, you can submit the entire run folder so we can try it here. This link describes the protocol for uploading your folder. https://www.broadinstitute.org/gatk/guide/article?id=1894

@fortunatobianconi: One other thing, I managed to get the tool to work with your library.params file as well. Here it is:

BARCODE_1 OUTPUT SAMPLE_ALIAS LIBRARY_NAME

• PerugiaMember

Thanks for the suggestion. IlluminaBasecallsToFastq worked when I did not use the MULTIPLEX_PARAMS parameter, so there is probably a problem in the tab file I'm generating from the Linux command line. Could you please send me the library.params file as an attachment?

@fortunatobianconi: Happy to do so. However, you can simply copy the data above and paste into a text editor.

• PerugiaMember

Yes, I did copy and paste from your post, but I had the same problem. Your file works. I was using nano on the command line, and I want to be sure that nano is the problem. Thanks for the file.
I have a question. My barcode is 8 bases long (8B), but the run was performed with the barcode read set to 7 cycles (7B). When I extract the barcodes, is it OK to use 76T7B75T as the read structure, or do I need to use 76T7B76T, as in the run parameters (changing the library file accordingly, of course)?

In my experience, your read structure cannot describe more cycles than the run contains. For example, the read structure 76T7B76T covers 76 + 7 + 76 = 159 cycles. The program will crash if your read structure encodes 160 cycles or more. However, you can use a read structure that covers fewer than 159 cycles without causing a failure.
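
The cycle arithmetic behind this advice can be checked with a few lines of Python. This is a sketch with hypothetical helper names, not Picard code; it assumes a read structure is a sequence of number-plus-letter segments (T for template, B for barcode), as in the examples in this thread. The CheckIlluminaDirectory log above lists 159 expected cycles, so 76T7B76T fits the run exactly, while 76T8B76T would need 160.

```python
import re


def parse_read_structure(read_structure):
    """Split a read structure such as '76T7B76T' into (length, operator) pairs."""
    segments = re.findall(r"(\d+)([A-Z])", read_structure)
    rebuilt = "".join(length + op for length, op in segments)
    if not segments or rebuilt != read_structure:
        raise ValueError(f"cannot parse read structure: {read_structure!r}")
    return [(int(length), op) for length, op in segments]


def total_cycles(read_structure):
    """Number of sequencing cycles the read structure accounts for."""
    return sum(length for length, _ in parse_read_structure(read_structure))


def barcode_lengths(read_structure):
    """Lengths of the barcode (B) segments, in order of appearance."""
    return [length for length, op in parse_read_structure(read_structure) if op == "B"]


if __name__ == "__main__":
    run_cycles = 159  # taken from the CheckIlluminaDirectory log in this thread
    for rs in ("76T7B76T", "76T8B76T", "76T7B75T"):
        fits = total_cycles(rs) <= run_cycles
        print(rs, total_cycles(rs), barcode_lengths(rs), "fits" if fits else "too long")
```

The barcode lengths returned by `barcode_lengths` can then be compared with the length of the sequences in the BARCODE or BARCODE_1 column of the library file.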

• PerugiaMember

OK, thanks. When you produced your tab file, were you using Windows or Linux? What program were you using? Thanks again

Neither. Used a Mac.

To be more specific, I just simply created a text file using TextEdit.app on a Mac.

• Sri LankaMember

Hi,

I ran CheckIlluminaDirectory and it was successful, but IlluminaBasecallsToSam didn't work for me. Is this step really necessary, or would I get a fair result (even though it is not the best practice described in the GATK pipeline) if I proceed with the BAM file I obtained by running BWA-MEM?

Thanks
Sumudu

#### Issue · Github October 2016 by Sheila

Issue Number
1382
State
closed
Closed By
chandrans

@Sumudu The main reason why we use the workflow you're referring to, with an unmapped BAM intermediate, is because it offers more robust data management opportunities. The output is qualitatively the same as starting from a FASTQ (which I assume is what you provided to BWA), except that the Picard tools that deal with unmapped BAMs have additional built-in cleanup options that can eliminate some sources of invalid records.

• Sri LankaMember

Thank you for your answer. I just want to clarify whether running CleanSam in Picard would do the same cleaning as running IlluminaBasecallsToSam and then MergeBamAlignment.

I'm confused by the documentation, which says "CleanSam Cleans the provided SAM/BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads", whereas "MergeBamAlignment Merges alignment data from a SAM or BAM with data in an unmapped BAM file. The purpose of this tool is to use information from the unmapped BAM to fix up aligner output".

Are these two supposed to do the same cleaning?

Thanks
Regards
Sumudu

#### Issue · Github October 2016 by Sheila

Issue Number
1394
State
closed
Closed By
vdauwera
• Sri LankaMember

Hi,

I managed to get my library_params.txt to work, and IlluminaBasecallsToSam is now running, though it seems slow. I have 8 samples, and my BaseCalls directory has 16 FASTQ files, R1 (forward) and R2 (reverse) for each sample. Does IlluminaBasecallsToSam take this into account? Does it create a single uBAM per sample from the R1 and R2 FASTQ files? Please clarify this for me.

Thank you


Hi @Sumudu, sorry for the late response. Here is the answer from @Sheila:

CleanSam does not do the same cleaning as MergeBamAlignment. CleanSam does exactly the two things stated in the documentation. IlluminaBasecallsToSam converts BCL data (not FASTQ) to an unmapped BAM/SAM file. You need to run BWA (or the aligner of your choice) on the unmapped data; we do this by running SamToFastq and piping the output to BWA, because BWA only accepts FASTQ input. Then MergeBamAlignment adds the information from the unmapped reads to the mapped BAM file. Have a look at this article for more information on MergeBamAlignment.

I believe the tool will produce a single uBAM per library containing both forward and reverse reads.