I'm attempting to extract barcodes from a set of illumina miseq paired-end sequences. I have a couple of questions about the extractIlluminaBarcodes tool. I've been using this post (http://seqanswers.com/forums/showthread.php?t=19538) as a guide, but have run into a bit of a roadblock.
1) Is the metrics_file, listed as required by the function, input by me, if so what is included in this file, and is there a sample with the appropriate format that I could use as a guide?
2) I'm unclear on the read structure string. The post included above describes a similar experimental setup to what I have, however I'm not sure how this leads to 151T8B151T as listed. I have both forward and reverse barcodes for my sequences, so would it not make more sense to have something like 8B151T151T8B?
3) The post also recommends use of an N barcode in the barcode_file. How does one include this, if indeed this is required?
As I am currently running this command,
java -jar ~/Desktop/software/picard-tools-1.119/ExtractIlluminaBarcodes.jar BASECALLS_DIR=Data/Intensities/BaseCalls LANE=1 BARCODE_FILE=barcode.txt READ_STRUCTURE=151T8B151T METRICS_FILE=metrics NUM_PROCESSORS=4
barcode.txt file is as below:
barcode_sequence_1 barcode_sequence_2 barcode_sequence_3 barcode_sequence_4 barcode_sequence_5 barcode_sequence_6barcode_sequence_7 barcode_sequence_8 barcode_sequence_9 barcode_sequence_10 barcode_sequence_11 barcode_sequence_12barcode_sequence_13 barcode_sequence_14 barcode_sequence_15 barcode_sequence_16 barcode_sequence_17
TACGCTGC ATGCGCAG TAGCGCTC ACTGAGCG CCTAAGAC CGATCAGT TCCTGAGC ATCTCAGG ACTGCATA AAGGAGTA CTAAGCCT CGTCTAAT TCTCTCCG CTCTCTAT TATCCTCT GTAAGGAG
The error I'm getting is:
Exception in thread "main" picard.PicardException: Could not find a format with available files for the following data types: BaseCalls, PF
java version 1.8.0_91