extractIlluminaBarcodes

dreadrea Canada Member

I'm attempting to extract barcodes from a set of Illumina MiSeq paired-end sequences, and I have a few questions about the ExtractIlluminaBarcodes tool. I've been using this post (http://seqanswers.com/forums/showthread.php?t=19538) as a guide, but have run into a bit of a roadblock.

1) Is the metrics_file, which the tool lists as required, something I supply as input? If so, what goes in it, and is there a sample in the appropriate format that I could use as a guide?
2) I'm unclear on the read structure string. The post linked above describes an experimental setup similar to mine, but I'm not sure how it leads to the 151T8B151T listed there. I have both forward and reverse barcodes for my sequences, so wouldn't something like 8B151T151T8B make more sense?
3) The post also recommends including an N barcode in the barcode_file. How does one include this, if it is indeed required?
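
To make question 2 concrete, here is a minimal Python sketch of how a read structure string might be broken down. It is not Picard code. It assumes that each number/letter pair is a run of cycles listed in the order they were sequenced, that T means template, B means barcode and S means skip, and that the cycle counts should add up to the number of cycles actually sequenced. The function names are made up for illustration.

```python
import re

# Assumed operator meanings; newer tool versions may accept additional letters.
OPERATORS = {"T": "template", "B": "barcode", "S": "skip"}


def parse_read_structure(structure):
    """Split a string such as '151T8B151T' into (cycles, kind) segments."""
    segments = []
    pos = 0
    for match in re.finditer(r"(\d+)([A-Z])", structure):
        if match.start() != pos:
            raise ValueError(f"unexpected text at position {pos} in {structure!r}")
        length, op = int(match.group(1)), match.group(2)
        if op not in OPERATORS or length == 0:
            raise ValueError(f"unsupported segment {match.group(0)!r}")
        segments.append((length, OPERATORS[op]))
        pos = match.end()
    if pos != len(structure) or not segments:
        raise ValueError(f"could not parse {structure!r}")
    return segments


def total_cycles(structure):
    """Sum the cycle counts of all segments."""
    return sum(length for length, _ in parse_read_structure(structure))


for s in ("151T8B151T", "8B151T151T8B"):
    print(s, parse_read_structure(s), total_cycles(s))
```

Under these assumptions, 151T8B151T accounts for 310 cycles with a single 8-base barcode read between the two template reads, while 8B151T151T8B accounts for 318 cycles with a barcode read at each end. Comparing those totals with the number of cycles in the run may help show which layout matches the data.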

I am currently running this command:
java -jar ~/Desktop/software/picard-tools-1.119/ExtractIlluminaBarcodes.jar BASECALLS_DIR=Data/Intensities/BaseCalls LANE=1 BARCODE_FILE=barcode.txt READ_STRUCTURE=151T8B151T METRICS_FILE=metrics NUM_PROCESSORS=4

The barcode.txt file is as follows:
barcode_sequence_1 barcode_sequence_2 barcode_sequence_3 barcode_sequence_4 barcode_sequence_5 barcode_sequence_6 barcode_sequence_7 barcode_sequence_8 barcode_sequence_9 barcode_sequence_10 barcode_sequence_11 barcode_sequence_12 barcode_sequence_13 barcode_sequence_14 barcode_sequence_15 barcode_sequence_16 barcode_sequence_17
TACGCTGC ATGCGCAG TAGCGCTC ACTGAGCG CCTAAGAC CGATCAGT TCCTGAGC ATCTCAGG ACTGCATA AAGGAGTA CTAAGCCT CGTCTAAT TCTCTCCG CTCTCTAT TATCCTCT GTAAGGAG
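
One quick sanity check on a file like this is whether the header line and the data line contain the same number of fields. The sketch below is a hypothetical helper, not part of Picard. It splits on any whitespace, which is a simplification, since the tool may require a specific delimiter such as tabs.

```python
def field_counts(header_line, data_line):
    """Return (header field count, data field count), splitting on whitespace."""
    return len(header_line.split()), len(data_line.split())


def check_row(header_line, data_line):
    """Describe whether a data row has as many fields as the header."""
    n_header, n_data = field_counts(header_line, data_line)
    if n_header != n_data:
        return f"mismatch: {n_header} header fields vs {n_data} data fields"
    return f"ok: {n_header} fields"


# Hypothetical two-column example, not the file shown above.
print(check_row("barcode_sequence_1 barcode_sequence_2", "TACGCTGC ATGCGCAG"))
print(check_row("barcode_sequence_1 barcode_sequence_2", "TACGCTGC"))
```

As pasted above, the header line and the sequence line of barcode.txt do not contain the same number of entries, so a check like this would report a mismatch.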

The error I'm getting is:
Exception in thread "main" picard.PicardException: Could not find a format with available files for the following data types: BaseCalls, PF
at picard.illumina.parser.IlluminaDataProviderFactory.<init>(IlluminaDataProviderFactory.java:172)
at picard.illumina.parser.IlluminaDataProviderFactory.<init>(IlluminaDataProviderFactory.java:127)
at picard.illumina.ExtractIlluminaBarcodes.customCommandLineValidation(ExtractIlluminaBarcodes.java:332)
at picard.cmdline.CommandLineProgram.parseArgs(CommandLineProgram.java:242)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:128)
at picard.illumina.ExtractIlluminaBarcodes.main(ExtractIlluminaBarcodes.java:357)
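
Since the exception says no files could be found for the BaseCalls and PF data types, it may help to see what the BASECALLS_DIR actually contains. The sketch below only counts files by extension under a directory. It does not reproduce Picard's own file discovery. The expectation that base calls appear as .bcl or .bcl.gz files and filter data as .filter files under a lane directory such as L001 is an assumption about the usual Illumina layout, and `count_by_suffix` is a made-up helper name.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(root):
    """Count regular files under root, grouped by their full extension."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    return Counter(
        "".join(p.suffixes) or "(none)"
        for p in root.rglob("*")
        if p.is_file()
    )


# Path taken from the command above; the listing is skipped if it does not exist here.
basecalls = Path("Data/Intensities/BaseCalls")
if basecalls.is_dir():
    for suffix, n in sorted(count_by_suffix(basecalls).items()):
        print(f"{suffix}\t{n}")
```

If the listing shows no base call or filter files for lane 1, that would be consistent with the error above, for example when the command is run from a different working directory than the run folder.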

java version 1.8.0_91
