
ExtractIlluminaBarcodes

dreadrea, Canada, Member

I'm attempting to extract barcodes from a set of Illumina MiSeq paired-end sequences, and I have a few questions about the ExtractIlluminaBarcodes tool. I've been using this post (http://seqanswers.com/forums/showthread.php?t=19538) as a guide, but have run into a bit of a roadblock.

1) Is the METRICS_FILE, which the tool lists as required, something I supply as input? If so, what should it contain, and is there a sample in the appropriate format that I could use as a guide?
2) I'm unclear on the read structure string. The post linked above describes an experimental setup similar to mine, but I'm not sure how it leads to the 151T8B151T listed there. I have both forward and reverse barcodes for my sequences, so would it not make more sense to use something like 8B151T151T8B?
3) The post also recommends including an N barcode in the BARCODE_FILE. How does one include this, if it is indeed required?
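
To make question 2 concrete, here is a minimal sketch of how the two candidate read structure strings break down into segments. It assumes the usual Picard convention that T marks template cycles, B marks sample barcode cycles, and S marks skipped cycles; the function names are hypothetical helpers for illustration, not part of Picard.

```python
import re

# Operators assumed from Picard's read structure convention:
# T = template, B = sample barcode, S = skip.
VALID = re.compile(r"(?:\d+[TBS])+")
SEGMENT = re.compile(r"(\d+)([TBS])")


def parse_read_structure(structure):
    """Split a read structure string into (cycle_count, operator) pairs."""
    if not VALID.fullmatch(structure):
        raise ValueError(f"malformed read structure: {structure!r}")
    return [(int(n), op) for n, op in SEGMENT.findall(structure)]


def total_cycles(structure):
    """Total number of sequencing cycles the read structure describes."""
    return sum(n for n, _ in parse_read_structure(structure))


if __name__ == "__main__":
    for rs in ("151T8B151T", "8B151T151T8B"):
        print(rs, parse_read_structure(rs), total_cycles(rs))
```

As far as I understand the convention, the segments are listed in the order the instrument ran the cycles, so the string should match both the cycle order and the total cycle count of the run rather than the physical layout of the construct.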

I am currently running this command:
java -jar ~/Desktop/software/picard-tools-1.119/ExtractIlluminaBarcodes.jar BASECALLS_DIR=Data/Intensities/BaseCalls LANE=1 BARCODE_FILE=barcode.txt READ_STRUCTURE=151T8B151T METRICS_FILE=metrics NUM_PROCESSORS=4

My barcode.txt file is as follows:
barcode_sequence_1 barcode_sequence_2 barcode_sequence_3 barcode_sequence_4 barcode_sequence_5 barcode_sequence_6 barcode_sequence_7 barcode_sequence_8 barcode_sequence_9 barcode_sequence_10 barcode_sequence_11 barcode_sequence_12 barcode_sequence_13 barcode_sequence_14 barcode_sequence_15 barcode_sequence_16 barcode_sequence_17
TACGCTGC ATGCGCAG TAGCGCTC ACTGAGCG CCTAAGAC CGATCAGT TCCTGAGC ATCTCAGG ACTGCATA AAGGAGTA CTAAGCCT CGTCTAAT TCTCTCCG CTCTCTAT TATCCTCT GTAAGGAG
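
For comparison, the Picard documentation describes BARCODE_FILE as a tab-delimited file with one barcode per row and the column headers barcode_sequence_1, barcode_name, and library_name (plus barcode_sequence_2 for a second index read). Below is a minimal sketch of that layout for the 16 barcodes above, assuming a single 8-base index read; the sample and library names are placeholders, not real values.

```python
import csv
import io

# The 16 barcode sequences from the file above.
BARCODES = [
    "TACGCTGC", "ATGCGCAG", "TAGCGCTC", "ACTGAGCG",
    "CCTAAGAC", "CGATCAGT", "TCCTGAGC", "ATCTCAGG",
    "ACTGCATA", "AAGGAGTA", "CTAAGCCT", "CGTCTAAT",
    "TCTCTCCG", "CTCTCTAT", "TATCCTCT", "GTAAGGAG",
]


def barcode_table(barcodes):
    """Return tab-delimited text with a header row and one barcode per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["barcode_sequence_1", "barcode_name", "library_name"])
    for i, seq in enumerate(barcodes, start=1):
        # Placeholder names; replace with the real sample and library IDs.
        writer.writerow([seq, f"sample_{i}", f"library_{i}"])
    return buf.getvalue()


if __name__ == "__main__":
    print(barcode_table(BARCODES), end="")
```

Redirecting the output to barcode.txt would give a file with a header line followed by 16 rows, one per barcode, instead of all sequences on a single line.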

The error I'm getting is:
Exception in thread "main" picard.PicardException: Could not find a format with available files for the following data types: BaseCalls, PF
at picard.illumina.parser.IlluminaDataProviderFactory.<init>(IlluminaDataProviderFactory.java:172)
at picard.illumina.parser.IlluminaDataProviderFactory.<init>(IlluminaDataProviderFactory.java:127)
at picard.illumina.ExtractIlluminaBarcodes.customCommandLineValidation(ExtractIlluminaBarcodes.java:332)
at picard.cmdline.CommandLineProgram.parseArgs(CommandLineProgram.java:242)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:128)
at picard.illumina.ExtractIlluminaBarcodes.main(ExtractIlluminaBarcodes.java:357)

java version 1.8.0_91
