I'm attempting to extract barcodes from a set of Illumina MiSeq paired-end sequences, and I have a few questions about the ExtractIlluminaBarcodes tool. I've been using this post (http://seqanswers.com/forums/showthread.php?t=19538) as a guide, but have run into a bit of a roadblock.
1) Is the METRICS_FILE, which the tool lists as required, something I supply as input? If so, what should it contain, and is there a sample in the appropriate format that I could use as a guide?
2) I'm unclear on the read structure string. The post linked above describes an experimental setup similar to mine; however, I'm not sure how this leads to 151T8B151T as listed. I have both forward and reverse barcodes for my sequences, so would it not make more sense to have something like 8B151T151T8B?
3) The post also recommends use of an N barcode in the barcode_file. How does one include this, if indeed this is required?
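To make question 2 concrete, here is a minimal Python sketch that breaks a read structure string into its segments. It assumes the convention described in the Picard documentation, where each segment is a cycle count followed by an operator (T for template, B for barcode, S for skip) and segments are listed in the order the cycles were sequenced. The function names are hypothetical and are not part of Picard.

```python
import re

# Operators assumed from the Picard read structure convention:
# T = template bases, B = sample barcode bases, S = skipped bases.
SEGMENT_KINDS = {"T": "template", "B": "barcode", "S": "skip"}


def parse_read_structure(read_structure):
    """Split a string such as '151T8B151T' into (cycles, kind) pairs."""
    # The whole string must be a run of <count><operator> pairs.
    if not re.fullmatch(r"(?:\d+[TBS])+", read_structure):
        raise ValueError(f"malformed read structure: {read_structure!r}")
    pairs = re.findall(r"(\d+)([TBS])", read_structure)
    return [(int(count), SEGMENT_KINDS[op]) for count, op in pairs]


def total_cycles(read_structure):
    """Number of sequencing cycles the read structure accounts for."""
    return sum(count for count, _ in parse_read_structure(read_structure))


# The two strings discussed in question 2.
for rs in ("151T8B151T", "8B151T151T8B"):
    print(rs, parse_read_structure(rs), total_cycles(rs))
```

Read this way, 151T8B151T accounts for 310 cycles with a single 8-base barcode read between the two template reads, while 8B151T151T8B accounts for 318 cycles and places a barcode read before the first template read and after the second.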
I am currently running this command:
java -jar ~/Desktop/software/picard-tools-1.119/ExtractIlluminaBarcodes.jar BASECALLS_DIR=Data/Intensities/BaseCalls LANE=1 BARCODE_FILE=barcode.txt READ_STRUCTURE=151T8B151T METRICS_FILE=metrics NUM_PROCESSORS=4
My barcode.txt file is as follows:
barcode_sequence_1 barcode_sequence_2 barcode_sequence_3 barcode_sequence_4 barcode_sequence_5 barcode_sequence_6 barcode_sequence_7 barcode_sequence_8 barcode_sequence_9 barcode_sequence_10 barcode_sequence_11 barcode_sequence_12 barcode_sequence_13 barcode_sequence_14 barcode_sequence_15 barcode_sequence_16 barcode_sequence_17
TACGCTGC ATGCGCAG TAGCGCTC ACTGAGCG CCTAAGAC CGATCAGT TCCTGAGC ATCTCAGG ACTGCATA AAGGAGTA CTAAGCCT CGTCTAAT TCTCTCCG CTCTCTAT TATCCTCT GTAAGGAG
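For comparison, the sketch below builds a barcode file with one barcode per row rather than one per column. The layout is an assumption based on my reading of the Picard documentation for BARCODE_FILE (tab-delimited, with columns barcode_sequence_1, barcode_name and library_name) and should be checked against the Picard version in use. The function name and the barcode_N and library_N labels are placeholders made up for illustration.

```python
import csv
import io


def format_barcode_file(barcodes):
    """Return tab-delimited text with one row per barcode sequence."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header names assumed from the Picard BARCODE_FILE description.
    writer.writerow(["barcode_sequence_1", "barcode_name", "library_name"])
    for index, sequence in enumerate(barcodes, start=1):
        # Placeholder names; real sample and library names would go here.
        writer.writerow([sequence, f"barcode_{index}", f"library_{index}"])
    return buffer.getvalue()


# First three barcodes from the file above.
print(format_barcode_file(["TACGCTGC", "ATGCGCAG", "TAGCGCTC"]), end="")
```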
The error I'm getting is:
Exception in thread "main" picard.PicardException: Could not find a format with available files for the following data types: BaseCalls, PF
I'm using Java version 1.8.0_91.