Exome seq analysis: MESSAGE: I/O error loading or writing tribble index file
edited November 2015 in Ask the GATK team

Dear all,

I am trying to analyse a publicly available human exome-seq dataset. I am stuck at the step "create a target list of intervals to be realigned". I used "human_g1k_v37.fasta", available from the GATK bundle, as the reference for BWA mapping. After creating the BAM files, I want to follow the GATK Best Practices to create a VCF file. Here is what I did:

$ ~/BWA/bwa-0.7.12/bwa aln -t 12 ~/Reference/HG_37/human_g1k_v37.fasta ~/BWA/bwa-0.7.12/ERR445410_1.fastq > ERR445410_1.sai

$ ~/BWA/bwa-0.7.12/bwa aln -t 12 ~/Reference/HG_37/human_g1k_v37.fasta ~/BWA/bwa-0.7.12/ERR445410_2.fastq > ERR445410_2.sai

$ ~/BWA/bwa-0.7.12/bwa sampe ~/Reference/HG_37/human_g1k_v37.fasta ~/BWA/bwa-0.7.12/ERR445410_1.sai ~/BWA/bwa-0.7.12/ERR445410_2.sai ~/BWA/bwa-0.7.12/ERR445410_1.fastq ~/BWA/bwa-0.7.12/ERR445410_2.fastq > ERR445410.sam

Use samtools to remove all unmapped reads:
samtools view -bF 4 ERR445410.bam > ERR445410_unMpdfltr.bam
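As a small illustration of what the -F 4 filter above does: in the SAM format, FLAG bit 0x4 means "read unmapped", and -F excludes any read that has one of the given bits set. The sketch below only demonstrates the bit test; the helper name is_unmapped and the two example FLAG values are chosen for illustration and are not taken from this dataset.

```shell
# Return success (0) if the SAM FLAG value has the 0x4 "read unmapped" bit set.
# is_unmapped is a hypothetical helper, not a samtools command.
is_unmapped() {
    [ $(( $1 & 4 )) -ne 0 ]
}

# FLAG 77 = 1 + 4 + 8 + 64  (paired, read unmapped, mate unmapped, first in pair)
# FLAG 99 = 1 + 2 + 32 + 64 (paired, proper pair, mate reverse, first in pair)
for flag in 77 99; do
    if is_unmapped "$flag"; then
        echo "$flag: dropped by -F 4"
    else
        echo "$flag: kept by -F 4"
    fi
done
```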

Sort the BAM file into coordinate order:
$ java -jar picard.jar SortSam INPUT=ERR445410_unMpdFltr.bam OUTPUT=ERR445410_cord_srt.bam SORT_ORDER=coordinate

Picard command to mark duplicates:
$ java -jar picard.jar MarkDuplicates INPUT=ERR445410_cord_srt.bam OUTPUT=ERR445410_dedup_reads.bam METRICS_FILE=metrics.txt

Picard command to index the BAM file:
$ java -jar picard.jar BuildBamIndex INPUT=ERR445410_dedup_reads.bam

Change the read group header with Picard:
java -jar picard.jar AddOrReplaceReadGroups \
INPUT=ERR445410_dedup_reads.bam \
OUTPUT=ERR445410_dedup_reads_RG.bam \
RGLB=whatever \
RGPL=illumina \
RGPU=whatever \

Picard command to re-index the BAM file:
$ java -jar picard.jar BuildBamIndex INPUT=ERR445410_dedup_reads_RG.bam

Create a target list of intervals to be realigned:
java -jar GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
-I ERR445410_dedup_reads.bam \
-known Mills_and_1000G_gold_standard.indels.b37.vcf \
-o realignment_targets.list
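One thing that may be worth checking here: the reads were aligned against human_g1k_v37.fasta, while the command above passes a GRCh38 file to -R, and GATK expects the BAM, the reference, and the known-sites file to agree on contig names. The following is a rough sketch for comparing contig names, assuming the BAM header has been dumped to a text file with samtools view -H and the reference has a .fai index from samtools faidx. The function names and the file name bam_header.txt are made up for illustration.

```shell
# List contig names from a SAM/BAM header dump: @SQ lines, SN: field.
header_contigs() {
    awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1"
}

# List contig names from a FASTA index (.fai): first tab-separated column.
fai_contigs() {
    cut -f1 "$1"
}

# Print contigs that appear in the header dump but not in the reference index.
# $1 = header dump, $2 = .fai file
missing_from_reference() {
    header_contigs "$1" | while IFS= read -r name; do
        if ! fai_contigs "$2" | grep -qxF -- "$name"; then
            echo "$name"
        fi
    done
}

# Possible usage (file names are assumptions):
#   samtools view -H ERR445410_dedup_reads.bam > bam_header.txt
#   samtools faidx GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
#   missing_from_reference bam_header.txt GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai
```

If this prints any names, the BAM and the reference do not describe the same set of contigs. The sketch compares names only; it does not check contig order or length.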

When I ran the GATK command above, I got the following error message:

ERROR MESSAGE: I/O error loading or writing tribble index file
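The message suggests that an index file for the -known VCF could not be read or written. A minimal sketch of some file-level checks follows, assuming the index sits next to the VCF with an added ".idx" suffix (an assumption about the naming, as is the function name check_vcf_index). It uses only standard shell tests and does not call GATK.

```shell
# Report conditions that could stop an index file from being read or written
# next to a VCF. The ".idx" suffix is an assumption about the index file name.
check_vcf_index() {
    vcf=$1
    idx="$vcf.idx"
    dir=$(dirname "$vcf")

    [ -r "$vcf" ] || { echo "VCF not readable: $vcf"; return 1; }
    [ -w "$dir" ] || { echo "directory not writable: $dir"; return 1; }

    if [ -e "$idx" ]; then
        [ -s "$idx" ] || { echo "index is empty: $idx"; return 1; }
        [ -r "$idx" ] || { echo "index not readable: $idx"; return 1; }
        echo "index present: $idx"
    else
        echo "no index yet: $idx"
    fi
}

# Possible usage:
#   check_vcf_index Mills_and_1000G_gold_standard.indels.b37.vcf
```

An empty or unreadable index, or a directory without write permission, would each be worth ruling out before looking further.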

I am running all the programs under Cygwin on Windows 10. Could someone kindly help me troubleshoot this?

Thanks in advance.

G. Arun Kumar
