SVAltAligner walker

Geraldine_VdAuwera (Administrator, GATK Developer)
edited September 2012 in GenomeSTRiP Documentation

1. Introduction

The SVAltAligner walker traverses a set of BAM files to compute alignments to the alternate alleles of structural variations. This walker is one component of the SVAltAlign pipeline.

2. Inputs / Arguments

  • -I <bam-file> : The set of input BAM files containing records to realign.

  • -altReference <fasta-file> : The fasta file for the alternate allele reference sequences. The fasta file must be indexed with 'samtools faidx' or the equivalent. This file should be the output from GenerateAltAlleleFasta.

  • -alignMappedReads : If present, then align all reads in the input BAM files, not just unmapped reads (default false).

  • -alignUnmappedMates <true/false> : If true (the default), then align unmapped mates of mapped reads (i.e. reads that have a reference position but have the unmapped flag set). If set to false, then only reads in the "unmapped" portion of the BAM file will be aligned.

  • -md <directory> : The metadata directory containing metadata about the input data set.
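
The interaction of -alignMappedReads and -alignUnmappedMates described above can be sketched as a small selection function. This is only an illustration of the documented behavior, not the walker's actual code: the `Read` class and `should_align` function are hypothetical names, and the assumption that -alignMappedReads overrides -alignUnmappedMates is inferred from the phrase "align all reads" rather than confirmed by the source.

```python
from dataclasses import dataclass

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


@dataclass
class Read:
    """Hypothetical minimal read record (not an SVToolkit or Picard class)."""
    name: str
    flag: int   # SAM FLAG field
    rname: str  # SAM RNAME field; "*" means the read has no reference position


def should_align(read, align_mapped_reads=False, align_unmapped_mates=True):
    """Return True if the read would be selected for realignment.

    Mirrors the documented defaults: -alignMappedReads is off and
    -alignUnmappedMates is on.
    """
    if align_mapped_reads:
        # Assumption: "align all reads" takes precedence over the other option.
        return True
    if not read.flag & UNMAPPED_FLAG:
        # Mapped reads are skipped unless -alignMappedReads is given.
        return False
    if read.rname != "*":
        # Unmapped mate of a mapped read: it carries a reference position
        # but still has the unmapped flag set.
        return align_unmapped_mates
    # Read from the "unmapped" portion of the BAM file: always aligned.
    return True
```

For example, under these assumptions `should_align(Read("r1", 0x4, "*"))` selects a fully unmapped read, while a mapped read is selected only when `align_mapped_reads=True`.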

3. Outputs

  • -O <bam-file> : The output from this walker is a BAM file containing new alignments for input reads that align to the alternate allele reference sequences. If no output file is specified, the output is in SAM format instead and is written to standard output.

Geraldine Van der Auwera, PhD

Comments

  • SiyangLiu (Member)

    Hi, I want to align the short reads in a BAM file to a set of alternative contigs, produced by assembly, that contain a set of variants. First I tried the SVAltAligner walker, but its class path is not specified in the relevant pages, so I have no clue how to run it. Then I tried the pipeline in the GenomeSTRiP package (svtoolkit_1.04.1228.tar.gz):

        /com/extra/Java/7u5/bin/java -cp ${classpath} ${mx} \
            org.broadinstitute.sting.queue.QCommandLine \
            -S $toolsDir/qscript/SVAltAlign.q \
            -S $toolsDir/qscript/SVQScript.q \
            -gatk $toolsDir/lib/gatk/GenomeAnalysisTK.jar \
            -cp $toolsDir/lib/SVToolkit.jar:$toolsDir/lib/gatk/GenomeAnalysisTK.jar \
            -configFile $toolsDir/conf/genstrip_parameters.txt \
            -tempDir $out/tmpdir \
            -md $out/metadata \
            -I $bam \
            -alignUnmappedMates \
            -vcf $vcf \
            -R $ref \
            -O $out/Simu.20.svRealign.bam \
            -run && echo done

    First, Simu.20.realign.alt.fasta is generated, with header lines like these:

        >._1 L:20:13823-14022:1-200|R:20:14038-14237:342-541|LENGTH:541
        >._1 L:20:5655407-5655623:1-217|R:20:5655607-5655823:342-558|LENGTH:558

    Then I ran into this error:

        net.sf.picard.PicardException: Sequence name appears more than once in reference: ._1

    I think the space between ">._1" and "L:20..." is the reason for the error. Could you please tell me how to run the SVAltAligner walker, and help me look into the PicardException problem?

  • bhandsaker (Member, Third-party Developer)

    Many of the tools and utilities in svtoolkit require that every variant has a unique ID field. I suspect your variants don't have IDs (i.e. they are all ".").

    Bob Handsaker, Broad Institute / Harvard Medical School Dept of Genetics
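
The diagnosis in this thread can be checked with a short script. The sketch below is not part of svtoolkit; the function names and the "SV_" prefix are hypothetical. It assumes, as is common for FASTA readers, that the sequence name is the header text up to the first whitespace, which would explain why both headers quoted above collapse to "._1". It also assumes a tab-delimited VCF whose third column is the ID field.

```python
from collections import Counter


def fasta_names(lines):
    """Return sequence names: header text after '>' up to the first whitespace."""
    names = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.append(tokens[0])
    return names


def duplicate_names(names):
    """Return the sorted names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def assign_unique_ids(vcf_lines, prefix="SV_"):
    """Give every VCF record a unique ID.

    Records whose ID is missing (".") or already used receive a generated
    ID of the form <prefix><n>. The prefix is an arbitrary choice made for
    this sketch. Lines are returned without trailing newlines.
    """
    seen = set()
    counter = 0
    out = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#") or not line.strip():
            out.append(line)  # header and blank lines pass through unchanged
            continue
        fields = line.split("\t")
        vid = fields[2]  # the ID column
        if vid == "." or vid in seen:
            # Generate the next unused ID.
            counter += 1
            vid = f"{prefix}{counter}"
            while vid in seen:
                counter += 1
                vid = f"{prefix}{counter}"
            fields[2] = vid
        seen.add(vid)
        out.append("\t".join(fields))
    return out
```

Running `duplicate_names(fasta_names(...))` on the generated alternate allele fasta shows whether any names collide. If they do, writing the VCF back out through `assign_unique_ids` before rerunning the pipeline is one way to supply the unique IDs the tools expect.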
