Error when trying to fix the contigs order in the reference and vcf for FastaAlternateReferenceMaker

I am trying to use a VCF containing snps variants to change the mouse reference (GRCm38- c57BL/6J) with BALB/cJ snps.

After running this command:

java -jar ~/programs/GenomeAnalysisTK.jar -T FastaAlternateReferenceMaker -R ~/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa -o ~/BALBcJ.snp.primary.fa -V ~/BALB_cJ.snps.vcf

The following ERROR shows up:

ERROR MESSAGE: Input files /home/tiagocastro/BALB_cJ.snps.vcf and reference have incompatible contigs: The contig order in /home/tiagocastro/BALB_cJ.snps.vcf and referenceis not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328), which describes reordering contigs in BAM and VCF files..
ERROR /home/tiagocastro/BALB_cJ.snps.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, X, Y]
ERROR reference contigs = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, MT, X, Y, JH584299.1, GL456233.1, JH584301.1, GL456211.1, GL456350.1, JH584293.1, GL456221.1, JH584297.1, JH584296.1, GL456354.1, JH584294.1, JH584298.1, JH584300.1, GL456219.1, GL456210.1, JH584303.1, JH584302.1, GL456212.1, JH584304.1, GL456379.1, GL456216.1, GL456393.1, GL456366.1, GL456367.1, GL456239.1, GL456213.1, GL456383.1, GL456385.1, GL456360.1, GL456378.1, GL456389.1, GL456372.1, GL456370.1, GL456381.1, GL456387.1, GL456390.1, GL456394.1, GL456392.1, GL456382.1, GL456359.1, GL456396.1, GL456368.1, JH584292.1, JH584295.1]

So Trying to fix, I used the perl script in the link to sort properly within the reference.

I did this:

./sortByRef.pl ~/BALB_cJ.snps.vcf /home/tiagocastro/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa.fai > ~/BALB_cJ.snps_sorted.vcf

using the new vcf file, a new error is shown:

ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file '/home/tiagocastro/BALB_cJ.snps_sorted.vcf' could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:
ERROR Name FeatureType Documentation
ERROR BCF2 VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF VariantContext (this is an external codec and is not documented within GATK)
ERROR VCF3 VariantContext (this is an external codec and is not documented within GATK)

looking the head of each, sorted and basic vcf, I can see that is different.

Can someone help me?

Answers

  • tiago211287tiago211287 BrazilMember
    edited August 2015

    I fixed the problem.

    I coppied the header to the sorted file and it solved all errors.

    For copping I used this little bash command:

    { head -n69 original.vcf; cat sorted.vcf; } >tmp$$ && mv tmp$$ sorted.vcf

Sign In or Register to comment.