Call somatic short variants with mutect - BAM file contigs not matching the reference

Adam_U0 Member
edited September 2018 in Ask the GATK team

Dear GATK Staff,

I have read a lot about this problem, but it still occurs. I think I have done everything I can, without success. Here is my command, based on your tutorial on calling somatic variants:

gatk --java-options "-Xmx2g" Mutect2 \
      -R ucsc.hg19.fasta \
      -I 1_Tumor_sorted_markduplicates_RG.bam \
      -I 1_Blood_markduplicates_RG.bam \
      -tumor 1_tumor \
      -normal 1_normal \
      -pon 1_2_3_threesamplepon_chr.vcf.gz \
      --germline-resource af-only-gnomad.raw.sites.hg19.vcf.gz \
      --af-of-alleles-not-in-resource 0.0000025 \
      --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter \
      -O P129_somatic_m2.vcf.gz \
      -bamout P129_tumor_normal_m2.bam

I found all of the reference files here:

bioinfo5pilm46.mit.edu/software/GATK/resources/

Unfortunately, there is still a problem with the chromosome names:

reads contigs = [chr1, chr2, chr3, chr4, chr5....]
reads features = [1,2,3....]

I checked everything: the lengths and names of the chromosomes in my .bam files are exactly the same as in the reference.

Lengths:

samtools view -H 1_Tumor_sorted_markduplicates_RG.bam | grep '^@SQ' > chromosomes.txt

The lengths are the same as those listed here:

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes

Chromosome names:

samtools idxstats 1_Tumor_sorted_markduplicates_RG.bam | head -n 3
Each name starts with 'chr'.

In the reference, each FASTA record name also starts with 'chr'.
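For readers repeating this check, the comparison between BAM and reference can be scripted instead of done by eye. The following is only a sketch: the helper names `sq_names` and `fai_names` are made up for illustration, and it assumes that samtools is installed and that a FASTA index (ucsc.hg19.fasta.fai, as produced by `samtools faidx`) sits next to the reference.

```shell
# Hypothetical helper: print the SN (sequence name) field of every @SQ line
# of a SAM/BAM header that is read from standard input.
sq_names() {
    awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }'
}

# Hypothetical helper: print the contig names from a FASTA index (.fai),
# which are stored in the first tab-separated column.
fai_names() {
    cut -f1 "$1"
}

# Usage (assumes samtools is on the PATH and the .fai file exists):
#   samtools view -H 1_Tumor_sorted_markduplicates_RG.bam | sq_names > bam_contigs.txt
#   fai_names ucsc.hg19.fasta.fai > ref_contigs.txt
#   diff bam_contigs.txt ref_contigs.txt && echo "contig names and order match"
```

If `diff` prints nothing, the BAM and the reference agree on contig names and order, which points the search toward the other inputs (the PoN and the germline resource).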

I also checked 1_2_3_threesamplepon_chr.vcf.gz: each row after the header starts with 'chr'.

af-only-gnomad.raw.sites.hg19.vcf.gz: likewise, each row after the header starts with 'chr'.

Everything looks fine, but it doesn't work. There is always an error at each step of the analysis based on your tutorial.

I have been fighting with this for whole days since Monday. Please, could you help me?

Best Regards,
Adam

Best Answer

  • Adam_U0
    Accepted Answer

    ANSWER:

    Regarding my statement that in 1_2_3_threesamplepon_chr.vcf.gz "each row after header starts with 'chr'":

    Yes, because I had added 'chr' manually. I created the three files for the PoN again using the reference mentioned above (ucsc.hg19.fasta), and the command worked.

    This question can be removed.

    Best regards,
    Adam
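
For anyone landing here with the same error: a plausible reading of the answer above, not confirmed in this thread, is that only the data rows of the PoN VCF were given the 'chr' prefix while the `##contig` lines in its header still listed 1, 2, 3, and so on. The sketch below lists both sets of names so they can be compared. The function names are hypothetical, and it assumes the VCF is gzip- or bgzip-compressed (bgzip output can be read with `gzip -dc`).

```shell
# Hypothetical helper: list the contig IDs declared in the header
# (##contig=<ID=...> lines) of a gzip/bgzip-compressed VCF, in header order.
vcf_header_contigs() {
    gzip -dc "$1" | sed -n 's/^##contig=<ID=\([^,>]*\).*/\1/p'
}

# Hypothetical helper: list the distinct contig names actually used
# in the data rows (first column of every non-header line).
vcf_body_contigs() {
    gzip -dc "$1" | grep -v '^#' | cut -f1 | sort -u
}

# Usage:
#   vcf_header_contigs 1_2_3_threesamplepon_chr.vcf.gz | head -n 3
#   vcf_body_contigs   1_2_3_threesamplepon_chr.vcf.gz | head -n 3
```

If the header prints 1, 2, 3 while the rows print chr1, chr2, chr3, the file is internally inconsistent. Regenerating it against the intended reference, as was done above, avoids hand-editing the names.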
