ERROR MESSAGE: Unknown file is malformed in DepthOfCoverage

Hi there,

I am using DepthOfCoverage to get the exome coverage of a target reference, and I am getting this error:
ERROR MESSAGE: Unknown file is malformed: Could not parse location from line: chr1 11873 12227 NR_046018_exon_0_0_chr1_11874_f 0 +

My command line is

java -Xmx64g -jar GenomeAnalysisTK.jar
-T DepthOfCoverage
-I bamfiles.list //A list of paths to bam files
-R ucsc.hg19.fa
-L clinical_exome_cod.bed.interval_list
-geneList:REFSEQ exonTrack.refSeq.chr121XYM.sort
-ct 10 -ct 20 -ct 40 -ct 80 -ct 100
-o bamfiles

I used AWK to modify the RefSeq file, following the instructions at https://www.broadinstitute.org/gatk/guide/article?id=1329, and this same command was working when I did to get gene coverage.

So, what am I doing wrong now?



  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
    This seems to be complaining about your gene list. Make sure the values are separated by tabs, not spaces, and use the .refseq extension in the filename.
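
The tab-versus-space point can be checked from the command line. The following is a minimal sketch, not part of the original reply: the helper name `count_untabbed` and the two sample files are made up for illustration, and the check only reports lines that contain no tab character at all.

```shell
# Made-up sample files for illustration; substitute your own gene list.
printf 'chr1\t11873\t12227\n' > sample_tabs.txt
printf 'chr1 11873 12227\n' > sample_spaces.txt

# Count the lines in a file that contain no tab character.
# A non-zero count suggests the file is space-separated (or mixed).
count_untabbed() {
    awk 'index($0, "\t") == 0 { n++ } END { print n + 0 }' "$1"
}

count_untabbed sample_tabs.txt
count_untabbed sample_spaces.txt
```

On the two sample files this prints 0 for the tab-separated file and 1 for the space-separated one. A count of 0 only shows that every line has at least one tab; it does not prove the file has the column layout GATK expects.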
  • Vergilius, Italy (Member)

    I am sure that the extension is .refseq and the values are tab-separated, but I still cannot understand this error.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)


    I'm a little confused about why you used AWK. I ran this tool with no issues simply by following the instructions in the article you linked to above, and I did not need to modify the output file. Also, what do you mean by "this same command was working when I did to get gene coverage"?


  • Vergilius, Italy (Member)

    Hi Sheila,
    Sorry if I was unclear.

    I followed the instructions in the article and used AWK to implement this requirement: "To run with the GATK, contigs other than the standard 1-22,X,Y,MT must be removed, and the file sorted in karyotypic order." Hence, I used the following commands to remove non-standard chromosomes and sort the file:

    awk '$3 ~ /^chr[12]?[0-9]$/' geneTrack.refSeq > geneTrack1
    awk '$3 ~ /^chr[XY]$/' geneTrack.refSeq > geneTrackxy
    awk '$3 ~ /^chrMT$/' geneTrack.refSeq > geneTrackM
    cat geneTrack1 geneTrackxy geneTrackM > geneTrackFinal
    sort -k3,3V -k5,5n -k6,6n geneTrackFinal > geneTrack1xy_sort.txt
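
One way to sanity-check the output of commands like these is to list the contigs in file order and the number of tab-separated fields per line. This is only a sketch under stated assumptions: it assumes, as the awk commands above do, that the contig name sits in column 3, and the helper names (`contig_order`, `field_counts`), the file name, and the sample rows are all made up for illustration.

```shell
# Made-up rows with the contig name in column 3; not real annotation data.
printf '1\tGENE_A\tchr1\t+\n1\tGENE_B\tchr1\t-\n2\tGENE_C\tchrX\t+\n' > sample_track.txt

# Print each contig once, in order of first appearance in column 3.
contig_order() {
    awk -F'\t' '!seen[$3]++ { print $3 }' "$1"
}

# Print the distinct tab-separated field counts found in the file.
# More than one value means the rows do not all have the same layout.
field_counts() {
    awk -F'\t' '{ print NF }' "$1" | sort -nu
}

contig_order sample_track.txt
field_counts sample_track.txt
```

If column 3 of a file shows numbers rather than contig names, or the field count differs between the gene track and the exon track, the two files probably do not share the same column layout.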

    When I said that "this same command was working ....", I meant that I wanted both exome and gene coverage. I first downloaded geneTrack, processed it with the procedure above, and got the gene coverage. Then I downloaded exonTrack, but this time I got the error.
