ERROR MESSAGE: Unknown file is malformed in DepthOfCoverage

Hi there,

I am using DepthOfCoverage to get the exome coverage of a target reference, and I am getting this error:
ERROR MESSAGE: Unknown file is malformed: Could not parse location from line: chr1 11873 12227 NR_046018_exon_0_0_chr1_11874_f 0 +

My command line is

java -Xmx64g -jar GenomeAnalysisTK.jar
-T DepthOfCoverage
-I bamfiles.list //A list of paths to bam files
-R ucsc.hg19.fa
-L clinical_exome_cod.bed.interval_list
-geneList:REFSEQ exonTrack.refSeq.chr121XYM.sort
-ct 10 -ct 20 -ct 40 -ct 80 -ct 100
-o bamfiles

I used AWK to modify the refseq file, following the instructions at https://www.broadinstitute.org/gatk/guide/article?id=1329, and this same command worked when I ran it to get gene coverage.

So, what am I doing wrong now?

Thanks

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)
    This seems to be complaining about your gene list. Make sure the values are separated by tabs, not spaces, and use the .refseq extension in the filename.
  • Vergilius (Italy; Member)

    I am sure that the extension is .refseq and that the values are tab-separated, but I still cannot understand this error.

  • Sheila (Broad Institute; Member, Broadie admin)

    @Vergilius
    Hi,

    I'm a little confused about why you used AWK. I ran this tool with no issue simply by following the instructions in the article you linked to above, and I did not need to modify the output file. Also, what do you mean by "this same command was working when I did to get gene coverage"?

    -Sheila

  • Vergilius (Italy; Member)

    Hi Sheila,
    Sorry if I was unclear.

    I followed the instructions in the article and used AWK to satisfy this requirement: "To run with the GATK, contigs other than the standard 1-22,X,Y,MT must be removed, and the file sorted in karyotypic order." Hence, I used the following commands to remove non-standard chromosomes and sort:

    awk '$3 ~ /^chr[12]?[0-9]$/' geneTrack.refSeq > geneTrack1
    awk '$3 ~ /^chr[XY]$/' geneTrack.refSeq > geneTrackxy
    awk '$3 ~ /^chrMT$/' geneTrack.refSeq > geneTrackM
    cat geneTrack1 geneTrackxy geneTrackM > geneTrackFinal
    sort -k3,3V -k5,5n -k6,6n geneTrackFinal > geneTrack1xy_sort.txt

    When I said that "this same command was working ....", I meant that I wanted both exome and gene coverage. I first downloaded geneTrack and obtained the gene coverage with the procedure above. Then I downloaded exonTrack and ran the same procedure, and that is when I got the error.
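
For anyone who wants to check the two points raised in this thread mechanically (tab separation, and contig filtering plus ordering), here is a minimal Python sketch. It is not part of GATK. The function names `describe_line` and `filter_and_sort` are made up for illustration, the column numbers 3, 5 and 6 are taken from the awk and sort commands above, and the contig order in `STANDARD_CONTIGS` is only an assumption that should be changed to match the order in your reference. Note that the forum may have collapsed whitespace in the pasted error message, so the check needs to be run on the actual file.

```python
import re

# Assumed contig order (hypothetical default); edit it to match the order
# of the contigs in your reference's .dict / .fai file.
STANDARD_CONTIGS = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY", "chrM"]

# Field values that look like a standard contig name.
CONTIG_RE = re.compile(r"chr(\d+|[XYM]|MT)")


def describe_line(line):
    """Report how many tab-separated fields a line has and which
    1-based columns hold something that looks like a contig name."""
    fields = line.rstrip("\n").split("\t")
    contig_columns = [
        i + 1 for i, value in enumerate(fields) if CONTIG_RE.fullmatch(value)
    ]
    return {"tab_fields": len(fields), "contig_columns": contig_columns}


def filter_and_sort(lines, contig_col=3, start_col=5, end_col=6,
                    order=STANDARD_CONTIGS):
    """Keep rows whose contig is in `order`, then sort by that contig
    order, then by start and end position. Columns are 1-based."""
    rank = {name: i for i, name in enumerate(order)}
    needed = max(contig_col, start_col, end_col)
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip short lines, headers, and rows on contigs not listed in `order`.
        if len(fields) >= needed and fields[contig_col - 1] in rank:
            rows.append(fields)
    rows.sort(key=lambda f: (rank[f[contig_col - 1]],
                             int(f[start_col - 1]),
                             int(f[end_col - 1])))
    return ["\t".join(f) for f in rows]
```

Running `describe_line` on the first data line of geneTrack and of exonTrack shows whether both files are tab-separated and whether the contig name sits in the same column in both. If the contig is in column 1 of one file and column 3 of the other, the two files are not in the same format, and commands that filter on `$3` will not behave the same way on both.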
