Can ValidateSamFile check CRAM files?

Javi (Cambridge - UK) Member
edited July 2016 in Ask the GATK team

I'm trying to validate a set of CRAM files, but I'm getting the error shown below. I'm not sure whether it is because I'm using a CRAM file, because I have problems with the htsjdk library, or because the CRAM files are corrupted.

[Tue Jul 05 16:49:12 BST 2016] picard.sam.ValidateSamFile INPUT=19743.cramFiles/19743_1.1.cram MODE=SUMMARY MAX_OUTPUT=100 IGNORE_WARNINGS=false VALIDATE_INDEX=true INDEX_VALIDATION_STRINGENCY=EXHAUSTIVE IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Jul 05 16:49:12 BST 2016] Executing as jga on Linux 3.2.0-75-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_74-b02; Picard version: 2.4.1(7c4d36e011df1aec4689b51efcada44e92d1817f_1464389670) JdkDeflater
[Tue Jul 05 16:49:13 BST 2016] picard.sam.ValidateSamFile done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=759693312
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.cram.CRAMException: Contig chr1 not found in the reference file.
at htsjdk.samtools.CRAMIterator.nextContainer(CRAMIterator.java:171)
at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:257)
at htsjdk.samtools.SamReader$AssertingIterator.hasNext(SamReader.java:568)
at htsjdk.samtools.SamFileValidator.validateSamRecordsAndQualityFormat(SamFileValidator.java:268)
at htsjdk.samtools.SamFileValidator.validateSamFile(SamFileValidator.java:200)
at htsjdk.samtools.SamFileValidator.validateSamFileSummary(SamFileValidator.java:128)
at picard.sam.ValidateSamFile.doWork(ValidateSamFile.java:187)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

Any help will be much appreciated.
Thanks!

Answers

  • Sheila (Broad Institute) Member, Broadie ✭✭✭✭✭

    @Javi
    Hi,

    Can you try providing a reference in your command?

    -Sheila
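
A minimal sketch of what providing a reference could look like with the legacy `KEY=value` Picard syntax used in this thread. `R=` is the short form of Picard's standard `REFERENCE_SEQUENCE` argument; `validate_cram`, `PICARD_JAR`, and `DRY_RUN` are hypothetical helper names for illustration, and the file paths are placeholders. Whether this resolves the exception above is not confirmed in the thread.

```shell
# Sketch: run Picard ValidateSamFile on a CRAM with an explicit reference.
# PICARD_JAR and DRY_RUN are hypothetical helper variables, not Picard options.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

validate_cram() {
    cram=$1
    ref=$2
    # Fail early if the reference FASTA is not readable.
    if [ ! -r "$ref" ]; then
        echo "reference not found: $ref" >&2
        return 1
    fi
    # Keep the command in the positional parameters so paths with spaces survive.
    set -- java -jar "$PICARD_JAR" ValidateSamFile "I=$cram" "R=$ref" MODE=SUMMARY
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"
    fi
}

# Usage (placeholder paths):
#   validate_cram 19743.cramFiles/19743_1.1.cram ref.fasta
```

Setting `DRY_RUN=1` prints the command so it can be inspected before anything is run.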

  • zek12 (London) Member
    edited January 2017

    Hi there,

    Did anybody manage to solve this error? I'm getting the same one. In my case I have BAM files that I want to convert to CRAM and then clean using Picard tools CleanSam. I couldn't manage to sort them, though, so I converted them directly. These are the commands:

    samtools view -T ref.fasta -C -o sample.cram sample.bam
    java -jar picard.jar CleanSam \
    I=sample.cram \
    O=sample.clean.cram

    I am using Samtools 1.3 and Picardtools 2.0.1

    The error I am getting is:

    Exception in thread "main" htsjdk.samtools.cram.CRAMException: Contig 1 not found in the reference file.
    at htsjdk.samtools.CRAMIterator.nextContainer(CRAMIterator.java:176)
    at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:263)
    at htsjdk.samtools.SamReader$AssertingIterator.hasNext(SamReader.java:568)
    at picard.sam.CleanSam.doWork(CleanSam.java:87)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

    Many thanks.

    Issue · Github by Sheila: Issue Number 1664, State: closed, Closed By: vdauwera

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie admin

    Hi @zek12,

    CRAM support has been evolving from rocky beginnings to... slightly less rocky but still pretty bumpy middlings. I would recommend upgrading to the latest version of Picard, for starters, to make sure you take advantage of the work that has been happening there over the past year or so. Also, try providing a reference file. If that still doesn't work, let us know.
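
Because the exception names a specific contig (`chr1` in the first post, `1` in the second), one check that can be done independently of Picard is to compare the contig names in the CRAM header with those in the reference index. This is only a diagnostic sketch; the thread does not establish that a naming mismatch is the cause. It assumes the header was saved with `samtools view -H` and that the reference has a `.fai` index created by `samtools faidx`; `missing_contigs` is a hypothetical helper name.

```shell
# List contigs named in a SAM/CRAM header that are absent from a .fai index.
# $1: header text (e.g. saved from `samtools view -H sample.cram`)
# $2: FASTA index (e.g. ref.fasta.fai from `samtools faidx ref.fasta`)
# Assumes the .fai file is not empty.
missing_contigs() {
    awk -F '\t' '
        # First file (.fai): column 1 is the contig name.
        NR == FNR { known[$1] = 1; next }
        # Second file (header): look for the SN: tag on @SQ lines.
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if (substr($i, 1, 3) == "SN:") {
                    name = substr($i, 4)
                    if (!(name in known)) print name
                }
        }
    ' "$2" "$1"
}
```

For example, after `samtools view -H sample.cram > header.sam`, calling `missing_contigs header.sam ref.fasta.fai` prints one name per line for each header contig missing from the index; empty output means every header contig is present.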

  • zek12 (London) Member

    Ok @Geraldine_VdAuwera , thank you very much!

  • elcinchu27 (Broad) Member, Broadie
    edited March 2017

    I have the same problem when converting CRAM to FASTQ files. I used the latest Picard version, 2.9.0, and I provided the reference sequence too.

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie admin

    @elcinchu27 At this time our best recommendation is, if you encounter a problem running on a CRAM, to convert it to BAM before you do anything else.

  • elcinchu27 (Broad) Member, Broadie

    Ok, I could do that with Samtools, I think. But my files are in the cloud and a whole-genome BAM is really large, so it would be great if this problem could be resolved in the future.

    Thank you for the help,

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭

    Hi @elcinchu27,

    I recently used WDL and Pipelines API to convert CRAMs to BAMs in the cloud. I can share my WDL script with you if you want. The commands I use are

    samtools view -h -T ${ref_fasta} -b ${cram} -o ${basename}.bam
    samtools index ${basename}.bam ${basename}.bai
    

    Be sure you are using Samtools v1.3.1+.
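
The version requirement above can be checked before converting. The following is a sketch under the assumption that the first line of `samtools --version` has the form `samtools <version>`; `parse_samtools_version` and `version_ge` are hypothetical helper names, and only the numeric dotted fields are compared.

```shell
# Extract the version from the first line of `samtools --version` output (stdin).
parse_samtools_version() {
    awk 'NR == 1 { print $2 }'
}

# Succeed when dotted version $1 is greater than or equal to $2.
# Missing fields count as 0, so 1.3 is treated as 1.3.0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Usage:
#   v=$(samtools --version | parse_samtools_version)
#   version_ge "$v" 1.3.1 || echo "samtools $v is older than 1.3.1" >&2
```

Comparing field by field rather than as strings means 1.10 is correctly treated as newer than 1.3.1.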

  • elcinchu27 (Broad) Member, Broadie

    @shlee great! Thanks a lot.
