#CHROM characters allowed

Amanda, North Carolina, Member

Hello,

I am trying to use GENOTYPE_GIVEN_ALLELES through HaplotypeCaller and have never run into this problem before. Everything appears to run as expected, but I am not getting any results, even after it has parsed through the alleles file past a variant position, when it would typically be writing to the output file as it goes. I suspect the problem may lie in the chromosome names: a collaborator originally included both ";" and "=" in the contig names. I replaced the "=" with an underscore because of issues it caused previously, but now I suspect that the ";" (semicolon) is causing problems in GATK, even though in the VCF 4.2 format it appears that any character aside from a ":" (colon) and whitespace should be okay. Can someone comment on this?

I am out of other ideas, short of going back through the reference, VCF files, and BAM files/indexes to fix this character problem.

Thank you!
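
A quick way to test the contig-name theory is to scan the CHROM column for the two things the VCF 4.2 reading above rules out: whitespace and a colon. This is only a minimal Python sketch. The function name `suspicious_contigs` and the sample record using `Scaffold:2` are hypothetical, and the check says nothing about what GATK itself accepts.

```python
import re

# Assumption: per the VCF 4.2 reading quoted in the question, a CHROM value
# must not contain whitespace or a colon. Nothing else is checked here.
FORBIDDEN = re.compile(r"[\s:]")


def suspicious_contigs(vcf_lines):
    """Return the sorted CHROM values that contain whitespace or ':'."""
    bad = set()
    for line in vcf_lines:
        # Skip blank lines and all header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        # CHROM is everything before the first tab.
        chrom = line.rstrip("\r\n").split("\t", 1)[0]
        if FORBIDDEN.search(chrom):
            bad.add(chrom)
    return sorted(bad)


if __name__ == "__main__":
    # Hypothetical sample: the first record uses a semicolon (not flagged),
    # the second uses a colon (flagged).
    sample = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "Scaffold_1;HRSCAF_1\t281\tUCD_000001\tA\tG\t.\t.\t.\n",
        "Scaffold:2\t9254\tUCD_000011\tA\tG\t.\t.\t.\n",
    ]
    print(suspicious_contigs(sample))
```

A record that is separated by spaces rather than tabs is also flagged, because the whole line is then read as the CHROM value.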

Answers

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭

    @Amanda
    Hi,

    If the files conform to the VCF spec, there should be no issue. Can you post the exact command you are running, and validate your input VCFs with ValidateVariants? Are you using version 3.7?

    Thanks,
    Sheila

  • Amanda, North Carolina, Member
    edited July 2017

    Hi Sheila,

    It appears that ValidateVariants is not recognizing any variants in my file, as it reports:

    Successfully validated the input file.  Checked 0 records with no failures.
    INFO  14:06:44,904 ProgressMeter -            done     32393.0     3.0 s     112.0 s       23.0%    13.0 s      10.0 s
    INFO  14:06:44,906 ProgressMeter - Total runtime 3.65 secs, 0.06 min, 0.00 hours
    ------------------------------------------------------------------------------------------
    Done. There were no warn messages.
    ------------------------------------------------------------------------------------------
    

    However, it appears to me that there is no issue with my VCF file, as it works without a problem in vcftools. Here is the head of the file.

    ##fileformat=VCFv4.2
    #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  S1   S2
    Scaffold_1;HRSCAF_1     281     UCD_000001      A       G       .       .      .GT      1/1     0/0
    Scaffold_1;HRSCAF_1     4037    UCD_000004      C       A       .       .      .GT      1/1     0/0
    Scaffold_2;HRSCAF_2     9254    UCD_000011      A       G       .       .      .GT      1/1     0/0
    Scaffold_5;HRSCAF_5     3368    UCD_000015      C       T       .       .      .GT      1/1     0/0
    

    (Yes, there is a tab between the INFO and FORMAT columns.)

    Yes, I am using version 3.7.

    Best,
    Amanda
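
Because ValidateVariants reported 0 records, a plain-text sanity check of the file can help rule out formatting problems such as a missing tab. The sketch below is an illustration in Python, not a reimplementation of ValidateVariants. `summarize_vcf` is a hypothetical helper that counts data records and flags lines whose tab-separated field count differs from the #CHROM header. The second sample record deliberately fuses INFO and FORMAT to show what a flagged line looks like.

```python
def summarize_vcf(lines):
    """Count data records and flag lines whose tab-separated field count
    differs from the #CHROM header line.

    Returns (record_count, expected_fields, mismatched), where mismatched is
    a list of (1-based line number, fields found) tuples.
    """
    expected = None
    records = 0
    mismatched = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#CHROM"):
            # The column header defines how many fields each record needs.
            expected = len(line.split("\t"))
            continue
        if line.startswith("#"):
            # ## meta-information lines carry no records.
            continue
        records += 1
        fields = len(line.split("\t"))
        if expected is not None and fields != expected:
            mismatched.append((number, fields))
    return records, expected, mismatched


if __name__ == "__main__":
    header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
              "INFO", "FORMAT", "S1", "S2"]
    good = ["Scaffold_1;HRSCAF_1", "281", "UCD_000001", "A", "G",
            ".", ".", ".", "GT", "1/1", "0/0"]
    # Hypothetical broken record: INFO and FORMAT fused into one field.
    fused = ["Scaffold_1;HRSCAF_1", "4037", "UCD_000004", "C", "A",
             ".", ".", ".GT", "1/1", "0/0"]
    sample = ["##fileformat=VCFv4.2"] + ["\t".join(row) for row in (header, good, fused)]
    print(summarize_vcf(sample))
```

If this reports a non-zero record count with no mismatches, the file is at least structurally readable as tab-separated text, which points away from the VCF layout as the cause.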

  • Amanda, North Carolina, Member
    edited July 2017

    You can see that it starts running without issue, but nothing is being written to the output file, even though it has already passed the markers shown in the head of the file.

    INFO  15:19:48,217 HelpFormatter - Date/Time: 2017/07/05 15:19:48
    INFO  15:19:48,218 HelpFormatter - ---------------------------------------------------------------------------------
    INFO  15:19:48,220 HelpFormatter - ---------------------------------------------------------------------------------
    INFO  15:19:48,872 GenomeAnalysisEngine - Strictness is SILENT
    INFO  15:19:49,277 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
    INFO  15:19:49,307 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO  15:19:53,574 SAMDataSource$SAMReaders - Init 50 BAMs in last 4.23 s, 50 of 102 in 4.26 s / 0.07 m (11.73 tasks/s).  52 remaining with est. completion in 4.43 s / 0.07 m
    INFO  15:19:58,776 SAMDataSource$SAMReaders - Init 50 BAMs in last 5.18 s, 100 of 102 in 9.47 s / 0.16 m (10.56 tasks/s).  2 remaining with est. completion in 0.19 s / 0.00 m
    INFO  15:19:59,130 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 9.82
    INFO  15:19:59,427 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
    WARN  15:19:59,474 RMDTrackBuilder - Index file finalfixed_xaa.vcf.idx is out of date (index older than input file), deleting and updating the index file
    INFO  15:20:00,287 RMDTrackBuilder - Writing Tribble index to disk for file finalfixed_xaa.vcf.idx
    INFO  15:37:54,866 MicroScheduler - Running the GATK in parallel mode with 12 total threads, 12 CPU thread(s) for each of 1 data thread(s), of 48 processors available on this machine
    INFO  15:37:54,980 GenomeAnalysisEngine - Preparing for traversal over 102 BAM files
    INFO  15:37:55,605 GenomeAnalysisEngine - Done preparing for traversal
    INFO  15:37:55,944 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  15:37:55,947 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining
    INFO  15:37:55,950 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime
    INFO  15:37:55,953 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output
    INFO  15:37:56,157 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
    INFO  15:37:56,158 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
    INFO  15:37:56,224 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
    INFO  15:37:56,230 PairHMM - Performance profiling for PairHMM is disabled because HaplotypeCaller is being run with multiple threads (-nct>1) option
    Profiling is enabled only when running in single thread mode
    
    Using AVX accelerated implementation of PairHMM
    INFO  15:37:58,730 VectorLoglessPairHMM - libVectorLoglessPairHMM unpacked successfully from GATK jar file
    INFO  15:37:58,733 VectorLoglessPairHMM - Using vectorized implementation of PairHMM
    WARN  15:38:05,313 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper
    WARN  15:38:05,316 InbreedingCoeff - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
    WARN  15:38:05,324 AnnotationUtils - Annotation will not be calculated, genotype is not called
    INFO  15:38:25,959 ProgressMeter - Scaffold_5;HRSCAF_5:15359         102611.0     30.0 s            4.9 m        0.0%    84.3 h      84.3 h
    INFO  15:39:25,967 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0     90.0 s            8.6 m        0.0%     6.7 d       6.7 d
    INFO  15:40:25,976 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0      2.5 m           14.4 m        0.0%    11.2 d      11.2 d
    INFO  15:41:25,990 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0      3.5 m           20.1 m        0.0%    15.7 d      15.7 d
    INFO  15:42:25,998 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0      4.5 m           25.9 m        0.0%     2.9 w       2.9 w
    INFO  15:43:26,002 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0      5.5 m           31.6 m        0.0%     3.5 w       3.5 w
    INFO  15:44:26,012 ProgressMeter - Scaffold_8;HRSCAF_8:10619         173932.0      6.5 m           37.4 m        0.0%     4.2 w       4.2 w
    
  • Amanda, North Carolina, Member
    Accepted Answer

    Well Sheila, I have no idea what has changed, but it appears that it is working now. Perhaps the admins have changed something on our server that was causing the issue. Thank you!!

  • Sheila, Broad Institute, Member, Broadie ✭✭✭✭✭
    edited July 2017

    @Amanda
    Hi Amanda,

    Glad to hear it is working now!

    -Sheila
