Potential bug(?) using Picard's SortVcf

mattqdean CA Member Posts: 10
edited February 2016 in Ask the GATK team

I tried sorting my COSMIC VCF with Picard's SortVcf so that its contig order matches my reference dictionary. I did this after MuTect2 raised an error and I visited the link that was provided.

java -jar -Xmx32g /cm/shared/apps/picard/1.127/picard.jar SortVcf I=/work/gencode/CosmicCodingMuts.vcf O=/work/gencode/CosmicCodingMuts%.vcf SEQUENCE_DICTIONARY=/work/gencode/GRCh38.p5.genome.dict

Reference dictionary:
@HD VN:1.4 SO:unsorted
@SQ SN:chr1 LN:248956422 M5:2648ae1bacce4ec4b6cf337dcae37816 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr2 LN:242193529 M5:4bb4f82880a14111eb7327169ffb729b UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr3 LN:198295559 M5:a48af509898d3736ba95dc0912c0b461 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr4 LN:190214555 M5:3210fecf1eb92d5489da4346b3fddc6e UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr5 LN:181538259 M5:f7f05fb7ceea78cbc32ce652c540ff2d UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr6 LN:170805979 M5:6a48dfa97e854e3c6f186c8ff973f7dd UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr7 LN:159345973 M5:94eef2b96fd5a7c8db162c8c74378039 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr8 LN:145138636 M5:c67955b5f7815a9a1edfaa15893d3616 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr9 LN:138394717 M5:addd2795560986b7491c40b1faa3978a UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr10 LN:133797422 M5:907112d17fcb73bcab1ed1c72b97ce68 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr11 LN:135086622 M5:1511375dc2dd1b633af8cf439ae90cec UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr12 LN:133275309 M5:e81e16d3f44337034695a29b97708fce UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr13 LN:114364328 M5:17dab79b963ccd8e7377cef59a54fe1c UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr14 LN:107043718 M5:acbd9552c059d9b403e75ed26c1ce5bc UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr15 LN:101991189 M5:f036bd11158407596ca6bf3581454706 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr16 LN:90338345 M5:24e7cabfba3548a2bb4dff582b9ee870 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr17 LN:83257441 M5:a8499ca51d6fb77332c2d242923994eb UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr18 LN:80373285 M5:11eeaa801f6b0e2e36a1138616b8ee9a UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr19 LN:58617616 M5:b0eba2c7bb5c953d1e06a508b5e487de UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr20 LN:64444167 M5:b18e6c531b0bd70e949a7fc20859cb01 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr21 LN:46709983 M5:2f45a3455007b7e271509161e52954a9 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chr22 LN:50818468 M5:221733a2a15e2de66d33e73d126c5109 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chrX LN:156040895 M5:49527016a48497d9d1cbd8e4a9049bd3 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chrY LN:57227415 M5:b2b7e6369564d89059e763cd6e736837 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:chrM LN:16569 M5:c68f52674c9fb33aef52dcf399755519 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:GL000008.2 LN:209709 M5:a999388c587908f80406444cebe80ba3 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:GL000009.2 LN:201709 M5:862f555045546733591ff7ab15bcecbe UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:GL000194.1 LN:191469 M5:6ac8f815bf8e845bb3031b73f812c012 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:GL000195.1 LN:182896 M5:5d9ec007868d517e73543b005ba48535 UR:file:/work/gencode/GRCh38.p5.genome.fa
@SQ SN:GL000205.2 LN:185591 M5:458e71cd53dd1df4083dc7983a6c82c4 UR:file:/work/gencode/GRCh38.p5.genome.fa
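
For comparing against the VCF, the contig order can be read straight from the `@SQ` lines of the dictionary. A minimal Python sketch, assuming the `SN:` field carries the contig name as in the listing above; the function name `contig_order` and the shortened sample text are made up for illustration, and the code splits on any whitespace because tabs may not survive pasting:

```python
def contig_order(dict_text):
    """Return the SN values of the @SQ lines, in the order they appear."""
    names = []
    for line in dict_text.splitlines():
        fields = line.split()
        # Skip blank lines and non-@SQ header lines such as @HD.
        if not fields or fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


# Shortened sample (hypothetical subset of the dictionary shown above).
sample = """@HD VN:1.4 SO:unsorted
@SQ SN:chr1 LN:248956422
@SQ SN:chr2 LN:242193529
@SQ SN:chr10 LN:133797422
@SQ SN:GL000008.2 LN:209709
"""

print(contig_order(sample))
```
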

Interestingly, after sorting my COSMIC VCF, all of my known contigs moved BEHIND all the unplaced contigs:
##### ERROR cosmic contigs = [GL000008.2, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000208.1, GL000209.2, GL000213.1, GL000214.1, GL000216.2, GL000218.1, GL000219.1, GL000220.1, GL000221.1, GL000224.1, GL000225.1, GL000226.1, GL000250.2, GL000251.2, GL000252.2, ... ... ... ... chr1, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr2, chr20, chr21, chr22, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chrM, chrX, chrY]

ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, GL000008.2, GL000009.2, GL000194.1, GL000195.1, GL000205.2,... ... ...
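
The COSMIC order above looks like a plain lexicographic (ASCII) sort of the contig names: uppercase `GL...` sorts before lowercase `chr...`, and `chr10` sorts before `chr2`. A small Python sketch that checks this on a handful of names taken from the two lists; it only illustrates the pattern and says nothing about why SortVcf produced it:

```python
# A few contig names in reference dictionary order (subset of the list above).
reference_order = [
    "chr1", "chr2", "chr3", "chr10", "chr11", "chr20",
    "chrX", "chrY", "chrM", "GL000008.2", "GL000009.2",
]

# Python compares strings by code point, which is ASCII order for these names.
lexicographic = sorted(reference_order)

print(lexicographic)
print(lexicographic == reference_order)
```

On this subset the sorted list puts the `GL` contigs first and then `chr1, chr10, chr11, chr2, ...`, which is the same shape as the COSMIC contig list in the error message.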

Is this possibly because my dictionary is marked as unsorted (SO:unsorted in the @HD line)?

